Related papers: Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity

Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity

URL: http://arxiv.org/abs/2603.03190v1
Date: Tue, 03 Mar 2026 17:47:09 GMT
Title: Expectation and Acoustic Neural Network Representations Enhance Music Identification from Brain Activity
Authors: Shogo Noguchi, Taketo Akama, Tai Nakamura, Shun Minamikawa, Natalia Polouliakh,
Abstract summary: We show that distinguishing acoustic and expectation-related ANN representations as teacher targets improves EEG-based music identification.<n>This work points toward advances in predictive music cognition and neural decoding.
Score: 2.9095985849532884
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: During music listening, cortical activity encodes both acoustic and expectation-related information. Prior work has shown that ANN representations resemble cortical representations and can serve as supervisory signals for EEG recognition. Here we show that distinguishing acoustic and expectation-related ANN representations as teacher targets improves EEG-based music identification. Models pretrained to predict either representation outperform non-pretrained baselines, and combining them yields complementary gains that exceed strong seed ensembles formed by varying random initializations. These findings show that teacher representation type shapes downstream performance and that representation learning can be guided by neural encoding. This work points toward advances in predictive music cognition and neural decoding. Our expectation representation, computed directly from raw signals without manual labels, reflects predictive structure beyond onset or pitch, enabling investigation of multilayer predictive encoding across diverse stimuli. Its scalability to large, diverse datasets further suggests potential for developing general-purpose EEG models grounded in cortical encoding principles.

Related papers

Learning Robust Spatial Representations from Binaural Audio through Feature Distillation [64.36563387033921]
We investigate the use of a pretraining stage based on feature distillation to learn a robust spatial representation of speech without the need for data labels.<n>Our experiments demonstrate that the pretrained models show improved performance in noisy and reverberant environments.
arXiv Detail & Related papers (2025-08-28T15:43:15Z)
Predicting Artificial Neural Network Representations to Learn Recognition Model for Music Identification from Brain Recordings [1.7021860383953338]
Recent studies have demonstrated that the representations of artificial neural networks (ANNs) can exhibit notable similarities to cortical representations.<n>This study introduces a novel approach to developing recognition models for brain recordings in response to external auditory stimuli.<n>It holds promise for advancing brain-computer interfaces (BCI), neural decoding techniques, and our understanding of music cognition.
arXiv Detail & Related papers (2024-12-20T04:37:26Z)
CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals.<n>Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality.<n>The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z)
TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation [65.65530016765615]
We propose a hierarchical predictive coding framework that captures multi-scale dependencies through three complementary learning objectives.<n> TokenUnify integrates random token prediction, next-token prediction, and next-all token prediction to create a comprehensive representational space.<n>We also introduce a large-scale EM dataset with 1.2 billion annotated voxels, offering ideal long-sequence visual data with spatial continuity.
arXiv Detail & Related papers (2024-05-27T05:45:51Z)
Music Emotion Prediction Using Recurrent Neural Networks [8.867897390286815]
This study aims to enhance music recommendation systems and support therapeutic interventions by tailoring music to fit listeners' emotional states. We utilize Russell's Emotion Quadrant to categorize music into four distinct emotional regions and develop models capable of accurately predicting these categories. Our approach involves extracting a comprehensive set of audio features using Librosa and applying various recurrent neural network architectures, including standard RNNs, Bidirectional RNNs, and Long Short-Term Memory (LSTM) networks.
arXiv Detail & Related papers (2024-05-10T18:03:20Z)
Relating Human Perception of Musicality to Prediction in a Predictive Coding Model [0.8062120534124607]
We explore the use of a neural network inspired by predictive coding for modeling human music perception. This network was developed based on the computational neuroscience theory of recurrent interactions in the hierarchical visual cortex. We adapt this network to model the hierarchical auditory system and investigate whether it will make similar choices to humans regarding the musicality of a set of random pitch sequences.
arXiv Detail & Related papers (2022-10-29T12:20:01Z)
Enhancing Affective Representations of Music-Induced EEG through Multimodal Supervision and latent Domain Adaptation [34.726185927120355]
We employ music signals as a supervisory modality to EEG, aiming to project their semantic correspondence onto a common representation space. We utilize a bi-modal framework by combining an LSTM-based attention model to process EEG and a pre-trained model for music tagging, along with a reverse domain discriminator to align the distributions of the two modalities. The resulting framework can be utilized for emotion recognition both directly, by performing supervised predictions from either modality, and indirectly, by providing relevant music samples to EEG input queries.
arXiv Detail & Related papers (2022-02-20T07:32:12Z)
Learning Personal Representations from fMRIby Predicting Neurofeedback Performance [52.77024349608834]
We present a deep neural network method for learning a personal representation for individuals performing a self neuromodulation task, guided by functional MRI (fMRI) The representation is learned by a self-supervised recurrent neural network, that predicts the Amygdala activity in the next fMRI frame given recent fMRI frames and is conditioned on the learned individual representation.
arXiv Detail & Related papers (2021-12-06T10:16:54Z)
EEGminer: Discovering Interpretable Features of Brain Activity with Learnable Filters [72.19032452642728]
We propose a novel differentiable EEG decoding pipeline consisting of learnable filters and a pre-determined feature extraction module. We demonstrate the utility of our model towards emotion recognition from EEG signals on the SEED dataset and on a new EEG dataset of unprecedented size. The discovered features align with previous neuroscience studies and offer new insights, such as marked differences in the functional connectivity profile between left and right temporal areas during music listening.
arXiv Detail & Related papers (2021-10-19T14:22:04Z)
Noisy Agents: Self-supervised Exploration by Predicting Auditory Events [127.82594819117753]
We propose a novel type of intrinsic motivation for Reinforcement Learning (RL) that encourages the agent to understand the causal effect of its actions. We train a neural network to predict the auditory events and use the prediction errors as intrinsic rewards to guide RL exploration. Experimental results on Atari games show that our new intrinsic motivation significantly outperforms several state-of-the-art baselines.
arXiv Detail & Related papers (2020-07-27T17:59:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.