Brain2Music: Reconstructing Music from Human Brain Activity
- URL: http://arxiv.org/abs/2307.11078v1
- Date: Thu, 20 Jul 2023 17:55:17 GMT
- Title: Brain2Music: Reconstructing Music from Human Brain Activity
- Authors: Timo I. Denk, Yu Takagi, Takuya Matsuyama, Andrea Agostinelli, Tomoya Nakai, Christian Frank, Shinji Nishimoto
- Abstract summary: We introduce a method for reconstructing music from brain activity, captured using functional magnetic resonance imaging (fMRI).
Our approach uses either music retrieval or the MusicLM music generation model conditioned on embeddings derived from fMRI data.
The generated music resembles the musical stimuli that human subjects experienced, with respect to semantic properties like genre, instrumentation, and mood.
- Score: 1.4777718769290527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The process of reconstructing experiences from human brain activity offers a
unique lens into how the brain interprets and represents the world. In this
paper, we introduce a method for reconstructing music from brain activity,
captured using functional magnetic resonance imaging (fMRI). Our approach uses
either music retrieval or the MusicLM music generation model conditioned on
embeddings derived from fMRI data. The generated music resembles the musical
stimuli that human subjects experienced, with respect to semantic properties
like genre, instrumentation, and mood. We investigate the relationship between
different components of MusicLM and brain activity through a voxel-wise
encoding modeling analysis. Furthermore, we discuss which brain regions
represent information derived from purely textual descriptions of music
stimuli. We provide supplementary material including examples of the
reconstructed music at https://google-research.github.io/seanet/brain2music
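The retrieval variant mentioned in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that fMRI responses are mapped to music embeddings with ridge regression and that candidate clips are ranked by cosine similarity. The function names `fit_ridge` and `retrieve`, the array shapes, and the regularization strength are hypothetical, and all data are synthetic.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)


def retrieve(pred, candidates):
    """For each predicted embedding, return the index of the candidate
    with the highest cosine similarity."""
    pred_n = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    cand_n = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    return np.argmax(pred_n @ cand_n.T, axis=1)


# Synthetic stand-ins (assumption): random "music embeddings" and
# "fMRI responses" generated as a noisy linear function of them.
rng = np.random.default_rng(0)
n_clips, n_voxels, emb_dim = 200, 50, 16
embeddings = rng.normal(size=(n_clips, emb_dim))
true_map = rng.normal(size=(emb_dim, n_voxels))
fmri = embeddings @ true_map + 0.1 * rng.normal(size=(n_clips, n_voxels))

# Fit the fMRI -> embedding map on the first 150 clips, hold out the rest.
train, test = slice(0, 150), slice(150, 200)
W = fit_ridge(fmri[train], embeddings[train], alpha=1.0)

# Predict embeddings for held-out responses and retrieve the closest clip
# from the held-out candidate pool.
predicted = fmri[test] @ W
retrieved = retrieve(predicted, embeddings[test])
accuracy = float(np.mean(retrieved == np.arange(n_clips - 150)))
print(f"top-1 retrieval accuracy on synthetic data: {accuracy:.2f}")
```

In a real setting the embeddings would come from a music embedding model and the candidate pool from a music corpus; the generation variant would instead pass the predicted embedding to a generative model as conditioning.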
Related papers
- Mode-conditioned music learning and composition: a spiking neural network inspired by neuroscience and psychology [5.2419221159594676]
We propose a spiking neural network inspired by brain mechanisms and psychological theories to represent musical modes and keys.
Our research aims to create a system that not only learns and generates music but also bridges the gap between human cognition and artificial intelligence.
(2024-11-22T07:29:26Z)
- A Survey of Foundation Models for Music Understanding [60.83532699497597]
This work is one of the early reviews of the intersection of AI techniques and music understanding.
We investigated, analyzed, and tested recent large-scale music foundation models with respect to their music comprehension abilities.
(2024-09-15T03:34:14Z)
- R&B -- Rhythm and Brain: Cross-subject Decoding of Music from Human Brain Activity [0.12289361708127873]
Music is a universal phenomenon that profoundly influences human experiences across cultures.
This study investigates whether music can be decoded from human brain activity measured with functional MRI (fMRI) during its perception.
(2024-06-21T17:11:45Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
(2024-06-07T06:38:59Z)
- Brain3D: Generating 3D Objects from fMRI [76.41771117405973]
We design a novel 3D object representation learning method, Brain3D, that takes as input the fMRI data of a subject.
We show that our model captures the distinct functionalities of each region of the human vision system.
Preliminary evaluations indicate that Brain3D can successfully identify the disordered brain regions in simulated scenarios.
(2024-05-24T06:06:11Z)
- Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity [60.983327742457995]
Reconstructing the viewed images from human brain activity bridges human and computer vision through the Brain-Computer Interface.
We devise Psychometry, an omnifit model for reconstructing images from functional Magnetic Resonance Imaging (fMRI) obtained from different subjects.
(2024-03-29T07:16:34Z)
- Brainformer: Mimic Human Visual Brain Functions to Machine Vision Models via fMRI [12.203617776046169]
We introduce a novel framework named Brainformer to analyze fMRI patterns in the human perception system.
This work introduces a prospective approach to transferring knowledge from human perception to neural networks.
(2023-11-30T22:39:23Z)
- UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity [2.666777614876322]
We propose UniBrain: Unify Image Reconstruction and Captioning All in One Diffusion Model from Human Brain Activity.
We transform fMRI voxels into text and image latent for low-level information to generate realistic captions and images.
UniBrain outperforms current methods both qualitatively and quantitatively in terms of image reconstruction and reports image captioning results for the first time on the Natural Scenes dataset.
(2023-08-14T19:49:29Z)
- Brain Captioning: Decoding human brain activity into images and text [1.5486926490986461]
We present an innovative method for decoding brain activity into meaningful images and captions.
Our approach takes advantage of cutting-edge image captioning models and incorporates a unique image reconstruction pipeline.
We evaluate our methods using quantitative metrics for both generated captions and images.
(2023-05-19T09:57:19Z)
- Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding.
Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
(2023-03-26T14:14:58Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
(2020-02-01T17:57:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.