Deep Music Information Dynamics
- URL: http://arxiv.org/abs/2102.01133v1
- Date: Mon, 1 Feb 2021 19:59:59 GMT
- Title: Deep Music Information Dynamics
- Authors: Shlomo Dubnov
- Abstract summary: We introduce a novel framework that combines two parallel streams - a low rate latent representation stream and a higher rate information dynamics derived from the musical data itself.
Motivated by rate-distortion theories of human cognition we propose a framework for exploring possible relations between imaginary anticipations existing in the listener's mind and information dynamics of the musical surface itself.
- Score: 1.6143012623830792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Music comprises a set of complex simultaneous events organized in time. In
this paper we introduce a novel framework that we call Deep Musical Information
Dynamics, which combines two parallel streams - a low rate latent
representation stream that is assumed to capture the dynamics of a thought
process contrasted with a higher rate information dynamics derived from the
musical data itself. Motivated by rate-distortion theories of human cognition
we propose a framework for exploring possible relations between imaginary
anticipations existing in the listener's mind and information dynamics of the
musical surface itself. This model is demonstrated for the case of symbolic
(MIDI) data, as accounting for acoustic surface would require many more layers
to capture instrument properties and performance expressive inflections. The
mathematical framework is based on variational encoding that first establishes
a high rate representation of the musical observations, which is then reduced
using a bit-allocation method into a parallel low rate data stream. The
combined loss considered here includes both the information rate in terms of
time evolution for each stream, and the fidelity of encoding measured in terms
of mutual information between the high and low rate representations. In the
simulations presented in the paper we are able to juxtapose aspects of
latent/imaginary surprisal versus surprisal of the music surface in a manner
that is quantifiable and computationally tractable. The set of computational
tools is discussed in the paper, suggesting that a trade-off between
compression and prediction is an important factor in the analysis and design
of time-based music generative models.
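The abstract does not say which bit-allocation method reduces the high rate representation to the low rate stream. One standard choice from rate-distortion theory, for independent Gaussian latent dimensions under squared-error distortion, is reverse water-filling. The block below is a minimal sketch under that assumption, not the paper's implementation. The function name `reverse_water_filling`, its parameters, and the example variances are all hypothetical.

```python
import numpy as np


def reverse_water_filling(variances, total_rate_bits, iters=100):
    """Allocate a bit budget across independent Gaussian latent dimensions.

    Assumption (not taken from the paper): each latent dimension is an
    independent Gaussian with the given positive variance, and distortion
    is mean squared error. Returns per-dimension rates in bits,
    per-dimension distortions, and the water level theta.
    """
    variances = np.asarray(variances, dtype=float)
    lo, hi = 0.0, variances.max()
    # Bisection on the water level: a higher level means more distortion
    # and therefore fewer bits spent in total.
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        distortions = np.minimum(theta, variances)
        rates = 0.5 * np.log2(variances / distortions)
        if rates.sum() > total_rate_bits:
            lo = theta  # over budget: raise the water level
        else:
            hi = theta  # within budget: try a lower water level
    theta = 0.5 * (lo + hi)
    distortions = np.minimum(theta, variances)
    rates = 0.5 * np.log2(variances / distortions)
    return rates, distortions, theta


if __name__ == "__main__":
    # Hypothetical latent variances and a 2-bit budget per time step.
    rates, distortions, theta = reverse_water_filling([4.0, 1.0, 0.25], 2.0)
    print("rates (bits):", rates)
    print("distortions:", distortions)
    print("water level:", theta)
```

Dimensions whose variance falls below the water level receive zero bits, so the low rate stream keeps only the most variable latent dimensions. That is one way to make the compression side of the trade-off concrete.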
Related papers
- TimeGraphs: Graph-based Temporal Reasoning [64.18083371645956]
TimeGraphs is a novel approach that characterizes dynamic interactions as a hierarchical temporal graph.
Our approach models the interactions using a compact graph-based representation, enabling adaptive reasoning across diverse time scales.
We evaluate TimeGraphs on multiple datasets with complex, dynamic agent interactions, including a football simulator, the Resistance game, and the MOMA human activity dataset.
arXiv Detail & Related papers (2024-01-06T06:26:49Z) - Perceptual Musical Features for Interpretable Audio Tagging [2.1730712607705485]
This study explores the relevance of interpretability in the context of automatic music tagging.
We constructed a workflow that incorporates three different information extraction techniques.
We conducted experiments on two datasets, namely the MTG-Jamendo dataset and the GTZAN dataset.
arXiv Detail & Related papers (2023-12-18T14:31:58Z) - Graph-based Polyphonic Multitrack Music Generation [9.701208207491879]
This paper introduces a novel graph representation for music and a deep Variational Autoencoder that generates the structure and the content of musical graphs separately.
By separating the structure and content of musical graphs, it is possible to condition generation by specifying which instruments are played at certain times.
arXiv Detail & Related papers (2023-07-27T15:18:50Z) - DynamicStereo: Consistent Dynamic Depth from Stereo Videos [91.1804971397608]
We propose DynamicStereo to estimate disparity for stereo videos.
The network learns to pool information from neighboring frames to improve the temporal consistency of its predictions.
We also introduce Dynamic Replica, a new benchmark dataset containing synthetic videos of people and animals in scanned environments.
arXiv Detail & Related papers (2023-05-03T17:40:49Z) - Relating Human Perception of Musicality to Prediction in a Predictive Coding Model [0.8062120534124607]
We explore the use of a neural network inspired by predictive coding for modeling human music perception.
This network was developed based on the computational neuroscience theory of recurrent interactions in the hierarchical visual cortex.
We adapt this network to model the hierarchical auditory system and investigate whether it will make similar choices to humans regarding the musicality of a set of random pitch sequences.
arXiv Detail & Related papers (2022-10-29T12:20:01Z) - Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner.
We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances the two to obtain accurate beliefs using minimum computational expense.
arXiv Detail & Related papers (2022-04-05T12:52:45Z) - CCVS: Context-aware Controllable Video Synthesis [95.22008742695772]
This presentation introduces a self-supervised learning approach to the synthesis of new video clips from old ones.
It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control.
arXiv Detail & Related papers (2021-07-16T17:57:44Z) - Sequence Generation using Deep Recurrent Networks and Embeddings: A study case in music [69.2737664640826]
This paper evaluates different types of memory mechanisms (memory cells) and analyses their performance in the field of music composition.
A set of quantitative metrics is presented to evaluate the performance of the proposed architecture automatically.
arXiv Detail & Related papers (2020-12-02T14:19:19Z) - Score-informed Networks for Music Performance Assessment [64.12728872707446]
Deep neural network-based methods incorporating score information into MPA models have not yet been investigated.
We introduce three different models capable of score-informed performance assessment.
arXiv Detail & Related papers (2020-08-01T07:46:24Z) - Learning Style-Aware Symbolic Music Representations by Adversarial Autoencoders [9.923470453197657]
We focus on leveraging adversarial regularization as a flexible and natural means to imbue variational autoencoders with context information.
We introduce the first Music Adversarial Autoencoder (MusAE).
Our model has a higher reconstruction accuracy than state-of-the-art models based on standard variational autoencoders.
arXiv Detail & Related papers (2020-01-15T18:07:20Z) - Modeling Musical Structure with Artificial Neural Networks [0.0]
I explore the application of artificial neural networks to different aspects of musical structure modeling.
I show how a connectionist model, the Gated Autoencoder (GAE), can be employed to learn transformations between musical fragments.
I propose a special predictive training of the GAE, which yields a representation of polyphonic music as a sequence of intervals.
arXiv Detail & Related papers (2020-01-06T18:35:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.