Knowledge-based Multimodal Music Similarity
- URL: http://arxiv.org/abs/2306.12249v1
- Date: Wed, 21 Jun 2023 13:12:12 GMT
- Title: Knowledge-based Multimodal Music Similarity
- Authors: Andrea Poltronieri
- Abstract summary: This research focuses on the study of musical similarity using both symbolic and audio content.
The aim of this research is to develop a fully explainable and interpretable system that can provide end-users with more control and understanding of music similarity and classification systems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Music similarity is an essential aspect of music retrieval, recommendation
systems, and music analysis. Moreover, similarity is of vital interest for
music experts, as it allows studying analogies and influences among composers
and historical periods. Current approaches to musical similarity rely mainly on
symbolic content, which can be expensive to produce and is not always readily
available. Conversely, approaches using audio signals typically fail to provide
any insight about the reasons behind the observed similarity. This research
addresses the limitations of current approaches by focusing on the study of
musical similarity using both symbolic and audio content. The aim of this
research is to develop a fully explainable and interpretable system that can
provide end-users with more control and understanding of music similarity and
classification systems.
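As a toy illustration of the kind of interpretable, content-based similarity the abstract argues for (this is not the author's proposed system; the feature choice and the data are invented for the example), the sketch below compares two symbolic excerpts by the cosine similarity of their pitch-class histograms, a representation whose dimensions are directly readable as musical pitch classes:

```python
import numpy as np

def pitch_class_histogram(midi_pitches):
    """Normalized 12-bin pitch-class histogram from MIDI note numbers."""
    hist = np.bincount(np.asarray(midi_pitches) % 12, minlength=12).astype(float)
    return hist / hist.sum()

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two short excerpts as MIDI pitch sequences (C major vs. C minor arpeggios).
major = pitch_class_histogram([60, 64, 67, 72, 64, 67])
minor = pitch_class_histogram([60, 63, 67, 72, 63, 67])
print(round(cosine_similarity(major, minor), 3))
```

Because each histogram bin corresponds to a named pitch class, a high or low score can be traced back to specific shared or differing pitch content, which is the kind of explanation opaque audio embeddings typically cannot offer.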
Related papers
- A Survey of Foundation Models for Music Understanding [60.83532699497597]
This work is one of the early reviews of the intersection of AI techniques and music understanding.
We investigated, analyzed, and tested recent large-scale music foundation models with respect to their music comprehension abilities.
arXiv Detail & Related papers (2024-09-15T03:34:14Z)
- A Dataset and Baselines for Measuring and Predicting the Music Piece Memorability [16.18336216092687]
We focus on measuring and predicting music memorability.
We train baselines to predict and analyze music memorability.
We demonstrate that while there is room for improvement, predicting music memorability with limited data is possible.
arXiv Detail & Related papers (2024-05-21T14:57:04Z)
- Quantifying the evolution of harmony and novelty in western classical music [1.0152838128195467]
We present a study of musical features related to harmony, and we document how they evolved over 400 years in western classical music.
We develop measures to quantify key uncertainty, as well as diversity and novelty in key transitions.
We report a decline in innovation in harmonic transitions in the early classical period followed by a steep increase in the late classical.
arXiv Detail & Related papers (2023-08-06T23:00:34Z)
- Multimodal Lyrics-Rhythm Matching [0.0]
We propose a novel multimodal lyrics-rhythm matching approach that specifically matches key components of lyrics and music with each other.
We use audio with readily available metadata instead of sheet music, which poses more challenges yet increases the application flexibility of our method.
Our experimental results reveal a 0.81 average probability of matching, and around 30% of the songs have a probability of 0.9 or higher of keywords landing on strong beats.
arXiv Detail & Related papers (2023-01-06T22:24:53Z)
- ALCAP: Alignment-Augmented Music Captioner [34.85003676798762]
We introduce a method to learn multimodal alignment between audio and lyrics through contrastive learning.
This not only recognizes and emphasizes the synergy between audio and lyrics but also paves the way for models to achieve deeper cross-modal coherence.
arXiv Detail & Related papers (2022-12-21T10:20:54Z)
- A Dataset for Greek Traditional and Folk Music: Lyra [69.07390994897443]
This paper presents a dataset for Greek Traditional and Folk music that includes 1570 pieces, summing to around 80 hours of data.
The dataset incorporates YouTube timestamped links for retrieving audio and video, along with rich metadata regarding instrumentation, geography, and genre.
arXiv Detail & Related papers (2022-11-21T14:15:43Z)
- Concept-Based Techniques for "Musicologist-friendly" Explanations in a Deep Music Classifier [5.442298461804281]
We focus on more human-friendly explanations based on high-level musical concepts.
Our research targets trained systems (post-hoc explanations) and explores two approaches.
We demonstrate both techniques on an existing symbolic composer classification system, showcase their potential, and highlight their intrinsic limitations.
arXiv Detail & Related papers (2022-08-26T07:45:29Z)
- SeCo: Separating Unknown Musical Visual Sounds with Consistency Guidance [88.0355290619761]
This work focuses on the separation of unknown musical instruments.
We propose the Separation-with-Consistency (SeCo) framework, which can accomplish the separation on unknown categories.
Our framework exhibits strong adaptation to novel musical categories and outperforms baseline methods by a significant margin.
arXiv Detail & Related papers (2022-03-25T09:42:11Z)
- Contrastive Learning with Positive-Negative Frame Mask for Music Representation [91.44187939465948]
This paper proposes a novel Positive-nEgative frame mask for Music Representation based on the contrastive learning framework, abbreviated as PEMR.
We devise a novel contrastive learning objective to accommodate both self-augmented positives and negatives sampled from the same music.
arXiv Detail & Related papers (2022-03-17T07:11:42Z)
- Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
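The Multimodal Lyrics-Rhythm Matching entry above reports the share of keywords landing on strong beats. A minimal sketch of how such a statistic could be computed is shown below; the function name, the onset tolerance, and the assumption that beats 1 and 3 of a 4/4 bar count as "strong" are all illustrative choices, not details from that paper:

```python
# Hypothetical illustration: estimate how often keyword onsets coincide with
# strong beats, given keyword onset times (s), beat times (s), and a 4/4 meter
# where beat positions 0 and 2 of each bar are treated as "strong".

def strong_beat_rate(keyword_onsets, beat_times, strong_positions=(0, 2), tol=0.07):
    hits = 0
    for onset in keyword_onsets:
        # Nearest beat index to this keyword onset.
        idx = min(range(len(beat_times)), key=lambda i: abs(beat_times[i] - onset))
        on_beat = abs(beat_times[idx] - onset) <= tol
        if on_beat and idx % 4 in strong_positions:
            hits += 1
    return hits / len(keyword_onsets)

beats = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]   # 120 BPM, 4/4 bar grid
keywords = [0.02, 1.01, 1.52, 2.48]                 # four keyword onsets
print(strong_beat_rate(keywords, beats))
```

Here two of the four keywords fall within tolerance of a strong beat, so the sketch reports a rate of 0.5; real systems would obtain beat times from a beat tracker and keyword onsets from lyrics alignment.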
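Two of the entries above (ALCAP and PEMR) rely on contrastive learning between paired inputs. As a sketch of the core idea, independent of either paper's architecture, the snippet below implements a symmetric InfoNCE objective over paired audio/lyric embeddings; the embedding dimensionality, batch size, and temperature are arbitrary values chosen for illustration:

```python
import numpy as np

def info_nce(audio_emb, lyric_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired audio/lyric embeddings.

    Rows are L2-normalized; matching rows (i, i) are positives, and all
    other pairs in the batch serve as negatives.
    """
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    l = lyric_emb / np.linalg.norm(lyric_emb, axis=1, keepdims=True)
    logits = a @ l.T / temperature                 # (batch, batch) similarities
    log_sm = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    loss_a2l = -np.mean(np.diag(log_sm))           # audio -> lyrics direction
    log_sm_t = logits.T - np.log(np.exp(logits.T).sum(axis=1, keepdims=True))
    loss_l2a = -np.mean(np.diag(log_sm_t))         # lyrics -> audio direction
    return (loss_a2l + loss_l2a) / 2

rng = np.random.default_rng(0)
paired = rng.normal(size=(8, 16))
# Perfectly aligned embeddings give a much lower loss than random pairings.
aligned_loss = info_nce(paired, paired)
random_loss = info_nce(paired, rng.normal(size=(8, 16)))
print(aligned_loss < random_loss)
```

Minimizing this loss pulls matching audio and lyric embeddings together while pushing non-matching batch items apart, which is the shared mechanism behind the cross-modal alignment those papers describe.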
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.