Video-to-Music Recommendation using Temporal Alignment of Segments
- URL: http://arxiv.org/abs/2306.07187v1
- Date: Mon, 12 Jun 2023 15:40:31 GMT
- Title: Video-to-Music Recommendation using Temporal Alignment of Segments
- Authors: Laure Prétet, Gaël Richard, Clément Souchier, Geoffroy Peeters
- Abstract summary: We study cross-modal recommendation of music tracks to be used as soundtracks for videos.
We build on a self-supervised system that learns a content association between music and video.
We propose a novel approach to significantly improve the system's performance using structure-aware recommendation.
- Score: 5.7235653928654235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study cross-modal recommendation of music tracks to be used as soundtracks
for videos. This problem is known as the music supervision task. We build on a
self-supervised system that learns a content association between music and
video. In addition to the adequacy of content, adequacy of structure is crucial
in music supervision to obtain relevant recommendations. We propose a novel
approach to significantly improve the system's performance using
structure-aware recommendation. The core idea is to consider not only the full
audio-video clips, but rather shorter segments for training and inference. We
find that using semantic segments and ranking the tracks according to sequence
alignment costs significantly improves the results. We investigate the impact
of different ranking metrics and segmentation methods.
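The abstract's core idea, ranking candidate tracks by the cost of temporally aligning their segments with the video's segments, can be sketched with a standard dynamic-time-warping alignment over segment embeddings. This is a minimal illustration, not the paper's actual system: the embeddings, the cosine distance, and the length normalization are assumptions for the sketch.

```python
import numpy as np

def dtw_cost(video_segs, music_segs):
    """Dynamic-time-warping alignment cost between two sequences of
    L2-normalized segment embeddings (one row per segment), using
    cosine distance as the local cost."""
    D = 1.0 - video_segs @ music_segs.T          # pairwise cosine distances
    n, m = D.shape
    acc = np.full((n + 1, m + 1), np.inf)        # accumulated-cost matrix
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = D[i - 1, j - 1] + min(acc[i - 1, j],      # step in video
                                              acc[i, j - 1],      # step in music
                                              acc[i - 1, j - 1])  # step in both
    return acc[n, m] / (n + m)                   # length-normalized cost

def rank_tracks(video_segs, catalog):
    """Rank candidate tracks (name -> segment embeddings) by ascending
    alignment cost: cheaper alignment means a better structural match."""
    costs = {name: dtw_cost(video_segs, segs) for name, segs in catalog.items()}
    return sorted(costs, key=costs.get)
```

A track whose segment sequence mirrors the video's structure aligns at near-zero cost and is ranked first; a track with the same segments in a different order pays for the mismatched steps.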
Related papers
- MuPT: A Generative Symbolic Music Pretrained Transformer [56.09299510129221]
We explore the application of Large Language Models (LLMs) to the pre-training of music.
To address the challenges associated with misaligned measures from different tracks during generation, we propose a Synchronized Multi-Track ABC Notation (SMT-ABC Notation).
Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set.
arXiv Detail & Related papers (2024-04-09T15:35:52Z) - MusicRL: Aligning Music Generation to Human Preferences [62.44903326718772]
MusicRL is the first music generation system finetuned from human feedback.
We deploy MusicLM to users and collect a substantial dataset comprising 300,000 pairwise preferences.
We train MusicRL-U, the first text-to-music model that incorporates human feedback at scale.
arXiv Detail & Related papers (2024-02-06T18:36:52Z) - Leveraging Negative Signals with Self-Attention for Sequential Music Recommendation [0.27195102129094995]
We propose a contrastive learning task to incorporate negative feedback to promote positive hits and penalize negative hits.
Our experiments show that this results in consistent performance gains over the baseline architectures ignoring negative user feedback.
arXiv Detail & Related papers (2023-09-20T20:21:13Z) - Fairness Through Domain Awareness: Mitigating Popularity Bias For Music Discovery [56.77435520571752]
We explore the intrinsic relationship between music discovery and popularity bias.
We propose a domain-aware, individual fairness-based approach which addresses popularity bias in graph neural network (GNN)-based recommender systems.
Our approach uses individual fairness to reflect a ground truth listening experience, i.e., if two songs sound similar, this similarity should be reflected in their representations.
arXiv Detail & Related papers (2023-08-28T14:12:25Z) - It's Time for Artistic Correspondence in Music and Video [32.31962546363909]
We present an approach for recommending a music track for a given video, and vice versa, based on both their temporal alignment and their correspondence at an artistic level.
We propose a self-supervised approach that learns this correspondence directly from data, without any need of human annotations.
Experiments show that this approach strongly outperforms alternatives that do not exploit the temporal context.
arXiv Detail & Related papers (2022-06-14T20:21:04Z) - Explainability in Music Recommender Systems [69.0506502017444]
We discuss how explainability can be addressed in the context of Music Recommender Systems (MRSs).
MRSs are often quite complex and optimized for recommendation accuracy.
We show how explainability components can be integrated within a MRS and in what form explanations can be provided.
arXiv Detail & Related papers (2022-01-25T18:32:11Z) - Unsupervised Learning of Deep Features for Music Segmentation [8.528384027684192]
Music segmentation is a problem of identifying boundaries between, and labeling, distinct music segments.
The performance of a range of music segmentation algorithms has been dependent on the audio features chosen to represent the audio.
In this work, unsupervised training of deep feature embeddings using convolutional neural networks (CNNs) is explored for music segmentation.
arXiv Detail & Related papers (2021-08-30T01:55:44Z) - Lets Play Music: Audio-driven Performance Video Generation [58.77609661515749]
We propose a new task named Audio-driven Performance Video Generation (APVG).
APVG aims to synthesize the video of a person playing a certain instrument guided by a given music audio clip.
arXiv Detail & Related papers (2020-11-05T03:13:46Z) - Learning to rank music tracks using triplet loss [6.43271391521664]
We propose a method for direct recommendation based on the audio content without explicitly tagging the music tracks.
We train a Convolutional Neural Network to learn the similarity via triplet loss.
Results highlight the efficiency of our system, especially when associated with an Auto-pooling layer.
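The triplet loss this summary refers to can be sketched in a few lines. This is a generic illustration of the objective, not the paper's implementation: the embedding vectors and the margin value are assumptions, and the real system learns the embeddings with a CNN rather than receiving them directly.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on embedding vectors: pull the positive
    (similar) track toward the anchor and push the negative away, until
    the negative is at least `margin` farther than the positive."""
    d_pos = np.sum((anchor - positive) ** 2)   # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)   # squared distance to negative
    return max(0.0, d_pos - d_neg + margin)
```

When the positive already sits well inside the margin, the loss is zero and the triplet contributes no gradient; swapping the roles of positive and negative yields a large loss that drives the embeddings apart.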
arXiv Detail & Related papers (2020-05-18T08:20:54Z) - Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music.
We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.