Multimodal Lyrics-Rhythm Matching
- URL: http://arxiv.org/abs/2301.02732v1
- Date: Fri, 6 Jan 2023 22:24:53 GMT
- Title: Multimodal Lyrics-Rhythm Matching
- Authors: Callie C. Liao, Duoduo Liao, Jesse Guessford
- Abstract summary: We propose a novel multimodal lyrics-rhythm matching approach that specifically matches key components of lyrics and music with each other.
We use audio instead of sheet music with readily available metadata, which creates more challenges yet increases the application flexibility of our method.
Our experimental results reveal an 0.81 probability of matching on average, and around 30% of the songs have a probability of 0.9 or higher of keywords landing on strong beats.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent increase in research on artificial intelligence for music,
prominent correlations between key components of lyrics and rhythm such as
keywords, stressed syllables, and strong beats are not frequently studied. This
is likely due to challenges such as audio misalignment, inaccuracies in
syllabic identification, and most importantly, the need for cross-disciplinary
knowledge. To address this lack of research, we propose a novel multimodal
lyrics-rhythm matching approach in this paper that specifically matches key
components of lyrics and music with each other without any language
limitations. We use audio instead of sheet music with readily available
metadata, which creates more challenges yet increases the application
flexibility of our method. Furthermore, our approach creatively generates
several patterns involving various multimodalities, including music strong
beats, lyrical syllables, auditory changes in a singer's pronunciation, and
especially lyrical keywords, which are utilized for matching key lyrical
elements with key rhythmic elements. This advantageous approach not only
provides a unique way to study auditory lyrics-rhythm correlations including
efficient rhythm-based audio alignment algorithms, but also bridges
computational linguistics with music as well as music cognition. Our
experimental results reveal a 0.81 probability of matching on average, and
around 30% of the songs have a probability of 0.9 or higher of keywords landing
on strong beats, including 12% of the songs with a perfect landing. Similarity
metrics are also used to evaluate the correlation between lyrics and rhythm,
showing that nearly 50% of the songs have a similarity of 0.70 or higher.
In conclusion, our approach contributes significantly to the lyrics-rhythm
relationship by computationally unveiling insightful correlations.
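To make the reported statistics concrete, here is a minimal sketch of the two kinds of numbers the abstract cites: the probability of keywords landing on strong beats, and a lyrics-rhythm similarity score. Everything specific below is an illustrative assumption layered on the abstract: the 70 ms tolerance, the strong-beats-on-1-and-3 convention, the cosine similarity choice, and the toy onsets are not the paper's published parameters, and beat times and keyword onsets are assumed to be extracted upstream (e.g., by a beat tracker and lyrics alignment).

```python
# Minimal sketch of the two reported statistics. The tolerance, the strong-beat
# convention, and the similarity choice are assumptions, not the authors'
# actual pipeline.

def strong_beat_match_probability(keyword_onsets, strong_beats, tol=0.07):
    """Fraction of keyword onsets (seconds) landing within `tol` of a strong beat."""
    if not keyword_onsets:
        return 0.0
    hits = sum(any(abs(k - b) <= tol for b in strong_beats) for k in keyword_onsets)
    return hits / len(keyword_onsets)

def rhythm_similarity(stress_grid, beat_grid):
    """Cosine similarity between binary per-slot stressed-syllable and
    strong-beat grids (one illustrative choice of similarity metric)."""
    dot = sum(a * b for a, b in zip(stress_grid, beat_grid))
    norm = (sum(a * a for a in stress_grid) * sum(b * b for b in beat_grid)) ** 0.5
    return dot / norm if norm else 0.0

# Hypothetical 4/4 song at 120 BPM: beats every 0.5 s, strong beats 1 and 3.
beats = [i * 0.5 for i in range(32)]
strong = [b for i, b in enumerate(beats) if i % 4 in (0, 2)]
keywords = [0.02, 1.98, 3.05, 4.01, 5.51]  # hypothetical keyword onsets

print(strong_beat_match_probability(keywords, strong))            # 0.8
print(rhythm_similarity([1, 0, 1, 0, 1, 0], [1, 0, 1, 0, 0, 1]))  # ~0.67
```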
Related papers
- Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach [49.2787113554916]
Estimating music piece difficulty is important for organizing educational music collections.
Our work employs explainable descriptors for difficulty estimation in symbolic music representations.
Our approach, evaluated on piano repertoire categorized into 9 classes, achieved 41.4% accuracy independently, with a mean squared error (MSE) of 1.7.
arXiv Detail & Related papers (2024-08-01T11:23:42Z)
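As a rough illustration of what "explainable descriptors" over symbolic music could look like for the difficulty-estimation paper above, the sketch below computes three hypothetical, human-readable descriptors from a toy note list; these specific features are guesses for illustration, not the paper's actual descriptor set.

```python
# Hypothetical explainable descriptors for symbolic-music difficulty estimation.
from statistics import mean

def descriptors(notes):
    """notes: list of (onset_sec, midi_pitch, duration_sec) from a symbolic score."""
    onsets = sorted(n[0] for n in notes)
    pitches = [n[1] for n in notes]
    span = (onsets[-1] - onsets[0]) or 1.0
    return {
        "note_density": len(notes) / span,           # notes per second
        "pitch_range": max(pitches) - min(pitches),  # in semitones
        "mean_ioi": mean(b - a for a, b in zip(onsets, onsets[1:])),
    }

toy_score = [(0.0, 60, 0.5), (0.5, 64, 0.5), (1.0, 67, 0.5), (1.5, 72, 0.5)]
print(descriptors(toy_score))  # note_density ~2.67, pitch_range 12, mean_ioi 0.5
```

Because each descriptor is directly interpretable, a simple model (say, ordinal regression over the 9 difficulty classes) keeps its predictions attributable to concrete musical properties.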
- Joint sentiment analysis of lyrics and audio in music [1.2349562761400057]
In automatic analysis, the actual audio data is usually analyzed, but the lyrics can also play a crucial role in the perception of moods.
We first evaluate various models for sentiment analysis based on lyrics and audio separately. The corresponding approaches already show satisfactory results, but they also exhibit weaknesses.
arXiv Detail & Related papers (2024-05-03T10:42:17Z)
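The summary above notes that the separate lyrics and audio sentiment models each exhibit weaknesses; a common way to combine them is late fusion of their class probabilities. The sketch below is a generic version of that idea, not necessarily the paper's scheme: the scores and the 0.6/0.4 weighting are hypothetical.

```python
# Late fusion of two unimodal sentiment classifiers (weights are hypothetical).

def fuse_sentiment(lyrics_probs, audio_probs, w_lyrics=0.6):
    """Weighted average of per-class probabilities from two unimodal models."""
    w_audio = 1.0 - w_lyrics
    return {
        label: w_lyrics * lyrics_probs[label] + w_audio * audio_probs[label]
        for label in lyrics_probs
    }

# Hypothetical outputs of a lyrics classifier and an audio classifier.
lyrics = {"positive": 0.70, "negative": 0.10, "neutral": 0.20}
audio = {"positive": 0.30, "negative": 0.50, "neutral": 0.20}

fused = fuse_sentiment(lyrics, audio)
print(max(fused, key=fused.get), fused)  # fusion arbitrates modality disagreement
```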
- A Computational Analysis of Lyric Similarity Perception [1.1510009152620668]
We conduct a comparative analysis of computational methods for modeling lyric similarity with human perception.
Results indicated that computational models based on similarities between embeddings from pre-trained BERT-based models, the audio from which the lyrics are derived, and phonetic components are indicative of perceptual lyric similarity.
arXiv Detail & Related papers (2024-04-02T22:31:38Z)
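A brief sketch of the embedding-based similarity signal this paper finds indicative: cosine similarity between pre-trained sentence embeddings of two lyric lines. The sentence-transformers model named below is a stand-in assumption; the summary only says the embeddings come from pre-trained BERT-based models.

```python
# Cosine similarity between sentence embeddings of two lyric lines.
# The model choice is a stand-in for the paper's BERT-based models.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

line_a = "I walk alone beneath the fading light"
line_b = "Alone I wander as the daylight dies"

emb = model.encode([line_a, line_b], convert_to_tensor=True)
print(float(util.cos_sim(emb[0], emb[1])))  # closer to 1.0 = more similar
```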
- Automatic Time Signature Determination for New Scores Using Lyrics for Latent Rhythmic Structure [0.0]
We propose a novel approach that only uses lyrics as input to automatically generate a fitting time signature for lyrical songs.
In this paper, the best of our experimental results reveal a 97.6% F1 score and a 0.996 Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) score.
arXiv Detail & Related papers (2023-11-27T01:44:02Z)
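One toy way to picture inferring a time signature from lyrics alone, as the paper above does, is to score candidate meters by how well stressed syllables align with a repeating beat grid. This heuristic and the stress sequence are illustrative only; the paper's actual model is not described in the summary.

```python
# Toy heuristic: pick the meter whose period best explains stress placement.

def meter_score(stresses, period):
    """stresses: 1 = stressed syllable, 0 = unstressed; grid of `period` beats."""
    hits = sum(s for i, s in enumerate(stresses) if i % period == 0)
    return hits / (len(stresses) / period)  # hits per expected downbeat slot

# Hypothetical stress sequence for one lyric line ("TUM-ta-ta TUM-ta-ta ...").
stresses = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0]

best = max((3, 4), key=lambda p: meter_score(stresses, p))
print(best)  # 3 -> suggests a triple meter (e.g., 3/4) for this line
```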
- Knowledge-based Multimodal Music Similarity [0.0]
This research focuses on the study of musical similarity using both symbolic and audio content.
The aim of this research is to develop a fully explainable and interpretable system that can provide end-users with more control and understanding of music similarity and classification systems.
arXiv Detail & Related papers (2023-06-21T13:12:12Z)
- Unsupervised Melody-to-Lyric Generation [91.29447272400826]
We propose a method for generating high-quality lyrics without training on any aligned melody-lyric data.
We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints.
Our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines.
arXiv Detail & Related papers (2023-05-30T17:20:25Z)
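A concrete way to picture "compiling the melody into decoding constraints" is to let each melody phrase fix the syllable count its lyric line must satisfy, then filter or constrain generated candidates. The vowel-group syllable counter and the candidates below are crude stand-ins for the paper's actual segmentation and rhythm-alignment constraints.

```python
# Syllable-count constraints derived from a melody phrase (toy stand-in).
import re

def count_syllables(word):
    """Crude vowel-group heuristic; a real system would use a phonetic lexicon."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def satisfies(line, n_syllables):
    return sum(count_syllables(w) for w in line.split()) == n_syllables

# Hypothetical melody phrase with 6 notes -> constraint: 6 syllables.
melody_phrase_len = 6
candidates = [
    "the morning sun is rising now",  # 8 syllables -> rejected
    "the morning sun appears",        # 6 syllables -> kept
]
print([c for c in candidates if satisfies(c, melody_phrase_len)])
```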
- Unsupervised Melody-Guided Lyrics Generation [84.22469652275714]
We propose to generate pleasantly listenable lyrics without training on melody-lyric aligned data.
We leverage the crucial alignments between melody and lyrics and compile the given melody into constraints to guide the generation process.
arXiv Detail & Related papers (2023-05-12T20:57:20Z)
- Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation [138.74751744348274]
We propose Museformer, a Transformer with a novel fine- and coarse-grained attention for music generation.
Specifically, with the fine-grained attention, a token of a specific bar directly attends to all the tokens of the bars that are most relevant to music structures.
With the coarse-grained attention, a token only attends to the summarization of the other bars rather than each token of them so as to reduce the computational cost.
arXiv Detail & Related papers (2022-10-19T07:31:56Z)
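The fine- and coarse-grained attention described above can be pictured as a boolean attention mask. The sketch below is a simplified reconstruction from the summary: the bar layout, the single summary token per bar, and the related-bar choice are assumptions, and the summary tokens' own attention rows are omitted.

```python
# Simplified fine-/coarse-grained attention mask in the spirit of Museformer.
import numpy as np

def museformer_style_mask(n_bars, tokens_per_bar, related):
    """Bar b's tokens see all tokens of bars in related[b] and of bar b itself
    (fine-grained), but only the single summary token of every other bar
    (coarse-grained)."""
    stride = tokens_per_bar + 1  # each bar: its tokens followed by 1 summary token
    n = n_bars * stride
    mask = np.zeros((n, n), dtype=bool)
    for b in range(n_bars):
        rows = slice(b * stride, b * stride + tokens_per_bar)
        for other in range(n_bars):
            if other == b or other in related[b]:
                mask[rows, other * stride : (other + 1) * stride] = True
            else:
                mask[rows, other * stride + tokens_per_bar] = True
    # (Attention rows for the summary tokens themselves are omitted for brevity.)
    return mask

# 4 bars, 2 tokens each; each bar fully attends to the bar two positions back,
# a stand-in for "bars most relevant to music structure" (e.g., repetitions).
related = {b: ({b - 2} if b >= 2 else set()) for b in range(4)}
print(museformer_style_mask(4, 2, related).sum(axis=1))  # allowed keys per query
```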
- Re-creation of Creations: A New Paradigm for Lyric-to-Melody Generation [158.54649047794794]
Re-creation of Creations (ROC) is a new paradigm for lyric-to-melody generation.
ROC achieves good lyric-melody feature alignment in lyric-to-melody generation.
arXiv Detail & Related papers (2022-08-11T08:44:47Z)
- Syllabic Quantity Patterns as Rhythmic Features for Latin Authorship Attribution [74.27826764855911]
We employ syllabic quantity as a base for deriving rhythmic features for the task of computational authorship attribution of Latin prose texts.
Our experiments, carried out on three different datasets, using two different machine learning methods, show that rhythmic features based on syllabic quantity are beneficial in discriminating among Latin prose authors.
arXiv Detail & Related papers (2021-10-27T06:25:31Z)
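As a sketch of how syllabic quantity can become rhythmic features, the snippet below encodes each syllable as long (L) or short (S) and uses n-gram relative frequencies as a feature vector; the quantity strings are hypothetical, and a real pipeline would obtain them from a Latin scansion tool.

```python
# Rhythmic n-gram features derived from syllabic quantity (L = long, S = short).
from collections import Counter

def quantity_ngrams(quantities, n=3):
    """quantities: string over {'L', 'S'}, one character per syllable."""
    grams = Counter(quantities[i : i + n] for i in range(len(quantities) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}  # relative frequencies

# Hypothetical quantity sequences for passages by two different authors.
passage_a = "LSSLSSLLSSLSLL"
passage_b = "LLSLLSLLSLLSLS"
print(quantity_ngrams(passage_a))
print(quantity_ngrams(passage_b))
# These distributions can feed any standard classifier (e.g., an SVM), matching
# the two machine-learning methods the summary mentions.
```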