An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music
Recommendation
- URL: http://arxiv.org/abs/2402.08300v1
- Date: Tue, 13 Feb 2024 09:03:03 GMT
- Title: An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music
Recommendation
- Authors: Xin Jin, Wu Zhou, Jingyu Wang, Duo Xu, Yongsen Zheng
- Abstract summary: Subjective evaluation is still the most effective form of evaluating artistic works.
Compared to music produced by humans, AI-generated music still sounds mechanical and monotonous and lacks aesthetic appeal.
We use Birkhoff's aesthetic measure to design an aesthetic model that objectively measures the aesthetic beauty of music, and we form a recommendation list according to the aesthetic quality of each piece.
- Score: 20.164044758068634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational aesthetic evaluation has made remarkable contributions to
visual art works, but its application to music is still rare. Currently,
subjective evaluation is still the most effective form of evaluating artistic
works, but it consumes substantial human and material resources. AI-generated
content (AIGC) tasks have by now spread across virtually every industry, and
music is no exception. Yet compared to music produced by humans, AI-generated
music still sounds mechanical and monotonous and lacks aesthetic appeal. Due to
the lack of music datasets with rating annotations, we turn to traditional
aesthetic equations to objectively measure the beauty of music. In order to
improve the quality of AI music generation and further guide computer music
production, synthesis, recommendation and other tasks, we use Birkhoff's
aesthetic measure, which scores an object as the ratio of order to complexity
(M = O/C), to design an aesthetic model that objectively measures the aesthetic
beauty of music, and we form a recommendation list according to the aesthetic
quality of each piece. Experiments show that our objective aesthetic model and
recommendation method are effective.
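Birkhoff's measure scores an object as order divided by complexity. A minimal sketch of how such a measure can drive a recommendation list, assuming toy order and complexity features (repetition rate and pitch entropy) that are illustrative placeholders, not the paper's actual feature set:

```python
# Birkhoff-style aesthetic ranking: M = O / C, where O aggregates
# "order" cues and C aggregates "complexity" cues. The concrete
# features below are toy stand-ins for the paper's formulation.
from collections import Counter
from math import log2

def pitch_entropy(pitches):
    """Shannon entropy of the pitch distribution (a complexity cue)."""
    counts = Counter(pitches)
    n = len(pitches)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def repetition_rate(pitches):
    """Fraction of repeated adjacent pitches (a crude order cue)."""
    if len(pitches) < 2:
        return 0.0
    repeats = sum(a == b for a, b in zip(pitches, pitches[1:]))
    return repeats / (len(pitches) - 1)

def birkhoff_measure(pitches, eps=1e-6):
    return repetition_rate(pitches) / (pitch_entropy(pitches) + eps)  # M = O / C

def recommend(tracks, top_k=5):
    """Rank tracks (name -> MIDI pitch sequence) by aesthetic measure."""
    scored = [(name, birkhoff_measure(p)) for name, p in tracks.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:top_k]

tracks = {"a": [60, 60, 62, 60], "b": [60, 65, 59, 71]}
print(recommend(tracks, top_k=2))
```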
Related papers
- Rethinking Emotion Bias in Music via Frechet Audio Distance [11.89773040110695]
We conduct a study on Music Emotion Recognition (MER) and Emotional Music Generation (EMG).
We employ diverse audio encoders alongside the Frechet Audio Distance (FAD), a reference-free evaluation metric.
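FAD fits a Gaussian to the embeddings of each audio set and measures the Fréchet distance between the two Gaussians. A standard sketch of that computation, assuming embeddings from a pretrained audio encoder such as VGGish:

```python
# Frechet Audio Distance between two sets of audio embeddings:
# FAD = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)).
import numpy as np
from scipy.linalg import sqrtm

def frechet_audio_distance(emb_a, emb_b):
    """emb_a, emb_b: (n_samples, dim) arrays of audio embeddings."""
    mu_a, mu_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    cov_a = np.cov(emb_a, rowvar=False)
    cov_b = np.cov(emb_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):   # numerical noise can introduce tiny
        covmean = covmean.real     # imaginary parts; discard them
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean)
```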
arXiv Detail & Related papers (2024-09-23T20:59:15Z)
- Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings
This research develops a model capable of generating music that resonates with the emotions depicted in visual arts.
Addressing the scarcity of aligned art and music data, we curated the Emotion Painting Music dataset.
Our dual-stage framework converts images to text descriptions of emotional content and then transforms these descriptions into music, facilitating efficient learning with minimal data.
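A minimal sketch of the dual-stage idea, using off-the-shelf Hugging Face pipelines (BLIP captioning and MusicGen) as stand-ins; the paper's own emotion-focused models and prompt design are not specified here, so these choices are assumptions:

```python
# Stage 1: image -> text description; Stage 2: text -> music.
# Both models below are illustrative substitutes, not the paper's.
from transformers import pipeline

captioner = pipeline("image-to-text",
                     model="Salesforce/blip-image-captioning-base")
music_gen = pipeline("text-to-audio", model="facebook/musicgen-small")

def painting_to_music(image_path):
    # Stage 1: describe the painting (ideally its emotional content).
    caption = captioner(image_path)[0]["generated_text"]
    prompt = f"music that evokes the mood of: {caption}"
    # Stage 2: generate audio conditioned on the description.
    return music_gen(prompt)  # dict with "audio" and "sampling_rate"
```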
arXiv Detail & Related papers (2024-09-12T08:19:25Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- MusicRL: Aligning Music Generation to Human Preferences [62.44903326718772]
MusicRL is the first music generation system finetuned from human feedback.
We deploy MusicLM to users and collect a substantial dataset comprising 300,000 pairwise preferences.
We train MusicRL-U, the first text-to-music model that incorporates human feedback at scale.
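Finetuning from pairwise preferences typically starts with a reward model trained under the Bradley-Terry objective. A sketch under that assumption; the architecture and embedding dimension are illustrative, not MusicRL's actual design:

```python
# Reward modeling from pairwise preferences: given embeddings of a
# preferred and a rejected clip, minimize -log sigmoid(r_pref - r_rej).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardHead(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def preference_loss(reward_head, pref_emb, rej_emb):
    """Bradley-Terry: maximize P(preferred clip beats rejected clip)."""
    return -F.logsigmoid(reward_head(pref_emb) - reward_head(rej_emb)).mean()

# Usage with dummy audio-clip embeddings:
head = RewardHead()
loss = preference_loss(head, torch.randn(8, 512), torch.randn(8, 512))
loss.backward()
```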
arXiv Detail & Related papers (2024-02-06T18:36:52Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks across 8 publicly available datasets, providing a fair and standard assessment of the representations of all open-sourced pre-trained models developed on music recordings as baselines.
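A sketch of the frozen-representation probing protocol such a benchmark standardizes: embeddings from a frozen pre-trained model are scored with a lightweight probe per downstream task. The probe choice here (logistic regression) is an assumption, not MARBLE's exact setup:

```python
# Evaluate a frozen music representation with a linear probe.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def evaluate_representation(embeddings, labels):
    """Train/test a linear probe on frozen embeddings for one task."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        embeddings, labels, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)  # task accuracy for this model

# Dummy example: 200 clips, 768-dim embeddings, 10 genre labels.
rng = np.random.default_rng(0)
acc = evaluate_representation(rng.normal(size=(200, 768)),
                              rng.integers(0, 10, size=200))
print(acc)
```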
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- An Order-Complexity Model for Aesthetic Quality Assessment of Homophony Music Performance [8.751312368054016]
Subjective evaluation is still the ultimate method of music aesthetics research.
The music performance generated by AI is still mechanical, monotonous and lacking in beauty.
This paper uses Birkhoff's aesthetic measure to propose an objective method of measuring beauty.
arXiv Detail & Related papers (2023-04-23T03:02:24Z)
- VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments and explore vision-language pretraining methods to learn multimodal aesthetic representations.
Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels.
Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
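A sketch of the contrastive half of such image-comment pretraining, using a CLIP-style symmetric InfoNCE loss; the encoders are placeholders, and VILA's generative objective is not shown:

```python
# CLIP-style contrastive loss: pull matched image/comment embeddings
# together, push mismatched pairs apart, symmetrically in both directions.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """img_emb, txt_emb: (batch, dim), matched row-by-row."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature   # pairwise similarities
    targets = torch.arange(len(logits))    # i-th image <-> i-th comment
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = contrastive_loss(torch.randn(16, 512), torch.randn(16, 512))
```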
arXiv Detail & Related papers (2023-03-24T23:57:28Z)
- An Order-Complexity Model for Aesthetic Quality Assessment of Symbolic Homophony Music Scores [8.751312368054016]
The quality of music scores generated by AI is relatively poor compared with those created by human composers.
This paper proposes an objective quantitative evaluation method for homophony music score aesthetic quality assessment.
arXiv Detail & Related papers (2023-01-14T12:30:16Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to understanding music from symbolic data (e.g., MIDI).
MusicBERT is a large-scale pre-trained model for music understanding.
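A toy sketch of masked-token pre-training on symbolic music: mask a fraction of event tokens and train a small Transformer encoder to recover them. The flat token vocabulary here is a stand-in for MusicBERT's actual OctupleMIDI encoding and bar-level masking:

```python
# Masked language modeling over a symbolic-music token sequence.
import torch
import torch.nn as nn

VOCAB, MASK_ID = 512, 0

class TinyMusicEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, VOCAB)

    def forward(self, tokens):
        return self.head(self.enc(self.emb(tokens)))

def masked_lm_step(model, tokens, mask_prob=0.15):
    tokens = tokens.clone()
    mask = torch.rand(tokens.shape) < mask_prob
    labels = torch.where(mask, tokens, torch.full_like(tokens, -100))
    tokens[mask] = MASK_ID                     # hide the masked events
    logits = model(tokens)
    return nn.functional.cross_entropy(        # predict only masked slots
        logits.view(-1, VOCAB), labels.view(-1), ignore_index=-100)

model = TinyMusicEncoder()
loss = masked_lm_step(model, torch.randint(1, VOCAB, (4, 64)))
```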
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.