An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music
Recommendation
- URL: http://arxiv.org/abs/2402.08300v1
- Date: Tue, 13 Feb 2024 09:03:03 GMT
- Title: An Order-Complexity Aesthetic Assessment Model for Aesthetic-aware Music
Recommendation
- Authors: Xin Jin, Wu Zhou, Jingyu Wang, Duo Xu, Yongsen Zheng
- Abstract summary: Subjective evaluation is still the most effective form of evaluating artistic works.
Compared to music produced by humans, AI-generated music still sounds mechanical and monotonous and lacks aesthetic appeal.
We use Birkhoff's aesthetic measure to design an aesthetic model that objectively measures the aesthetic beauty of music, and we form a recommendation list according to the aesthetic quality of each piece.
- Score: 20.164044758068634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computational aesthetic evaluation has made remarkable contributions to
visual art works, but its application to music is still rare. Currently,
subjective evaluation is still the most effective form of evaluating artistic
works, but it consumes substantial human and material resources. AI-generated
content (AIGC) tasks have by now spread across virtually every industry, and
music is no exception. Yet compared to music produced by humans, AI-generated
music still sounds mechanical and monotonous and lacks aesthetic appeal. Due to
the lack of music datasets with rating annotations, we turn to traditional
aesthetic equations to objectively measure the beauty of music. In order to
improve the quality of AI music generation and further guide computer music
production, synthesis, recommendation and other tasks, we use Birkhoff's
aesthetic measure, which scores an object as the ratio of order to complexity
(M = O/C), to design an aesthetic model that objectively measures the aesthetic
beauty of music, and we form a recommendation list according to the aesthetic
quality of each piece. Experiments show that our objective aesthetic model and
recommendation method are effective.
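Birkhoff's measure scores an object as order divided by complexity. A minimal sketch of how such a measure can drive a recommendation list, assuming toy order and complexity features (repetition rate and pitch entropy) that are illustrative placeholders, not the paper's actual feature set:

```python
# Birkhoff-style aesthetic ranking: M = O / C, where O aggregates
# "order" cues and C aggregates "complexity" cues. The concrete
# features below are toy stand-ins for the paper's formulation.
from collections import Counter
from math import log2

def pitch_entropy(pitches):
    """Shannon entropy of the pitch distribution (a complexity cue)."""
    counts = Counter(pitches)
    n = len(pitches)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def repetition_rate(pitches):
    """Fraction of repeated adjacent pitches (a crude order cue)."""
    if len(pitches) < 2:
        return 0.0
    repeats = sum(a == b for a, b in zip(pitches, pitches[1:]))
    return repeats / (len(pitches) - 1)

def birkhoff_measure(pitches, eps=1e-6):
    return repetition_rate(pitches) / (pitch_entropy(pitches) + eps)  # M = O / C

def recommend(tracks, top_k=5):
    """Rank tracks (name -> MIDI pitch sequence) by aesthetic measure."""
    scored = [(name, birkhoff_measure(p)) for name, p in tracks.items()]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:top_k]

tracks = {"a": [60, 60, 62, 60], "b": [60, 65, 59, 71]}
print(recommend(tracks, top_k=2))
```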
Related papers
- Rethinking Emotion Bias in Music via Frechet Audio Distance [11.89773040110695]
We conduct a study on Music Emotion Recognition (MER) and Emotional Music Generation (EMG).
We employ diverse audio encoders alongside the Frechet Audio Distance (FAD), a reference-free evaluation metric.
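FAD fits a Gaussian to the embeddings of each audio set and measures the Fréchet distance between the two Gaussians. A standard sketch of that computation, assuming embeddings from a pretrained audio encoder such as VGGish:

```python
# Frechet Audio Distance between two sets of audio embeddings:
# FAD = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2)).
import numpy as np
from scipy.linalg import sqrtm

def frechet_audio_distance(emb_a, emb_b):
    """emb_a, emb_b: (n_samples, dim) arrays of audio embeddings."""
    mu_a, mu_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    cov_a = np.cov(emb_a, rowvar=False)
    cov_b = np.cov(emb_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):   # numerical noise can introduce tiny
        covmean = covmean.real     # imaginary parts; discard them
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean)
```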
arXiv Detail & Related papers (2024-09-23T20:59:15Z)
- Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings
This research develops a model capable of generating music that resonates with the emotions depicted in visual arts.
Addressing the scarcity of aligned art and music data, we curated the Emotion Painting Music dataset.
Our dual-stage framework converts images to text descriptions of emotional content and then transforms these descriptions into music, facilitating efficient learning with minimal data.
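A minimal sketch of the dual-stage idea, using off-the-shelf Hugging Face pipelines (BLIP captioning and MusicGen) as stand-ins; the paper's own emotion-focused models and prompt design are not specified here, so these choices are assumptions:

```python
# Stage 1: image -> text description; Stage 2: text -> music.
# Both models below are illustrative substitutes, not the paper's.
from transformers import pipeline

captioner = pipeline("image-to-text",
                     model="Salesforce/blip-image-captioning-base")
music_gen = pipeline("text-to-audio", model="facebook/musicgen-small")

def painting_to_music(image_path):
    # Stage 1: describe the painting (ideally its emotional content).
    caption = captioner(image_path)[0]["generated_text"]
    prompt = f"music that evokes the mood of: {caption}"
    # Stage 2: generate audio conditioned on the description.
    return music_gen(prompt)  # dict with "audio" and "sampling_rate"
```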
arXiv Detail & Related papers (2024-09-12T08:19:25Z)
- MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations.
We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music.
Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
- MusicRL: Aligning Music Generation to Human Preferences [62.44903326718772]
MusicRL is the first music generation system finetuned from human feedback.
We deploy MusicLM to users and collect a substantial dataset comprising 300,000 pairwise preferences.
We train MusicRL-U, the first text-to-music model that incorporates human feedback at scale.
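Finetuning from pairwise preferences typically starts with a reward model trained under the Bradley-Terry objective. A sketch under that assumption; the architecture and embedding dimension are illustrative, not MusicRL's actual design:

```python
# Reward modeling from pairwise preferences: given embeddings of a
# preferred and a rejected clip, minimize -log sigmoid(r_pref - r_rej).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardHead(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def preference_loss(reward_head, pref_emb, rej_emb):
    """Bradley-Terry: maximize P(preferred clip beats rejected clip)."""
    return -F.logsigmoid(reward_head(pref_emb) - reward_head(rej_emb)).mean()

# Usage with dummy audio-clip embeddings:
head = RewardHead()
loss = preference_loss(head, torch.randn(8, 512), torch.randn(8, 512))
loss.backward()
```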
arXiv Detail & Related papers (2024-02-06T18:36:52Z)
- MARBLE: Music Audio Representation Benchmark for Universal Evaluation [79.25065218663458]
We introduce the Music Audio Representation Benchmark for universaL Evaluation, termed MARBLE.
It aims to provide a benchmark for various Music Information Retrieval (MIR) tasks by defining a comprehensive taxonomy with four hierarchy levels, including acoustic, performance, score, and high-level description.
We then establish a unified protocol based on 14 tasks across 8 publicly available datasets, providing a fair and standard assessment of the representations of all open-sourced pre-trained models developed on music recordings as baselines.
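A sketch of the frozen-representation probing protocol such a benchmark standardizes: embeddings from a frozen pre-trained model are scored with a lightweight probe per downstream task. The probe choice here (logistic regression) is an assumption, not MARBLE's exact setup:

```python
# Evaluate a frozen music representation with a linear probe.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def evaluate_representation(embeddings, labels):
    """Train/test a linear probe on frozen embeddings for one task."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        embeddings, labels, test_size=0.2, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)  # task accuracy for this model

# Dummy example: 200 clips, 768-dim embeddings, 10 genre labels.
rng = np.random.default_rng(0)
acc = evaluate_representation(rng.normal(size=(200, 768)),
                              rng.integers(0, 10, size=200))
print(acc)
```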
arXiv Detail & Related papers (2023-06-18T12:56:46Z)
- An Order-Complexity Model for Aesthetic Quality Assessment of Homophony Music Performance [8.751312368054016]
Subjective evaluation is still the ultimate method of music aesthetics research.
The music performance generated by AI is still mechanical, monotonous and lacking in beauty.
This paper uses Birkhoff's aesthetic measure to propose an objective method of measuring beauty.
arXiv Detail & Related papers (2023-04-23T03:02:24Z)
- VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments and explore vision-language pretraining methods to learn multimodal aesthetic representations.
Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels.
Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
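A sketch of the contrastive half of such image-comment pretraining, using a CLIP-style symmetric InfoNCE loss; the encoders are placeholders, and VILA's generative objective is not shown:

```python
# CLIP-style contrastive loss: pull matched image/comment embeddings
# together, push mismatched pairs apart, symmetrically in both directions.
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """img_emb, txt_emb: (batch, dim), matched row-by-row."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature   # pairwise similarities
    targets = torch.arange(len(logits))    # i-th image <-> i-th comment
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

loss = contrastive_loss(torch.randn(16, 512), torch.randn(16, 512))
```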
arXiv Detail & Related papers (2023-03-24T23:57:28Z)
- An Order-Complexity Model for Aesthetic Quality Assessment of Symbolic Homophony Music Scores [8.751312368054016]
The quality of music scores generated by AI is relatively poor compared with those created by human composers.
This paper proposes an objective quantitative evaluation method for homophony music score aesthetic quality assessment.
arXiv Detail & Related papers (2023-01-14T12:30:16Z)
- MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training [97.91071692716406]
Symbolic music understanding refers to understanding music from symbolic data (e.g., MIDI).
MusicBERT is a large-scale pre-trained model for music understanding.
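A toy sketch of masked-token pre-training on symbolic music: mask a fraction of event tokens and train a small Transformer encoder to recover them. The flat token vocabulary here is a stand-in for MusicBERT's actual OctupleMIDI encoding and bar-level masking:

```python
# Masked language modeling over a symbolic-music token sequence.
import torch
import torch.nn as nn

VOCAB, MASK_ID = 512, 0

class TinyMusicEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, VOCAB)

    def forward(self, tokens):
        return self.head(self.enc(self.emb(tokens)))

def masked_lm_step(model, tokens, mask_prob=0.15):
    tokens = tokens.clone()
    mask = torch.rand(tokens.shape) < mask_prob
    labels = torch.where(mask, tokens, torch.full_like(tokens, -100))
    tokens[mask] = MASK_ID                     # hide the masked events
    logits = model(tokens)
    return nn.functional.cross_entropy(        # predict only masked slots
        logits.view(-1, VOCAB), labels.view(-1), ignore_index=-100)

model = TinyMusicEncoder()
loss = masked_lm_step(model, torch.randint(1, VOCAB, (4, 64)))
```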
arXiv Detail & Related papers (2021-06-10T10:13:05Z)
- Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis [91.3755431537592]
This thesis combines audio-analysis with computer vision to approach Music Information Retrieval (MIR) tasks from a multi-modal perspective.
The main hypothesis of this work is based on the observation that certain expressive categories such as genre or theme can be recognized on the basis of the visual content alone.
The experiments are conducted for three MIR tasks: Artist Identification, Music Genre Classification, and Cross-Genre Classification.
arXiv Detail & Related papers (2020-02-01T17:57:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.