Challenges in creative generative models for music: a divergence
maximization perspective
- URL: http://arxiv.org/abs/2211.08856v1
- Date: Wed, 16 Nov 2022 12:02:43 GMT
- Title: Challenges in creative generative models for music: a divergence
maximization perspective
- Authors: Axel Chemla--Romeu-Santos, Philippe Esling
- Abstract summary: Development of generative Machine Learning models in creative practices is raising more interest among artists, practitioners and performers.
Most models are still unable to generate content that lay outside of the domain defined by the training dataset.
We propose an alternative prospective framework, starting from a new general formulation of ML objectives.
- Score: 3.655021726150369
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The development of generative Machine Learning (ML) models in creative
practices, enabled by the recent improvements in usability and availability of
pre-trained models, is raising more and more interest among artists,
practitioners and performers. Yet, the introduction of such techniques in
artistic domains also revealed multiple limitations that escape current
evaluation methods used by scientists. Notably, most models are still unable to
generate content that lay outside of the domain defined by the training
dataset. In this paper, we propose an alternative prospective framework,
starting from a new general formulation of ML objectives, that we derive to
delineate possible implications and solutions that already exist in the ML
literature (notably for the audio and musical domain). We also discuss existing
relations between generative models and computational creativity and how our
framework could help address the lack of creativity in existing models.
Related papers
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community.
There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z) - Generative AI in Vision: A Survey on Models, Metrics and Applications [0.0]
Generative AI models have revolutionized various fields by enabling the creation of realistic and diverse data samples.
Among these models, diffusion models have emerged as a powerful approach for generating high-quality images, text, and audio.
This survey paper provides a comprehensive overview of generative AI diffusion and legacy models, focusing on their underlying techniques, applications across different domains, and their challenges.
arXiv Detail & Related papers (2024-02-26T07:47:12Z) - ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior
Constraints [56.824187892204314]
We present the task of creative text-to-image generation, where we seek to generate new members of a broad category.
We show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior.
We incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly more unique creations.
arXiv Detail & Related papers (2023-08-03T17:04:41Z) - MinT: Boosting Generalization in Mathematical Reasoning via Multi-View
Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs)
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z) - The Creative Frontier of Generative AI: Managing the Novelty-Usefulness
Tradeoff [0.4873362301533825]
We explore the optimal balance between novelty and usefulness in generative Artificial Intelligence (AI) systems.
Overemphasizing either aspect can lead to limitations such as hallucinations and memorization.
arXiv Detail & Related papers (2023-06-06T11:44:57Z) - Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z) - Foundation Models for Decision Making: Problems, Methods, and
Opportunities [124.79381732197649]
Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks.
New paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning.
Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems.
arXiv Detail & Related papers (2023-03-07T18:44:07Z) - Foundation models in brief: A historical, socio-technical focus [2.5991265608180396]
Foundation models can be disruptive for future AI development by scaling up deep learning.
Models achieve state-of-the-art performance on a variety of tasks in domains such as natural language processing and computer vision.
arXiv Detail & Related papers (2022-12-17T22:11:33Z) - Creative divergent synthesis with generative models [3.655021726150369]
Machine learning approaches now achieve impressive generation capabilities in numerous domains such as image, audio or video.
We propose various perspectives on how this complicated goal could ever be achieved, and provide preliminary results on our novel training objective called textitBounded Adversarial Divergence (BAD)
arXiv Detail & Related papers (2022-11-16T12:12:31Z) - Towards Interpretable Deep Reinforcement Learning Models via Inverse
Reinforcement Learning [27.841725567976315]
We propose a novel framework utilizing Adversarial Inverse Reinforcement Learning.
This framework provides global explanations for decisions made by a Reinforcement Learning model.
We capture intuitive tendencies that the model follows by summarizing the model's decision-making process.
arXiv Detail & Related papers (2022-03-30T17:01:59Z) - Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.