Challenges in creative generative models for music: a divergence
maximization perspective
- URL: http://arxiv.org/abs/2211.08856v1
- Date: Wed, 16 Nov 2022 12:02:43 GMT
- Authors: Axel Chemla--Romeu-Santos, Philippe Esling
- Abstract summary: Development of generative Machine Learning models in creative practices is raising more interest among artists, practitioners and performers.
Most models are still unable to generate content that lies outside of the domain defined by the training dataset.
We propose an alternative prospective framework, starting from a new general formulation of ML objectives.
- Score: 3.655021726150369
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The development of generative Machine Learning (ML) models in creative
practices, enabled by the recent improvements in usability and availability of
pre-trained models, is raising more and more interest among artists,
practitioners and performers. Yet, the introduction of such techniques in
artistic domains also revealed multiple limitations that escape current
evaluation methods used by scientists. Notably, most models are still unable to
generate content that lies outside of the domain defined by the training
dataset. In this paper, we propose an alternative prospective framework,
starting from a new general formulation of ML objectives, that we derive to
delineate possible implications and solutions that already exist in the ML
literature (notably for the audio and musical domain). We also discuss existing
relations between generative models and computational creativity and how our
framework could help address the lack of creativity in existing models.
Related papers
- Untapped Potential in Self-Optimization of Hopfield Networks: The Creativity of Unsupervised Learning [0.6144680854063939]
We argue that the Self-Optimization (SO) model satisfies the necessary and sufficient conditions of a creative process.
We conclude that the SO model allows for simulating and understanding the emergence of creative behaviors in artificial systems that learn.
arXiv Detail & Related papers (2024-12-10T11:58:39Z)
- Boosting Alignment for Post-Unlearning Text-to-Image Generative Models [55.82190434534429]
Large-scale generative models have shown impressive image-generation capabilities, propelled by massive data.
This often inadvertently leads to the generation of harmful or inappropriate content and raises copyright concerns.
We propose a framework that seeks an optimal model update at each unlearning iteration, ensuring monotonic improvement on both objectives.
arXiv Detail & Related papers (2024-12-09T21:36:10Z)
- Recommendation with Generative Models [35.029116616023586]
Generative models are AI models capable of creating new instances of data by learning and sampling from their statistical distributions.
These models have applications across various domains, such as image generation, text synthesis, and music composition.
In recommender systems, generative models, referred to as Gen-RecSys, improve the accuracy and diversity of recommendations.
arXiv Detail & Related papers (2024-09-18T18:29:15Z)
- Generative AI in Vision: A Survey on Models, Metrics and Applications [0.0]
Generative AI models have revolutionized various fields by enabling the creation of realistic and diverse data samples.
Among these models, diffusion models have emerged as a powerful approach for generating high-quality images, text, and audio.
This survey paper provides a comprehensive overview of generative AI diffusion and legacy models, focusing on their underlying techniques, applications across different domains, and their challenges.
arXiv Detail & Related papers (2024-02-26T07:47:12Z)
- Learning from models beyond fine-tuning [78.20895343699658]
Learn From Model (LFM) focuses on the research, modification, and design of foundation models (FM) based on the model interface.
The study of LFM techniques can be broadly categorized into five major areas: model tuning, model distillation, model reuse, meta learning and model editing.
This paper gives a comprehensive review of the current methods based on FM from the perspective of LFM.
arXiv Detail & Related papers (2023-10-12T10:20:36Z)
- ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints [56.824187892204314]
We present the task of creative text-to-image generation, where we seek to generate new members of a broad category.
We show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior.
We incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly more unique creations.
arXiv Detail & Related papers (2023-08-03T17:04:41Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
- Foundation models in brief: A historical, socio-technical focus [2.5991265608180396]
Foundation models can be disruptive for future AI development by scaling up deep learning.
Models achieve state-of-the-art performance on a variety of tasks in domains such as natural language processing and computer vision.
arXiv Detail & Related papers (2022-12-17T22:11:33Z)
- Creative divergent synthesis with generative models [3.655021726150369]
Machine learning approaches now achieve impressive generation capabilities in numerous domains such as image, audio or video.
We propose various perspectives on how this complicated goal could ever be achieved, and provide preliminary results on our novel training objective called Bounded Adversarial Divergence (BAD).
arXiv Detail & Related papers (2022-11-16T12:12:31Z)
- Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)