'Studies for': A Human-AI Co-Creative Sound Artwork Using a Real-time Multi-channel Sound Generation Model
- URL: http://arxiv.org/abs/2510.25228v2
- Date: Fri, 31 Oct 2025 05:08:09 GMT
- Title: 'Studies for': A Human-AI Co-Creative Sound Artwork Using a Real-time Multi-channel Sound Generation Model
- Authors: Chihiro Nagashima, Akira Takahashi, Zhi Zhong, Shusuke Takahashi, Yuki Mitsufuji
- Abstract summary: Studies for is a generative sound installation developed in collaboration with sound artist Evala. The work is grounded in the concept of a "new form of archive." We propose a Human-AI co-creation framework for effectively incorporating sound generation AI models into the sound art creation process.
- Score: 32.684356791049986
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper explores the integration of AI technologies into the artistic workflow through the creation of Studies for, a generative sound installation developed in collaboration with sound artist Evala (https://www.ntticc.or.jp/en/archive/works/studies-for/). The installation employs SpecMaskGIT, a lightweight yet high-quality sound generation AI model, to generate and play back eight-channel sound in real time, creating an immersive auditory experience over the course of a three-month exhibition. The work is grounded in the concept of a "new form of archive," which aims to preserve the artistic style of an artist while expanding beyond the artist's past artworks through the continued generation of new sound elements. This speculative approach to archival preservation is facilitated by training the AI model on a dataset consisting of over 200 hours of Evala's past sound artworks. By addressing key requirements in the co-creation of art using AI, this study highlights the value of the following aspects: (1) integrating artist feedback, (2) using datasets derived from an artist's past works, and (3) ensuring the inclusion of unexpected, novel outputs. In Studies for, the model was designed to reflect the artist's artistic identity while generating new, previously unheard sounds, making it a fitting realization of the concept of "a new form of archive." We propose a Human-AI co-creation framework for effectively incorporating sound generation AI models into the sound art creation process and suggest new possibilities for creating and archiving sound art that extend an artist's work beyond their physical existence. Demo page: https://sony.github.io/studies-for/
Related papers
- Art2Mus: Artwork-to-Music Generation via Visual Conditioning and Large-Scale Cross-Modal Alignment [8.468469176803241]
We introduce ArtSound, a large-scale dataset of 105,884 artwork-music pairs enriched with dual-modality captions. We propose ArtToMus, the first framework explicitly designed for direct artwork-to-music generation. ArtToMus maps digitized artworks to music without image-to-text translation or language-based semantic supervision.
arXiv Detail & Related papers (2026-02-19T18:23:58Z)
- The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity [59.78509280246215]
Aria-Duet is an interactive system facilitating a real-time musical duet between a human pianist and Aria, a state-of-the-art generative model. We analyze the system's output from a musicological perspective, finding the model can maintain stylistic semantics and develop coherent phrasal ideas.
arXiv Detail & Related papers (2025-11-03T15:26:01Z)
- ArtistAuditor: Auditing Artist Style Pirate in Text-to-Image Generation Models [61.55816738318699]
We propose a novel method for data-use auditing in the text-to-image generation model. ArtistAuditor employs a style extractor to obtain the multi-granularity style representations and treats artworks as samplings of an artist's style. The experimental results on six combinations of models and datasets show that ArtistAuditor can achieve high AUC values.
arXiv Detail & Related papers (2025-04-17T16:15:38Z)
- Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists [1.5296069874080693]
We compare the artistic capabilities of artists and laypeople using generative AI. On average, artists produced more faithful and creative outputs than their lay counterparts. While AI may ease content creation, professional expertise is still valuable.
arXiv Detail & Related papers (2025-01-21T18:53:21Z)
- Art2Mus: Bridging Visual Arts and Music through Cross-Modal Generation [8.185890043443601]
We introduce $\mathcal{A}\textit{rt2}\mathcal{M}\textit{us}$, a novel model designed to create music from digitized artworks or text inputs.
Experimental results demonstrate that $\mathcal{A}\textit{rt2}\mathcal{M}\textit{us}$ can generate music that resonates with the input stimuli.
arXiv Detail & Related papers (2024-10-07T10:48:08Z)
- Diffusion-Based Visual Art Creation: A Survey and New Perspectives [51.522935314070416]
This survey explores the emerging realm of diffusion-based visual art creation, examining its development from both artistic and technical perspectives.
Our findings reveal how artistic requirements are transformed into technical challenges and highlight the design and application of diffusion-based methods within visual art creation.
We aim to shed light on the mechanisms through which AI systems emulate and, possibly, enhance human capacities in artistic perception and creativity.
arXiv Detail & Related papers (2024-08-22T04:49:50Z)
- Equivalence: An analysis of artists' roles with Image Generative AI from Conceptual Art perspective through an interactive installation design practice [16.063735487844628]
This study explores how artists interact with advanced text-to-image Generative AI models.
To exemplify this framework, a case study titled "Equivalence" converts users' speech input into continuously evolving paintings.
This work aims to broaden our understanding of artists' roles and foster a deeper appreciation for the creative aspects inherent in artwork created with Image Generative AI.
arXiv Detail & Related papers (2024-04-29T02:45:23Z)
- CreativeSynth: Cross-Art-Attention for Artistic Image Synthesis with Multimodal Diffusion [73.08710648258985]
Key painting attributes including layout, perspective, shape, and semantics often cannot be conveyed and expressed through style transfer. Large-scale pretrained text-to-image generation models have demonstrated their capability to synthesize a vast amount of high-quality images. Our main novel idea is to integrate multimodal semantic information as a synthesis guide into artworks, rather than transferring style to the real world.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- Novel-View Acoustic Synthesis [140.1107768313269]
We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound observed at a source viewpoint, can we synthesize the sound of that scene from an unseen target viewpoint?
We propose a neural rendering approach: Visually-Guided Acoustic Synthesis (ViGAS) network that learns to synthesize the sound of an arbitrary point in space.
arXiv Detail & Related papers (2023-01-20T18:49:58Z)
- Pathway to Future Symbiotic Creativity [76.20798455931603]
We propose a classification of the creative system with a hierarchy of 5 classes, showing the pathway of creativity evolving from a mimic-human artist to a Machine artist in its own right.
In art creation, it is necessary for machines to understand humans' mental states, including desires, appreciation, and emotions; humans also need to understand machines' creative capabilities and limitations.
We propose a novel framework for building future Machine artists, which comes with the philosophy that a human-compatible AI system should be based on the "human-in-the-loop" principle.
arXiv Detail & Related papers (2022-08-18T15:12:02Z)
- Generating Music and Generative Art from Brain activity [0.0]
This research work introduces a computational system for creating generative art using a Brain-Computer Interface (BCI).
The generated artwork uses brain signals and concepts of geometry, color and spatial location to give complexity to the autonomous construction.
arXiv Detail & Related papers (2021-08-09T19:33:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.