Related papers: Go witheFlow: Real-time Emotion Driven Audio Effects Modulation

URL: http://arxiv.org/abs/2510.02171v1
Date: Thu, 02 Oct 2025 16:23:47 GMT
Title: Go witheFlow: Real-time Emotion Driven Audio Effects Modulation
Authors: Edmund Dervakos, Spyridon Kantarelis, Vassilis Lyberatos, Jason Liartis, Giorgos Stamou
Abstract summary: We introduce the witheFlow system, designed to enhance real-time music performance by automatically modulating audio effects. The system, currently in a proof-of-concept phase, is designed to be lightweight, able to run locally on a laptop, and is open-source.
Score: 9.748164997490056
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Music performance is a distinctly human activity, intrinsically linked to the performer's ability to convey, evoke, or express emotion. Machines cannot perform music in the human sense; they can produce, reproduce, execute, or synthesize music, but they lack the capacity for affective or emotional experience. As such, music performance is an ideal candidate through which to explore aspects of collaboration between humans and machines. In this paper, we introduce the witheFlow system, designed to enhance real-time music performance by automatically modulating audio effects based on features extracted from both biosignals and the audio itself. The system, currently in a proof-of-concept phase, is designed to be lightweight, able to run locally on a laptop, and is open-source given the availability of a compatible Digital Audio Workstation and sensors.

Related papers

SyMuPe: Affective and Controllable Symbolic Music Performance [0.00746020873338928]
We present SyMuPe, a novel framework for developing and training affective and controllable piano performance models. Our flagship model, PianoFlow, uses conditional flow matching trained to solve diverse multi-mask performance inpainting tasks. For emotion control, we present and analyze samples generated under different text conditioning scenarios.
arXiv Detail & Related papers (2025-11-05T12:42:08Z)
The Ghost in the Keys: A Disklavier Demo for Human-AI Musical Co-Creativity [59.78509280246215]
Aria-Duet is an interactive system facilitating a real-time musical duet between a human pianist and Aria, a state-of-the-art generative model. We analyze the system's output from a musicological perspective, finding the model can maintain stylistic semantics and develop coherent phrasal ideas.
arXiv Detail & Related papers (2025-11-03T15:26:01Z)
A Real-Time Gesture-Based Control Framework [2.432598153985671]
We introduce a real-time, human-in-the-loop gesture control framework. It can dynamically adapt audio and music based on human movement. The system is designed for live performances, interactive installations, and personal use.
arXiv Detail & Related papers (2025-04-28T03:57:28Z)
MuVi: Video-to-Music Generation with Semantic Alignment and Rhythmic Synchronization [52.498942604622165]
This paper presents MuVi, a framework to generate music that aligns with video content. MuVi analyzes video content through a specially designed visual adaptor to extract contextually and temporally relevant features. We show that MuVi demonstrates superior performance in both audio quality and temporal synchronization.
arXiv Detail & Related papers (2024-10-16T18:44:56Z)
A Survey of Foundation Models for Music Understanding [60.83532699497597]
This work is one of the early reviews of the intersection of AI techniques and music understanding. We investigated, analyzed, and tested recent large-scale music foundation models in respect of their music comprehension abilities.
arXiv Detail & Related papers (2024-09-15T03:34:14Z)
Bridging Paintings and Music -- Exploring Emotion based Music Generation through Paintings [10.302353984541497]
This research develops a model capable of generating music that resonates with the emotions depicted in visual arts. Addressing the scarcity of aligned art and music data, we curated the Emotion Painting Music dataset. Our dual-stage framework converts images to text descriptions of emotional content and then transforms these descriptions into music, facilitating efficient learning with minimal data.
arXiv Detail & Related papers (2024-09-12T08:19:25Z)
Emotion Manipulation Through Music -- A Deep Learning Interactive Visual Approach [0.0]
We introduce a novel way to manipulate the emotional content of a song using AI tools. Our goal is to achieve the desired emotion while leaving the original melody as intact as possible. This research may contribute to on-demand custom music generation, the automated remixing of existing work, and music playlists tuned for emotional progression.
arXiv Detail & Related papers (2024-06-12T20:12:29Z)
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models [57.47799823804519]
We are inspired by how musicians compose music not just from a movie script, but also through visualizations. We propose MeLFusion, a model that can effectively use cues from a textual description and the corresponding image to synthesize music. Our exhaustive experimental evaluation suggests that adding visual information to the music synthesis pipeline significantly improves the quality of generated music.
arXiv Detail & Related papers (2024-06-07T06:38:59Z)
Affective Idiosyncratic Responses to Music [63.969810774018775]
We develop methods to measure affective responses to music from over 403M listener comments on a Chinese social music platform. We test for musical, lyrical, contextual, demographic, and mental health effects that drive listener affective responses.
arXiv Detail & Related papers (2022-10-17T19:57:46Z)
MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling [6.256118777336895]
Musical expression requires control of both what notes are played, and how they are performed. We introduce MIDI-DDSP, a hierarchical model of musical instruments that enables both realistic neural audio synthesis and detailed user control. We demonstrate that this hierarchy can reconstruct high-fidelity audio, accurately predict performance attributes for a note sequence, independently manipulate the attributes of a given performance, and as a complete system, generate realistic audio from a novel note sequence.
arXiv Detail & Related papers (2021-12-17T04:15:42Z)
A Human-Computer Duet System for Music Performance [7.777761975348974]
We create a virtual violinist who can collaborate with a human pianist to perform chamber music automatically without any intervention. The system incorporates the techniques from various fields, including real-time music tracking, pose estimation, and body movement generation. The proposed system has been validated in public concerts.
arXiv Detail & Related papers (2020-09-16T17:19:23Z)
Foley Music: Learning to Generate Music from Videos [115.41099127291216]
Foley Music is a system that can synthesize plausible music for a silent video clip about people playing musical instruments. We first identify two key intermediate representations for a successful video to music generator: body keypoints from videos and MIDI events from audio recordings. We present a Graph-Transformer framework that can accurately predict MIDI event sequences in accordance with the body movements.
arXiv Detail & Related papers (2020-07-21T17:59:06Z)
Music Gesture for Visual Sound Separation [121.36275456396075]
"Music Gesture" is a keypoint-based structured representation to explicitly model the body and finger movements of musicians when they perform music. We first adopt a context-aware graph network to integrate visual semantic context with body dynamics, and then apply an audio-visual fusion model to associate body movements with the corresponding audio signals.
arXiv Detail & Related papers (2020-04-20T17:53:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.