Shimon the Rapper: A Real-Time System for Human-Robot Interactive Rap Battles
- URL: http://arxiv.org/abs/2009.09234v1
- Date: Sat, 19 Sep 2020 14:04:54 GMT
- Title: Shimon the Rapper: A Real-Time System for Human-Robot Interactive Rap Battles
- Authors: Richard Savery, Lisa Zahray, Gil Weinberg
- Abstract summary: We present a system for real-time lyrical improvisation between a human and a robot in the style of hip hop.
Our system takes vocal input from a human rapper, analyzes the semantic meaning, and generates a response that is rapped back by a robot over a musical groove.
- Score: 1.7403133838762446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a system for real-time lyrical improvisation between a human and a
robot in the style of hip hop. Our system takes vocal input from a human
rapper, analyzes the semantic meaning, and generates a response that is rapped
back by a robot over a musical groove. Previous work on real-time interactive
music systems has largely focused on instrumental output; vocal interaction
with robots has been explored, but not in a musical context. Our
generative system includes custom methods for censorship, voice, rhythm,
rhyming and a novel deep learning pipeline based on phoneme embeddings. The rap
performances are accompanied by synchronized robotic gestures and mouth
movements. Key technical challenges overcome in the system were developing
rhymes, performing with low latency, and censoring the dataset. We
evaluated several aspects of the system through a survey of videos and sample
text output. Analysis of comments showed that the overall perception of the
system was positive. The model trained on our hip hop dataset was rated
significantly higher than the model trained on our metal dataset in coherence,
rhyme quality, and enjoyment. Participants preferred outputs generated from a
given input phrase over outputs generated from unknown keywords, indicating
that the system successfully relates its output to its input.
Related papers
- Creativity and Visual Communication from Machine to Musician: Sharing a Score through a Robotic Camera [4.9485163144728235]
This paper explores the integration of visual communication and musical interaction by implementing a robotic camera within a "Guided Harmony" musical game.
The robotic system interprets and responds to nonverbal cues from musicians, creating a collaborative and adaptive musical experience.
arXiv Detail & Related papers (2024-09-09T16:34:36Z)
- Affective social anthropomorphic intelligent system [1.7849339006560665]
This research proposes an anthropomorphic intelligent system that can hold a human-like conversation with emotion and personality.
A voice style transfer method is also proposed to map the attributes of a specific emotion.
arXiv Detail & Related papers (2023-04-19T18:24:57Z)
- Play it by Ear: Learning Skills amidst Occlusion through Audio-Visual Imitation Learning [62.83590925557013]
We learn a set of challenging partially-observed manipulation tasks from visual and audio inputs.
Our proposed system learns these tasks by combining offline imitation learning from tele-operated demonstrations and online finetuning.
In a set of simulated tasks, we find that our system benefits from using audio, and that by using online interventions we are able to improve the success rate of offline imitation learning by 20%.
arXiv Detail & Related papers (2022-05-30T04:52:58Z)
- Quantized GAN for Complex Music Generation from Dance Videos [48.196705493763986]
We present Dance2Music-GAN (D2M-GAN), a novel adversarial multi-modal framework that generates musical samples conditioned on dance videos.
Our proposed framework takes dance video frames and human body motion as input, and learns to generate music samples that plausibly accompany the corresponding input.
arXiv Detail & Related papers (2022-04-01T17:53:39Z)
- Youling: an AI-Assisted Lyrics Creation System [72.00418962906083]
This paper demonstrates Youling, an AI-assisted lyrics creation system designed to collaborate with music creators.
In the lyrics generation process, Youling supports a traditional one-pass full-text generation mode as well as an interactive generation mode.
The system also provides a revision module which enables users to revise undesired sentences or words of lyrics repeatedly.
arXiv Detail & Related papers (2022-01-18T03:57:04Z)
- Responsive Listening Head Generation: A Benchmark Dataset and Baseline [58.168958284290156]
We define the responsive listening head generation task as the synthesis of a non-verbal head with motions and expressions reacting to multiple inputs.
Unlike speech-driven gesture or talking head generation, we introduce more modalities in this task, hoping to benefit several research fields.
arXiv Detail & Related papers (2021-12-27T07:18:50Z)
- DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling [102.50840749005256]
Previous works for rap generation focused on rhyming lyrics but ignored rhythmic beats, which are important for rap performance.
In this paper, we develop DeepRapper, a Transformer-based rap generation system that can model both rhymes and rhythms.
arXiv Detail & Related papers (2021-07-05T09:01:46Z)
- LyricJam: A system for generating lyrics for live instrumental music [11.521519161773288]
We describe a real-time system that receives a live audio stream from a jam session and generates lyric lines that are congruent with the live music being played.
Two novel approaches are proposed to align the learned latent spaces of audio and text representations.
arXiv Detail & Related papers (2021-06-03T16:06:46Z)
- Melody-Conditioned Lyrics Generation with SeqGANs [81.2302502902865]
We propose an end-to-end melody-conditioned lyrics generation system based on Sequence Generative Adversarial Networks (SeqGAN).
We show that the input conditions have no negative impact on the evaluation metrics while enabling the network to produce more meaningful results.
arXiv Detail & Related papers (2020-10-28T02:35:40Z)
- Rapformer: Conditional Rap Lyrics Generation with Denoising Autoencoders [14.479052867589417]
We develop a method for synthesizing a rap verse based on the content of any text (e.g., a news article).
Our method, called Rapformer, is based on training a Transformer-based denoising autoencoder to reconstruct rap lyrics from content words extracted from the lyrics.
Rapformer is capable of generating technically fluent verses that offer a good trade-off between content preservation and style transfer.
arXiv Detail & Related papers (2020-04-08T12:24:10Z)
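The Rapformer entry above describes training a denoising autoencoder to reconstruct rap lyrics from the content words of those lyrics. As a hedged illustration of just the training-pair construction step (not the released model, and with a deliberately tiny stopword list), the data preparation could look like this:

```python
# Hedged sketch (not the Rapformer release): building denoising training
# pairs as described in the abstract -- content words extracted from a
# lyric line become the encoder input, the original line the decoder
# target. STOPWORDS here is a tiny illustrative set.
STOPWORDS = {"the", "a", "an", "i", "we", "you", "my", "to", "in", "on",
             "and", "is", "it", "of", "so", "with"}

def content_words(line: str) -> list[str]:
    """Keep the non-stopword tokens that carry the line's content."""
    return [w for w in line.lower().split() if w not in STOPWORDS]

def make_pair(line: str) -> tuple[str, str]:
    """(source, target) pair for a denoising autoencoder."""
    return " ".join(content_words(line)), line

lyrics = [
    "I keep the rhythm in my soul",
    "We rise above and take control",
]
for src, tgt in map(make_pair, lyrics):
    print(f"encoder in : {src}")
    print(f"decoder out: {tgt}")
```

At inference time, content words extracted from any input text (such as a news article) would be fed to the trained encoder, and the decoder would render them in rap style, which is the content-to-style transfer trade-off the abstract refers to.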
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.