MRecGen: Multimodal Appropriate Reaction Generator
- URL: http://arxiv.org/abs/2307.02609v1
- Date: Wed, 5 Jul 2023 19:07:00 GMT
- Title: MRecGen: Multimodal Appropriate Reaction Generator
- Authors: Jiaqi Xu, Cheng Luo, Weicheng Xie, Linlin Shen, Xiaofeng Liu, Lu Liu,
Hatice Gunes, Siyang Song
- Abstract summary: This paper proposes the first multiple and multimodal (verbal and nonverbal) appropriate human reaction generation framework.
It can be applied to various human-computer interaction scenarios by generating appropriate virtual agent/robot behaviours.
- Score: 31.60823534748163
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Verbal and non-verbal human reaction generation is a challenging task, as
different reactions could be appropriate for responding to the same behaviour.
This paper proposes the first multiple and multimodal (verbal and nonverbal)
appropriate human reaction generation framework that can generate appropriate
and realistic human-style reactions (displayed in the form of synchronised
text, audio and video streams) in response to an input user behaviour. This
novel technique can be applied to various human-computer interaction scenarios
by generating appropriate virtual agent/robot behaviours. Our demo is available
at https://github.com/SSYSteve/MRecGen.
Related papers
- EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning [10.266351600604612]
This paper introduces a framework, called EMOTION, for generating expressive motion sequences in humanoid robots.
We conduct online user studies comparing the naturalness and understandability of the motions generated by EMOTION and its human-feedback version, EMOTION++.
arXiv Detail & Related papers (2024-10-30T17:22:45Z)
- ReGenNet: Towards Human Action-Reaction Synthesis [87.57721371471536]
We analyze the asymmetric, dynamic, synchronous, and detailed nature of human-human interactions.
We propose the first multi-setting human action-reaction benchmark to generate human reactions conditioned on given human actions.
arXiv Detail & Related papers (2024-03-18T15:33:06Z)
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions [66.87211993793807]
We present ReMoS, a denoising diffusion-based model that synthesizes the full-body motion of one person in a two-person interaction scenario.
We demonstrate ReMoS across challenging two-person scenarios such as pair dancing, Ninjutsu, kickboxing, and acrobatics.
We also contribute the ReMoCap dataset of two-person interactions containing full-body and finger motions.
arXiv Detail & Related papers (2023-11-28T18:59:52Z)
- ReactFace: Online Multiple Appropriate Facial Reaction Generation in Dyadic Interactions [46.66378299720377]
In dyadic interaction, predicting the listener's facial reactions is challenging as different reactions could be appropriate in response to the same speaker's behaviour.
This paper reformulates the task as an extrapolation or prediction problem, and proposes a novel framework (called ReactFace) to generate multiple different but appropriate facial reactions.
arXiv Detail & Related papers (2023-05-25T05:55:53Z)
- Reversible Graph Neural Network-based Reaction Distribution Learning for Multiple Appropriate Facial Reactions Generation [22.579200870471475]
This paper proposes the first multiple appropriate facial reaction generation framework.
It re-formulates the one-to-many mapping facial reaction generation problem as a one-to-one mapping problem.
Experimental results demonstrate that our approach outperforms existing models in generating more appropriate, realistic, and synchronized facial reactions.
arXiv Detail & Related papers (2023-05-24T15:56:26Z)
- Multiple Appropriate Facial Reaction Generation in Dyadic Interaction Settings: What, Why and How? [11.130984858239412]
This paper defines the Multiple Appropriate Reaction Generation task for the first time in the literature.
It then proposes a new set of objective evaluation metrics to evaluate the appropriateness of the generated reactions.
The paper subsequently introduces a framework to predict, generate, and evaluate multiple appropriate facial reactions.
arXiv Detail & Related papers (2023-02-13T16:49:27Z)
- TEMOS: Generating diverse human motions from textual descriptions [53.85978336198444]
We address the problem of generating diverse 3D human motions from textual descriptions.
We propose TEMOS, a text-conditioned generative model leveraging variational autoencoder (VAE) training with human motion data.
We show that the TEMOS framework can produce both skeleton-based animations, as in prior work, as well as more expressive SMPL body motions.
arXiv Detail & Related papers (2022-04-25T14:53:06Z)
- Responsive Listening Head Generation: A Benchmark Dataset and Baseline [58.168958284290156]
We define the responsive listening head generation task as the synthesis of non-verbal head motions and expressions in reaction to multiple inputs.
Unlike speech-driven gesture or talking-head generation, we introduce more modalities in this task, hoping to benefit several research fields.
arXiv Detail & Related papers (2021-12-27T07:18:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.