REACT 2025: the Third Multiple Appropriate Facial Reaction Generation Challenge
- URL: http://arxiv.org/abs/2505.17223v1
- Date: Thu, 22 May 2025 18:55:23 GMT
- Title: REACT 2025: the Third Multiple Appropriate Facial Reaction Generation Challenge
- Authors: Siyang Song, Micol Spitale, Xiangyu Kong, Hengde Zhu, Cheng Luo, Cristina Palmero, German Barquero, Sergio Escalera, Michel Valstar, Mohamed Daoudi, Tobias Baur, Fabien Ringeval, Andrew Howes, Elisabeth Andre, Hatice Gunes
- Abstract summary: In dyadic interactions, a broad spectrum of human facial reactions might be appropriate for responding to each human speaker behaviour. We are proposing the REACT 2025 challenge encouraging the development and benchmarking of Machine Learning (ML) models. We provide challenge participants with the first natural and large-scale multi-modal MAFRG dataset (called MARS) recording 137 human-human dyadic interactions.
- Score: 42.33323347077101
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In dyadic interactions, a broad spectrum of human facial reactions might be appropriate for responding to each human speaker behaviour. Following the successful organisation of the REACT 2023 and REACT 2024 challenges, we propose the REACT 2025 challenge, which encourages the development and benchmarking of Machine Learning (ML) models that generate multiple appropriate, diverse, realistic and synchronised human-style facial reactions expressed by human listeners in response to an input stimulus (i.e., the audio-visual behaviours expressed by their corresponding speakers). As a key part of the challenge, we provide participants with the first natural and large-scale multi-modal Multiple Appropriate Facial Reaction Generation (MAFRG) dataset, called MARS, which records 137 human-human dyadic interactions comprising a total of 2856 interaction sessions covering five different topics. In addition, this paper presents the challenge guidelines and the performance of our baselines on the two proposed sub-challenges: Offline MAFRG and Online MAFRG. The challenge baseline code is publicly available at https://github.com/reactmultimodalchallenge/baseline_react2025
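The distinction between the two sub-challenges can be made concrete with a minimal sketch of the interfaces a participant model might expose. The function names, signatures and the 25-dimensional per-frame facial representation below are illustrative assumptions rather than the actual baseline API: the offline setting sees the whole speaker clip and returns several alternative reaction sequences, while the online setting must emit each reaction frame using only the speaker behaviour observed so far.

```python
# Hypothetical sketch of the two REACT 2025 sub-challenge settings.
# All names, signatures and the 25-dimensional per-frame facial representation
# are illustrative assumptions, not the actual baseline API.
from typing import List

import numpy as np


def generate_offline_reactions(speaker_audio: np.ndarray,
                               speaker_video: np.ndarray,
                               num_samples: int = 10) -> List[np.ndarray]:
    """Offline MAFRG: the full speaker clip is available before generation.

    Returns `num_samples` alternative listener reaction sequences, because
    several different facial reactions can be appropriate for the same
    speaker behaviour.
    """
    num_frames = speaker_video.shape[0]  # assume frames along the first axis
    reactions = []
    for _ in range(num_samples):
        # Placeholder output: a real model would condition on both modalities
        # and emit per-frame facial coefficients (e.g. 3DMM/AU parameters).
        reactions.append(np.zeros((num_frames, 25)))
    return reactions


def generate_online_reaction_frame(audio_so_far: np.ndarray,
                                   video_so_far: np.ndarray) -> np.ndarray:
    """Online MAFRG: only speaker frames observed so far may be used, so the
    listener reaction is produced frame by frame with low latency."""
    return np.zeros(25)  # placeholder facial-coefficient vector for this frame
```

Returning several sequences in the offline case reflects that the challenge evaluates both appropriateness and diversity across the multiple reactions generated for the same speaker input.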
Related papers
- OmniResponse: Online Multimodal Conversational Response Generation in Dyadic Interactions [50.705439960008235]
We introduce Online Multimodal Conversational Response Generation (OMCRG), a novel task that aims to generate synchronized verbal and non-verbal listener feedback online. We propose OmniResponse, a Multimodal Large Language Model (MLLM) that autoregressively generates high-quality multi-modal listener responses. We present ResponseNet, a new dataset comprising 696 high-quality dyadic interactions featuring synchronized split-screen videos, multichannel audio, transcripts, and facial behavior annotations.
arXiv Detail & Related papers (2025-05-27T20:12:46Z)
- Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning [55.127202990679976]
We introduce the MERR dataset, containing 28,618 coarse-grained and 4,487 fine-grained annotated samples across diverse emotional categories.
This dataset enables models to learn from varied scenarios and generalize to real-world applications.
We propose Emotion-LLaMA, a model that seamlessly integrates audio, visual, and textual inputs through emotion-specific encoders.
arXiv Detail & Related papers (2024-06-17T03:01:22Z)
- The MuSe 2024 Multimodal Sentiment Analysis Challenge: Social Perception and Humor Recognition [64.5207572897806]
The Multimodal Sentiment Analysis Challenge (MuSe) 2024 addresses two contemporary multimodal affect and sentiment analysis problems.
In the Social Perception Sub-Challenge (MuSe-Perception), participants will predict 16 different social attributes of individuals.
The Cross-Cultural Humor Detection Sub-Challenge (MuSe-Humor) dataset expands upon the Passau Spontaneous Football Coach Humor dataset.
arXiv Detail & Related papers (2024-06-11T22:26:20Z)
- REACT 2024: the Second Multiple Appropriate Facial Reaction Generation Challenge [36.84914349494818]
In dyadic interactions, humans communicate their intentions and state of mind using verbal and non-verbal cues.
Developing a machine learning (ML) model that can automatically generate multiple appropriate, diverse, realistic and synchronised human facial reactions is a challenging task.
This paper presents the guidelines of the REACT 2024 challenge and the dataset utilized in the challenge.
arXiv Detail & Related papers (2024-01-10T14:01:51Z)
- The GENEA Challenge 2023: A large scale evaluation of gesture generation models in monadic and dyadic settings [8.527975206444742]
This paper reports on the GENEA Challenge 2023, in which participating teams built speech-driven gesture-generation systems.
We evaluated 12 submissions and 2 baselines together with held-out motion-capture data in several large-scale user studies.
We found a large span in human-likeness between challenge submissions, with a few systems rated close to human mocap.
arXiv Detail & Related papers (2023-08-24T08:42:06Z)
- MRecGen: Multimodal Appropriate Reaction Generator [31.60823534748163]
This paper proposes the first multiple and multimodal (verbal and nonverbal) appropriate human reaction generation framework.
It can be applied to various human-computer interaction scenarios by generating appropriate virtual agent/robot behaviours.
arXiv Detail & Related papers (2023-07-05T19:07:00Z)
- REACT2023: the first Multi-modal Multiple Appropriate Facial Reaction Generation Challenge [28.777465429875303]
The Multi-modal Multiple Appropriate Facial Reaction Generation Challenge (REACT2023) is the first competition event focused on evaluating multimedia processing and machine learning techniques for generating human-appropriate facial reactions in various dyadic interaction scenarios.
The goal of the challenge is to provide the first benchmark test set for multi-modal information processing and to foster collaboration among the audio, visual, and audio-visual affective computing communities.
arXiv Detail & Related papers (2023-06-11T04:15:56Z)
- The MuSe 2023 Multimodal Sentiment Analysis Challenge: Mimicked Emotions, Cross-Cultural Humour, and Personalisation [69.13075715686622]
MuSe 2023 is a set of shared tasks addressing three different contemporary multimodal affect and sentiment analysis problems.
MuSe 2023 seeks to bring together a broad audience from different research communities.
arXiv Detail & Related papers (2023-05-05T08:53:57Z)
- Multimodal Feature Extraction and Fusion for Emotional Reaction Intensity Estimation and Expression Classification in Videos with Transformers [47.16005553291036]
We present our solutions to the two sub-challenges of Affective Behavior Analysis in the wild (ABAW) 2023.
For the Expression Classification Challenge, we propose a streamlined approach that handles the challenges of classification effectively.
By studying, analyzing, and combining the extracted multimodal features, we significantly enhance the model's accuracy for sentiment prediction in a multimodal context.
arXiv Detail & Related papers (2023-03-16T09:03:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.