Linking Faces and Voices Across Languages: Insights from the FAME 2026 Challenge
- URL: http://arxiv.org/abs/2512.20376v1
- Date: Tue, 23 Dec 2025 14:00:34 GMT
- Title: Linking Faces and Voices Across Languages: Insights from the FAME 2026 Challenge
- Authors: Marta Moscati, Ahmed Abdullah, Muhammad Saad Saeed, Shah Nawaz, Rohan Kumar Das, Muhammad Zaigham Zaheer, Junaid Mir, Muhammad Haroon Yousaf, Khalid Mahmood Malik, Markus Schedl
- Abstract summary: The Face-Voice Association in Multilingual Environments (FAME) 2026 Challenge, held at ICASSP 2026, focuses on developing methods for face-voice association. This report provides a brief summary of the challenge.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Over half of the world's population is bilingual, and people often communicate in multilingual scenarios. The Face-Voice Association in Multilingual Environments (FAME) 2026 Challenge, held at ICASSP 2026, focuses on developing methods for face-voice association that remain effective when the language at test time differs from the one seen during training. This report provides a brief summary of the challenge.
Related papers
- RFOP: Rethinking Fusion and Orthogonal Projection for Face-Voice Association
The challenge introduces English-German face-voice pairs to be utilized in the evaluation phase. Our method performs favorably on the English-German data split and ranked 3rd in the FAME 2026 challenge with an EER of 33.1.
arXiv Detail & Related papers (2025-12-02T15:21:21Z) - A Bridge from Audio to Video: Phoneme-Viseme Alignment Allows Every Face to Speak Multiple Languages
Speech-driven talking face synthesis (TFS) focuses on generating facial animations from audio input. Current models perform well in English but unsatisfactorily in non-English languages, producing wrong mouth shapes and rigid facial expressions. We propose Multilingual Experts (MuEx), a novel framework featuring a Phoneme-Guided Mixture-of-Experts architecture.
arXiv Detail & Related papers (2025-10-08T03:46:39Z) - Face-voice Association in Multilingual Environments (FAME) 2026 Challenge Evaluation Plan
The Face-voice Association in Multilingual Environments (FAME) 2026 Challenge focuses on exploring face-voice association under a multilingual scenario. This report provides the details of the challenge, dataset, baseline models, and tasks for the FAME Challenge.
arXiv Detail & Related papers (2025-08-06T16:09:47Z) - SemEval-2025 Task 7: Multilingual and Crosslingual Fact-Checked Claim Retrieval
The rapid spread of online disinformation presents a global challenge, and machine learning has been widely explored as a potential solution. To address this gap, we conducted a shared task on multilingual claim retrieval at SemEval 2025. We report the best-performing systems as well as the most common and the most effective approaches across both subtracks.
arXiv Detail & Related papers (2025-05-15T23:04:46Z) - Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan
The Face-voice Association in Multilingual Environments (FAME) Challenge 2024 focuses on exploring face-voice association under the unique condition of a multilingual scenario.
This report provides the details of the challenge, dataset, baselines and task details for the FAME Challenge.
arXiv Detail & Related papers (2024-04-14T19:51:32Z) - Perception Test 2023: A Summary of the First Challenge And Outcome
The First Perception Test challenge was held as a half-day workshop alongside the IEEE/CVF International Conference on Computer Vision (ICCV) 2023.
The goal was to benchmark state-of-the-art video models on the recently proposed Perception Test benchmark.
We summarise in this report the task descriptions, metrics, baselines, and results.
arXiv Detail & Related papers (2023-12-20T15:12:27Z) - Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond
The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework. The challenge garnered 12 model submissions and 54 language corpora, resulting in a comprehensive benchmark encompassing 154 languages. The findings indicate that merely scaling models is not the definitive solution for multilingual speech tasks.
arXiv Detail & Related papers (2023-10-09T08:30:01Z) - ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation
We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models.
Our approach has demonstrated effectiveness in end-to-end speech-to-text translation tasks.
arXiv Detail & Related papers (2023-05-24T07:42:15Z) - ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition
This track was introduced in the Zero-Resource Speech challenge, 2021 edition, 2nd round.
We motivate the new track and discuss participation rules in detail.
We also present the two baseline systems that were developed for this track.
arXiv Detail & Related papers (2021-07-14T08:29:07Z) - X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.