Related papers: ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024

ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024

URL: http://arxiv.org/abs/2407.12038v2
Date: Wed, 31 Jul 2024 14:23:00 GMT
Title: ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024
Authors: Ruibo Fu, Rui Liu, Chunyu Qiang, Yingming Gao, Yi Lu, Shuchen Shi, Tao Wang, Ya Li, Zhengqi Wen, Chen Zhang, Hui Bu, Yukun Liu, Xin Qi, Guanjun Li,
Abstract summary: TheICAGC 2024 challenge aims to enhance the persuasiveness and acceptability of synthesized audio. A total of 19 teams have registered for the challenge, and the results of the competition and the competition are described in this paper.
Score: 32.96984318966757
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: The Inspirational and Convincing Audio Generation Challenge 2024 (ICAGC 2024) is part of the ISCSLP 2024 Competitions and Challenges track. While current text-to-speech (TTS) technology can generate high-quality audio, its ability to convey complex emotions and controlled detail content remains limited. This constraint leads to a discrepancy between the generated audio and human subjective perception in practical applications like companion robots for children and marketing bots. The core issue lies in the inconsistency between high-quality audio generation and the ultimate human subjective experience. Therefore, this challenge aims to enhance the persuasiveness and acceptability of synthesized audio, focusing on human alignment convincing and inspirational audio generation. A total of 19 teams have registered for the challenge, and the results of the competition and the competition are described in this paper.

Related papers

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge [102.84031769492708]
This task defines three QA subsets to test audio-language models on interactive question-answering over diverse acoustic scenes.<n>Preliminary results on the development set are compared, showing strong variation across models and subsets.<n>This challenge aims to advance the audio understanding and reasoning capabilities of audio-language models toward human-level acuity.
arXiv Detail & Related papers (2025-05-12T09:04:16Z)
Sound Scene Synthesis at the DCASE 2024 Challenge [8.170174172545831]
This paper presents Task 7 at the DCASE 2024 Challenge: sound scene synthesis. Recent advances in sound synthesis and generative models have enabled the creation of realistic and diverse audio content. We introduce a standardized evaluation framework for comparing different sound scene synthesis systems, incorporating both objective and subjective metrics.
arXiv Detail & Related papers (2025-01-15T05:15:54Z)
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings [18.994388357437924]
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge aims to benchmark and advance zero-shot spontaneous style voice cloning. This paper details the data, tracks, submitted systems, evaluation results, and findings.
arXiv Detail & Related papers (2024-10-31T09:39:49Z)
Overview of AI-Debater 2023: The Challenges of Argument Generation Tasks [62.443665295250035]
We present the results of the AI-Debater 2023 Challenge held by the Chinese Conference on Affect Computing (CCAC 2023) In total, 32 competing teams register for the challenge, from which we received 11 successful submissions.
arXiv Detail & Related papers (2024-07-20T10:13:54Z)
SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan [44.260755521474735]
"SVDD Challenge" is the first research challenge focusing on SVDD for lab-controlled and in-the-wild bonafide and deepfake singing voice recordings. The challenge will be held in conjunction with the 2024 IEEE Spoken Language Technology Workshop (SLT 2024)
arXiv Detail & Related papers (2024-05-08T17:40:12Z)
NTIRE 2024 Quality Assessment of AI-Generated Content Challenge [141.37864527005226]
The challenge is divided into the image track and the video track. The winning methods in both tracks have demonstrated superior prediction performance on AIGC.
arXiv Detail & Related papers (2024-04-25T15:36:18Z)
STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization [3.9886149789339327]
This report introduces our novel method named STHG for the Audio-Visual Diarization task of the Ego4D Challenge 2023. Our key innovation is that we model all the speakers in a video using a single, unified heterogeneous graph learning framework. Our final method obtains 61.1% DER on the test set of Ego4D, which significantly outperforms all the baselines as well as last year's winner.
arXiv Detail & Related papers (2023-06-18T17:55:02Z)
Team AcieLee: Technical Report for EPIC-SOUNDS Audio-Based Interaction Recognition Challenge 2023 [8.699868810184752]
The task is to classify the audio caused by interactions between objects, or from events of the camera wearer. We conducted exhaustive experiments and found learning rate step decay, backbone frozen, label smoothing and focal loss contribute most to the performance improvement. This proposed method allowed us to achieve 3rd place in the CVPR 2023 workshop of EPIC-SOUNDS Audio-Based Interaction Recognition Challenge.
arXiv Detail & Related papers (2023-06-15T09:49:07Z)
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge [95.6159736804855]
The VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22) was held in conjunction with INTERSPEECH 2022. The goal of this challenge was to evaluate how well state-of-the-art speaker recognition systems can diarise and recognise speakers from speech obtained "in the wild"
arXiv Detail & Related papers (2023-02-20T19:27:14Z)
NTIRE 2022 Challenge on Stereo Image Super-Resolution: Methods and Results [116.8625268729599]
NTIRE challenge has 1 track aiming at the stereo image super-resolution problem under a standard bicubic degradation. In total, 238 participants were successfully registered, and 21 teams competed in the final testing phase. This challenge establishes a new benchmark for stereo image SR.
arXiv Detail & Related papers (2022-04-20T02:55:37Z)
VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge [99.82500204110015]
We held the second installment of the VoxCeleb Speaker Recognition Challenge in conjunction with Interspeech 2020. The goal of this challenge was to assess how well current speaker recognition technology is able to diarise and recognize speakers in unconstrained or in the wild' data. This paper outlines the challenge, and describes the baselines, methods used, and results.
arXiv Detail & Related papers (2020-12-12T17:20:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.