ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024
- URL: http://arxiv.org/abs/2407.12038v2
- Date: Wed, 31 Jul 2024 14:23:00 GMT
- Title: ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024
- Authors: Ruibo Fu, Rui Liu, Chunyu Qiang, Yingming Gao, Yi Lu, Shuchen Shi, Tao Wang, Ya Li, Zhengqi Wen, Chen Zhang, Hui Bu, Yukun Liu, Xin Qi, Guanjun Li
- Abstract summary: The ICAGC 2024 challenge aims to enhance the persuasiveness and acceptability of synthesized audio.
A total of 19 teams have registered for the challenge, and this paper describes the competition and its results.
- Score: 32.96984318966757
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The Inspirational and Convincing Audio Generation Challenge 2024 (ICAGC 2024) is part of the ISCSLP 2024 Competitions and Challenges track. While current text-to-speech (TTS) technology can generate high-quality audio, its ability to convey complex emotions and finely controlled content remains limited. This constraint leads to a discrepancy between the generated audio and human subjective perception in practical applications like companion robots for children and marketing bots. The core issue lies in the inconsistency between high-quality audio generation and the ultimate human subjective experience. Therefore, this challenge aims to enhance the persuasiveness and acceptability of synthesized audio, focusing on human-aligned, convincing, and inspirational audio generation. A total of 19 teams have registered for the challenge, and this paper describes the competition and its results.
Related papers
- The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings [18.994388357437924]
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge aims to benchmark and advance zero-shot spontaneous style voice cloning.
This paper details the data, tracks, submitted systems, evaluation results, and findings.
arXiv Detail & Related papers (2024-10-31T09:39:49Z)
- Overview of AI-Debater 2023: The Challenges of Argument Generation Tasks [62.443665295250035]
We present the results of the AI-Debater 2023 Challenge held by the Chinese Conference on Affective Computing (CCAC 2023).
In total, 32 competing teams registered for the challenge, from which we received 11 successful submissions.
arXiv Detail & Related papers (2024-07-20T10:13:54Z)
- SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan [44.260755521474735]
"SVDD Challenge" is the first research challenge focusing on SVDD for lab-controlled and in-the-wild bonafide and deepfake singing voice recordings.
The challenge will be held in conjunction with the 2024 IEEE Spoken Language Technology Workshop (SLT 2024).
arXiv Detail & Related papers (2024-05-08T17:40:12Z)
- NTIRE 2024 Quality Assessment of AI-Generated Content Challenge [141.37864527005226]
The challenge is divided into the image track and the video track.
The winning methods in both tracks have demonstrated superior prediction performance on AIGC.
arXiv Detail & Related papers (2024-04-25T15:36:18Z)
- STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization [3.9886149789339327]
This report introduces our novel method named STHG for the Audio-Visual Diarization task of the Ego4D Challenge 2023.
Our key innovation is that we model all the speakers in a video using a single, unified heterogeneous graph learning framework.
Our final method obtains 61.1% DER on the test set of Ego4D, which significantly outperforms all the baselines as well as last year's winner.
arXiv Detail & Related papers (2023-06-18T17:55:02Z)
- Team AcieLee: Technical Report for EPIC-SOUNDS Audio-Based Interaction Recognition Challenge 2023 [8.699868810184752]
The task is to classify the audio caused by interactions between objects, or from events of the camera wearer.
We conducted exhaustive experiments and found that learning rate step decay, a frozen backbone, label smoothing, and focal loss contribute most to the performance improvement.
This proposed method allowed us to achieve 3rd place in the CVPR 2023 workshop of EPIC-SOUNDS Audio-Based Interaction Recognition Challenge.
arXiv Detail & Related papers (2023-06-15T09:49:07Z)
- VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge [95.6159736804855]
The VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22) was held in conjunction with INTERSPEECH 2022.
The goal of this challenge was to evaluate how well state-of-the-art speaker recognition systems can diarise and recognise speakers from speech obtained "in the wild".
arXiv Detail & Related papers (2023-02-20T19:27:14Z)
- NTIRE 2022 Challenge on Stereo Image Super-Resolution: Methods and Results [116.8625268729599]
The NTIRE challenge has one track, aimed at the stereo image super-resolution problem under a standard bicubic degradation.
In total, 238 participants were successfully registered, and 21 teams competed in the final testing phase.
This challenge establishes a new benchmark for stereo image SR.
arXiv Detail & Related papers (2022-04-20T02:55:37Z)
- VoxSRC 2020: The Second VoxCeleb Speaker Recognition Challenge [99.82500204110015]
We held the second installment of the VoxCeleb Speaker Recognition Challenge in conjunction with Interspeech 2020.
The goal of this challenge was to assess how well current speaker recognition technology is able to diarise and recognize speakers in unconstrained or "in the wild" data.
This paper outlines the challenge, and describes the baselines, methods used, and results.
arXiv Detail & Related papers (2020-12-12T17:20:57Z)