Related papers: FFSTC: Fongbe to French Speech Translation Corpus

FFSTC: Fongbe to French Speech Translation Corpus

URL: http://arxiv.org/abs/2403.05488v1
Date: Fri, 8 Mar 2024 17:53:58 GMT
Title: FFSTC: Fongbe to French Speech Translation Corpus
Authors: D. Fortune Kponou, Frejus A. A. Laleye, Eugene C. Ezin
Abstract summary: We introduce the Fongbe to French Speech Translation Corpus (FFSTC) for the first time. This corpus encompasses approximately 31 hours of collected Fongbe language content, featuring both French transcriptions and corresponding Fongbe voice recordings.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we introduce the Fongbe to French Speech Translation Corpus (FFSTC) for the first time. This corpus encompasses approximately 31 hours of collected Fongbe language content, featuring both French transcriptions and corresponding Fongbe voice recordings. FFSTC represents a comprehensive dataset compiled through various collection methods and the efforts of dedicated individuals. Furthermore, we conduct baseline experiments using Fairseq's transformer_s and conformer models to evaluate data quality and validity. Our results indicate a score of 8.96 for the transformer_s model and 8.14 for the conformer model, establishing a baseline for the FFSTC corpus.

Related papers

An Efficient Approach for Machine Translation on Low-resource Languages: A Case Study in Vietnamese-Chinese [1.6932009464531739]
We proposed an approach for machine translation in low-resource languages such as Vietnamese-Chinese. Our proposed method leveraged the power of the multilingual pre-trained language model (mBART) and both Vietnamese and Chinese monolingual corpus.
arXiv Detail & Related papers (2025-01-31T17:11:45Z)
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation [97.54885207518946]
We introduce a novel model framework TransVIP that leverages diverse datasets in a cascade fashion. We propose two separated encoders to preserve the speaker's voice characteristics and isochrony from the source speech during the translation process. Our experiments on the French-English language pair demonstrate that our model outperforms the current state-of-the-art speech-to-speech translation model.
arXiv Detail & Related papers (2024-05-28T04:11:37Z)
The Claire French Dialogue Dataset [9.45456707528025]
This paper describes the 24 individual corpora of which CFDD is composed and provides links and citations to their original sources. It also provides our proposed breakdown of the full CFDD dataset into eight categories of subcorpora and describes the process we followed to standardize the format of the final dataset.
arXiv Detail & Related papers (2023-11-28T14:55:22Z)
FreCDo: A Large Corpus for French Cross-Domain Dialect Identification [22.132457694021184]
We present a novel corpus for French dialect identification comprising 413,522 French text samples. The training, validation and test splits are collected from different news websites. This leads to a French cross-domain (FreCDo) dialect identification task.
arXiv Detail & Related papers (2022-12-15T10:32:29Z)
M3ST: Mix at Three Levels for Speech Translation [66.71994367650461]
We propose Mix at three levels for Speech Translation (M3ST) method to increase the diversity of the augmented training corpus. In the first stage of fine-tuning, we mix the training corpus at three levels, including word level, sentence level and frame level, and fine-tune the entire model with mixed data. Experiments on MuST-C speech translation benchmark and analysis show that M3ST outperforms current strong baselines and achieves state-of-the-art results on eight directions with an average BLEU of 29.9.
arXiv Detail & Related papers (2022-12-07T14:22:00Z)
Cross-modal Contrastive Learning for Speech Translation [36.63604508886932]
ConST is a cross-modal contrastive learning method for end-to-end speech-to-text translation. Experiments show that the proposed ConST consistently outperforms the previous methods on. Its learned representation improves the accuracy of cross-modal speech-text retrieval from 4% to 88%.
arXiv Detail & Related papers (2022-05-05T05:14:01Z)
ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and an endangered language Cherokee. It supports both statistical and neural translation models as well as provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
BSTC: A Large-Scale Chinese-English Speech Translation Dataset [26.633433687767553]
BSTC (Baidu Speech Translation Corpus) is a large-scale Chinese-English speech translation dataset. This dataset is constructed based on a collection of licensed videos of talks or lectures, including about 68 hours of Mandarin data. We have asked three experienced interpreters to simultaneously interpret the testing talks in a mock conference setting.
arXiv Detail & Related papers (2021-04-08T07:38:51Z)
Consecutive Decoding for Speech-to-text Translation [51.155661276936044]
COnSecutive Transcription and Translation (COSTT) is an integral approach for speech-to-text translation. The key idea is to generate source transcript and target translation text with a single decoder. Our method is verified on three mainstream datasets.
arXiv Detail & Related papers (2020-09-21T10:10:45Z)
Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN [81.79070894458322]
Cross-lingual voice conversion aims to change source speaker's voice to sound like that of target speaker, when source and target speakers speak different languages. Previous studies on cross-lingual voice conversion mainly focus on spectral conversion with a linear transformation for F0 transfer. We propose the use of continuous wavelet transform (CWT) decomposition for F0 modeling. CWT provides a way to decompose a signal into different temporal scales that explain prosody in different time resolutions.
arXiv Detail & Related papers (2020-08-11T07:29:55Z)
CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus [57.641761472372814]
CoVoST is a multilingual speech-to-text translation corpus from 11 languages into English. It diversified with over 11,000 speakers and over 60 accents. CoVoST is released under CC0 license and free to use.
arXiv Detail & Related papers (2020-02-04T14:35:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.