NeurST: Neural Speech Translation Toolkit
- URL: http://arxiv.org/abs/2012.10018v1
- Date: Fri, 18 Dec 2020 02:33:58 GMT
- Title: NeurST: Neural Speech Translation Toolkit
- Authors: Chengqi Zhao and Mingxuan Wang and Lei Li
- Abstract summary: NeurST is an open-source toolkit for neural speech translation developed by ByteDance AI Lab.
It mainly focuses on end-to-end speech translation and is easy to use, modify, and extend for advanced speech translation research and products.
- Score: 13.68036533544182
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: NeurST is an open-source toolkit for neural speech translation developed by
ByteDance AI Lab. The toolkit mainly focuses on end-to-end speech translation
and is easy to use, modify, and extend for advanced speech translation
research and products. NeurST aims to facilitate speech translation research
for NLP researchers and provides a complete setup for speech translation
benchmarks, including feature extraction, data preprocessing, distributed
training, and evaluation. Moreover, the toolkit implements several major
architectures for end-to-end speech translation and reports experimental
results on different benchmark datasets, which can serve as reliable
baselines for future research. The toolkit is publicly available at
https://github.com/bytedance/neurst.
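The feature-extraction step in such a pipeline can be illustrated as follows. This is a minimal NumPy sketch of computing log-Mel filterbank features from a raw waveform; the frame, FFT, and Mel parameters below are common defaults for speech translation front ends, not NeurST's exact settings or API.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def log_mel_features(signal, sample_rate=16000, frame_len=400,
                     frame_shift=160, n_fft=512, n_mels=80):
    # Split the waveform into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(signal) - frame_len) // frame_shift
    frames = np.stack([signal[i * frame_shift: i * frame_shift + frame_len]
                       for i in range(n_frames)])
    frames = frames * np.hanning(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    # Triangular Mel filterbank spanning 0 Hz to the Nyquist frequency.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sample_rate).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    # Log compression; a small floor avoids log(0) on silent frames.
    return np.log(power @ fbank.T + 1e-10)
```

The resulting matrix (one 80-dimensional feature vector per 10 ms frame) is the typical input to an end-to-end speech translation encoder.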
Related papers
- Hindi to English: Transformer-Based Neural Machine Translation [0.0]
We have developed a Neural Machine Translation (NMT) system by training a Transformer model to translate text from the Indian language Hindi to English.
We implemented back-translation to augment the training data and for creating the vocabulary.
This led us to achieve a state-of-the-art BLEU score of 24.53 on the test set of IIT Bombay English-Hindi Corpus.
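The back-translation step mentioned above can be sketched in a few lines: a reverse-direction (English-to-Hindi, in this case) model translates monolingual target-side text to produce synthetic source sentences, which are paired with the originals and mixed into the real training data. `reverse_translate` here is a stand-in for a trained model, not part of any real toolkit API.

```python
def back_translate(monolingual_tgt, reverse_translate, real_pairs):
    """Augment real (source, target) pairs with synthetic pairs.

    monolingual_tgt:  target-language sentences with no source side.
    reverse_translate: a target->source translation function (a trained
                       reverse model in practice; a placeholder here).
    real_pairs:       the authentic parallel (source, target) pairs.
    """
    synthetic = [(reverse_translate(tgt), tgt) for tgt in monolingual_tgt]
    # Train the forward model on the union of authentic and synthetic data.
    return real_pairs + synthetic
```

The key property is that the target side of every synthetic pair is genuine human text, so the forward model still learns to produce fluent output.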
arXiv Detail & Related papers (2023-09-23T00:00:09Z) - Speech-to-Speech Translation For A Real-world Unwritten Language [62.414304258701804]
We study speech-to-speech translation (S2ST) that translates speech from one language into another language.
We present an end-to-end solution from training data collection, modeling choices to benchmark dataset release.
arXiv Detail & Related papers (2022-11-11T20:21:38Z) - SpeechBrain: A General-Purpose Speech Toolkit [73.0404642815335]
SpeechBrain is an open-source and all-in-one speech toolkit.
It is designed to facilitate the research and development of neural speech processing technologies.
It achieves competitive or state-of-the-art performance in a wide range of speech benchmarks.
arXiv Detail & Related papers (2021-06-08T18:22:56Z) - Continual Mixed-Language Pre-Training for Extremely Low-Resource Neural
Machine Translation [53.22775597051498]
We present a continual pre-training framework on mBART to effectively adapt it to unseen languages.
Results show that our method can consistently improve the fine-tuning performance upon the mBART baseline.
Our approach also boosts the performance on translation pairs where both languages are seen in the original mBART's pre-training.
arXiv Detail & Related papers (2021-05-09T14:49:07Z) - The Multilingual TEDx Corpus for Speech Recognition and Translation [30.993199499048824]
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and speech translation (ST) research across many non-English source languages.
The corpus is a collection of audio recordings from TEDx talks in 8 source languages.
We segment transcripts into sentences and align them to the source-language audio and target-language translations.
arXiv Detail & Related papers (2021-02-02T21:16:25Z) - Consecutive Decoding for Speech-to-text Translation [51.155661276936044]
COnSecutive Transcription and Translation (COSTT) is an integral approach for speech-to-text translation.
The key idea is to generate source transcript and target translation text with a single decoder.
Our method is verified on three mainstream datasets.
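The "single decoder, two outputs" idea can be sketched as follows: the decoder's target sequence is the source transcript followed by the translation, joined by a separator token, so transcription and translation are generated consecutively. The token names are illustrative, not COSTT's actual vocabulary or implementation.

```python
SEP = "<sep>"  # hypothetical separator between transcript and translation

def make_joint_target(transcript_tokens, translation_tokens):
    # Training target: the decoder first emits the source-language
    # transcript, then the target-language translation.
    return transcript_tokens + [SEP] + translation_tokens

def split_joint_output(tokens):
    # At inference, split the generated sequence back into its two parts.
    idx = tokens.index(SEP)
    return tokens[:idx], tokens[idx + 1:]
```

Conditioning the translation on the already-generated transcript gives the single decoder an explicit intermediate representation of what was said.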
arXiv Detail & Related papers (2020-09-21T10:10:45Z) - "Listen, Understand and Translate": Triple Supervision Decouples
End-to-end Speech-to-text Translation [49.610188741500274]
An end-to-end speech-to-text translation (ST) model takes audio in a source language and outputs text in a target language.
Existing methods are limited by the amount of available parallel data.
We build a system to fully utilize signals in a parallel ST corpus.
arXiv Detail & Related papers (2020-09-21T09:19:07Z) - An Augmented Translation Technique for low Resource language pair:
Sanskrit to Hindi translation [0.0]
In this work, Zero Shot Translation (ZST) is investigated for a low resource language pair.
The same architecture is tested for Sanskrit to Hindi translation for which data is sparse.
Dimensionality reduction of word embedding is performed to reduce the memory usage for data storage.
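The embedding dimensionality reduction mentioned above can be sketched with plain PCA via SVD (the paper's exact method is not specified here). Rows of `emb` are word vectors; the result keeps the projections onto the top `k` principal components, shrinking storage while preserving most variance.

```python
import numpy as np

def reduce_embeddings(emb, k):
    """Project an (n_words, dim) embedding matrix down to k dimensions."""
    centered = emb - emb.mean(axis=0)      # center each dimension
    # Thin SVD: rows of vt are the principal directions of the data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T             # coordinates in the top-k subspace
```

For a 300-dimensional vocabulary embedding reduced to 50 dimensions, this cuts storage to one sixth at the cost of discarding the lowest-variance directions.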
arXiv Detail & Related papers (2020-06-09T17:01:55Z) - ESPnet-ST: All-in-One Speech Translation Toolkit [57.76342114226599]
ESPnet-ST is a new project within the end-to-end speech processing toolkit ESPnet.
It implements automatic speech recognition, machine translation, and text-to-speech functions for speech translation.
We provide all-in-one recipes including data pre-processing, feature extraction, training, and decoding pipelines.
arXiv Detail & Related papers (2020-04-21T18:38:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.