VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select
Indic Languages
- URL: http://arxiv.org/abs/2305.12518v1
- Date: Sun, 21 May 2023 17:23:54 GMT
- Title: VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select
Indic Languages
- Authors: Shivam Mhaskar, Vineet Bhat, Akshay Batheja, Sourabh Deoghare,
Paramveer Choudhary, Pushpak Bhattacharyya
- Abstract summary: We present a deployment-ready Speech-to-Speech Machine Translation (SSMT) system for the English-Hindi, English-Marathi, and Hindi-Marathi language pairs.
We develop the SSMT system by cascading Automatic Speech Recognition (ASR), Disfluency Correction (DC), Machine Translation (MT), and Text-to-Speech Synthesis (TTS) models.
For the MT stage of the pipeline, we also create a Text-to-Text Machine Translation (TTMT) service in all six translation directions involving English, Hindi, and Marathi.
- Score: 23.76977378957555
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this work, we present our deployment-ready Speech-to-Speech Machine
Translation (SSMT) system for English-Hindi, English-Marathi, and Hindi-Marathi
language pairs. We develop the SSMT system by cascading Automatic Speech
Recognition (ASR), Disfluency Correction (DC), Machine Translation (MT), and
Text-to-Speech Synthesis (TTS) models. We discuss the challenges faced during
the research and development stage and the scalable deployment of the SSMT
system as a publicly accessible web service. For the MT stage of the pipeline,
we also create a Text-to-Text Machine Translation (TTMT) service in all six
translation directions involving English, Hindi, and Marathi. To mitigate data
scarcity, we develop a LaBSE-based corpus filtering tool to select high-quality
parallel sentences from a noisy pseudo-parallel corpus for training the TTMT
system. All the data used for training the SSMT and TTMT systems and the best
models are being made publicly available. Users of our system include (a) the
Government of India, in the context of its new education policy (NEP), (b)
tourists who criss-cross the multilingual landscape of India, (c) the Indian
Judiciary, where a leading cause of the pendency of cases (on the order of 10
million to date) is the translation of case papers, and (d) farmers who need
weather and price information. We also share the feedback received from various
stakeholders when our SSMT and TTMT systems were demonstrated in large public
events.
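The abstract mentions a LaBSE-based corpus filtering tool but does not describe how it works. The sketch below shows one common way such a filter can be built, assuming it scores each sentence pair by the cosine similarity of its sentence embeddings and keeps pairs above a threshold. The function names, the `embed` callable, and the `0.8` threshold are hypothetical illustrations, not the authors' implementation; in practice `embed` would wrap a multilingual sentence encoder such as LaBSE.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_parallel_corpus(
    pairs: Sequence[Tuple[str, str]],
    embed: Callable[[str], Vector],
    threshold: float = 0.8,  # hypothetical cut-off, to be tuned on held-out data
) -> List[Tuple[str, str, float]]:
    """Keep (source, target) pairs whose embedding similarity meets the threshold.

    `embed` is an assumed callable mapping a sentence to a vector in a shared
    multilingual space (for example, a wrapper around a LaBSE encoder).
    """
    kept = []
    for source, target in pairs:
        score = cosine_similarity(embed(source), embed(target))
        if score >= threshold:
            kept.append((source, target, score))
    return kept


if __name__ == "__main__":
    # Toy two-dimensional vectors stand in for real sentence embeddings.
    toy_vectors = {
        "hello": [1.0, 0.0],
        "namaste": [0.9, 0.1],
        "banana": [0.0, 1.0],
    }
    noisy_pairs = [("hello", "namaste"), ("hello", "banana")]
    for source, target, score in filter_parallel_corpus(noisy_pairs, toy_vectors.__getitem__):
        print(source, target, round(score, 3))
```

Passing the encoder in as a callable keeps the filtering logic independent of any particular embedding library, so the same loop can be reused with a different encoder or threshold.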
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving [61.73180469072787]
We focus on the problem of spoken translation (ST) of code-switched speech in Indian languages to English text.
We present a new end-to-end model architecture COSTA that scaffolds on pretrained automatic speech recognition (ASR) and machine translation (MT) modules.
COSTA significantly outperforms many competitive cascaded and end-to-end multimodal baselines by up to 3.5 BLEU points.
arXiv Detail & Related papers (2024-06-16T16:10:51Z)
- MunTTS: A Text-to-Speech System for Mundari [18.116359188623832]
We present MunTTS, an end-to-end text-to-speech (TTS) system specifically for Mundari, a low-resource Indian language of the Austro-Asiatic family.
Our work addresses the gap in linguistic technology for underrepresented languages by collecting and processing data to build a speech synthesis system.
arXiv Detail & Related papers (2024-01-28T06:27:17Z)
- Selective Data Augmentation for Robust Speech Translation [17.56859840101276]
We propose an end-to-end (e2e) architecture for English-Hindi (en-hi) speech translation (ST).
We use two imperfect machine translation (MT) services to translate the English text of Libri-trans into Hindi.
We show that this yields a better ST (BLEU) score than brute-force augmentation of MT data.
arXiv Detail & Related papers (2023-03-22T19:36:07Z)
- End-to-End Speech Translation of Arabic to English Broadcast News [2.375764121997739]
Speech translation (ST) is the task of translating acoustic speech signals in a source language into text in a foreign language.
This paper presents our efforts towards the development of the first Broadcast News end-to-end Arabic to English speech translation system.
arXiv Detail & Related papers (2022-12-11T11:35:46Z)
- A Bilingual Parallel Corpus with Discourse Annotations [82.07304301996562]
This paper describes BWB, a large parallel corpus first introduced in Jiang et al. (2022), along with an annotated test set.
The BWB corpus consists of Chinese novels translated by experts into English, and the annotated test set is designed to probe the ability of machine translation systems to model various discourse phenomena.
arXiv Detail & Related papers (2022-10-26T12:33:53Z)
- SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task [111.91077204077817]
We participated in four translation directions of three language pairs: English-Chinese, English-Polish, and German-Upper Sorbian.
Based on different conditions of language pairs, we have experimented with diverse neural machine translation (NMT) techniques.
In our submissions, the primary systems won first place in the English to Chinese, Polish to English, and German to Upper Sorbian translation directions.
arXiv Detail & Related papers (2020-10-11T00:40:05Z)
- WeChat Neural Machine Translation Systems for WMT20 [61.03013964996131]
Our system is based on the Transformer with effective variants and the DTMT architecture.
In our experiments, we employ data selection, several synthetic data generation approaches, advanced finetuning approaches, and a self-BLEU-based model ensemble.
Our constrained Chinese to English system achieves a case-sensitive BLEU score of 36.9, which is the highest among all submissions.
arXiv Detail & Related papers (2020-10-01T08:15:09Z)
- An Augmented Translation Technique for low Resource language pair: Sanskrit to Hindi translation [0.0]
In this work, Zero Shot Translation (ZST) is inspected for a low resource language pair.
The same architecture is tested for Sanskrit to Hindi translation for which data is sparse.
Dimensionality reduction of the word embeddings is performed to reduce the memory used for data storage.
arXiv Detail & Related papers (2020-06-09T17:01:55Z)
- Neural Machine Translation: Challenges, Progress and Future [62.75523637241876]
Machine translation (MT) is a technique that leverages computers to translate human languages automatically.
Neural machine translation (NMT) models the direct mapping between source and target languages with deep neural networks.
This article reviews the NMT framework, discusses the challenges in NMT, and introduces some exciting recent progress.
arXiv Detail & Related papers (2020-04-13T07:53:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.