A Theory of Unsupervised Translation Motivated by Understanding Animal
Communication
- URL: http://arxiv.org/abs/2211.11081v2
- Date: Fri, 3 Nov 2023 18:15:59 GMT
- Title: A Theory of Unsupervised Translation Motivated by Understanding Animal
Communication
- Authors: Shafi Goldwasser, David F. Gruber, Adam Tauman Kalai, Orr Paradise
- Abstract summary: We propose a theoretical framework for analyzing Unsupervised Machine Translation.
We show that the error rates are inversely related to the language complexity and amount of common ground.
This suggests that unsupervised translation of animal communication may be feasible if the communication system is sufficiently complex.
- Score: 7.748040467625809
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks are capable of translating between languages -- in some cases
even between two languages where there is little or no access to parallel
translations, in what is known as Unsupervised Machine Translation (UMT). Given
this progress, it is intriguing to ask whether machine learning tools can
ultimately enable understanding animal communication, particularly that of
highly intelligent animals. We propose a theoretical framework for analyzing
UMT when no parallel translations are available and when it cannot be assumed
that the source and target corpora address related subject domains or possess
similar linguistic structure. We exemplify this theory with two stylized models
of language, for which our framework provides bounds on necessary sample
complexity; the bounds are formally proven and experimentally verified on
synthetic data. These bounds show that the error rates are inversely related to
the language complexity and amount of common ground. This suggests that
unsupervised translation of animal communication may be feasible if the
communication system is sufficiently complex.
Related papers
- On Non-interactive Evaluation of Animal Communication Translators [8.958679534486855]
This is an instance of machine translation quality evaluation (MTQE) without any reference translations available.
The idea is to translate animal communication, turn by turn, and evaluate how often the resulting translations make more sense in order than permuted.
arXiv Detail & Related papers (2025-10-17T15:56:30Z)
- Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models [56.61984030508691]
We present the first mechanistic interpretability study of language confusion.
We show that confusion points (CPs) are central to this phenomenon.
We show that editing a small set of critical neurons, identified via comparative analysis with a multilingual-tuned counterpart, substantially mitigates confusion.
arXiv Detail & Related papers (2025-05-22T11:29:17Z)
- A Context-aware Framework for Translation-mediated Conversations [29.169155271343083]
We present a framework to improve large language model-based translation systems by incorporating contextual information in bilingual conversational settings.
We validate both components of our framework on two task-oriented domains: customer chat and user-assistant interaction.
Our framework consistently results in better translations than state-of-the-art systems like GPT-4o and TowerInstruct.
arXiv Detail & Related papers (2024-12-05T14:41:05Z)
- Training Neural Networks as Recognizers of Formal Languages [87.06906286950438]
Formal language theory pertains specifically to recognizers.
It is common to instead use proxy tasks that are similar in only an informal sense.
We correct this mismatch by training and evaluating neural networks directly as binary classifiers of strings.
arXiv Detail & Related papers (2024-11-11T16:33:25Z)
- The Impact of Syntactic and Semantic Proximity on Machine Translation with Back-Translation [7.557957450498644]
We conduct experiments with artificial languages to determine what properties of languages make back-translation an effective training method.
We find, contrary to popular belief, that (i) parallel word frequency distributions, (ii) partially shared vocabulary, and (iii) similar syntactic structure across languages are not sufficient to explain the success of back-translation.
We conjecture that rich semantic dependencies, parallel across languages, are at the root of the success of unsupervised methods based on back-translation.
arXiv Detail & Related papers (2024-03-26T18:38:14Z)
- Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation [103.90963418039473]
Bi-ACL is a framework that uses only target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model.
We show that Bi-ACL is more effective both in long-tail languages and in high-resource languages.
arXiv Detail & Related papers (2023-05-22T07:31:08Z)
- Syntax and Domain Aware Model for Unsupervised Program Translation [23.217899398362206]
We propose SDA-Trans, a syntax and domain-aware model for program translation.
It leverages the syntax structure and domain knowledge to enhance the cross-lingual transfer ability.
The experimental results on function translation tasks between Python, Java, and C++ show that SDA-Trans outperforms many large-scale pre-trained models.
arXiv Detail & Related papers (2023-02-08T06:54:55Z)
- Adaptive Machine Translation with Large Language Models [7.803471587734353]
We investigate how we can utilize in-context learning to improve real-time adaptive machine translation.
We conduct experiments across five diverse language pairs, namely English-to-Arabic (EN-AR), English-to-Chinese (EN-ZH), English-to-French (EN-FR), English-to-Kinyarwanda (EN-RW), and English-to-Spanish (EN-ES).
arXiv Detail & Related papers (2023-01-30T21:17:15Z)
- Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)
- On Learning Language-Invariant Representations for Universal Machine Translation [33.40094622605891]
Universal machine translation aims to learn to translate between any pair of languages.
We prove certain impossibilities of this endeavour in general and prove positive results in the presence of additional (but natural) structure of data.
We believe our theoretical insights and implications contribute to the future algorithmic design of universal machine translation.
arXiv Detail & Related papers (2020-08-11T04:45:33Z)
- It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty.
XMI exploits the probabilistic nature of most neural machine translation models.
We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
arXiv Detail & Related papers (2020-05-05T17:38:48Z)
- Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations [83.27475281544868]
We use singular vector canonical correlation analysis to study what kind of information is induced from each source.
We observe that our representations embed typology and strengthen correlations with language relationships.
We then take advantage of our multi-view language vector space for multilingual machine translation, where we achieve competitive overall translation accuracy.
arXiv Detail & Related papers (2020-04-30T16:25:39Z)
- Bootstrapping a Crosslingual Semantic Parser [74.99223099702157]
We adapt a semantic parser trained on a single language, such as English, to new languages and multiple domains with minimal annotation.
We query if machine translation is an adequate substitute for training data, and extend this to investigate bootstrapping using joint training with English, paraphrasing, and multilingual pre-trained models.
arXiv Detail & Related papers (2020-04-06T12:05:02Z)
- Urdu-English Machine Transliteration using Neural Networks [0.0]
We present a transliteration technique based on Expectation Maximization (EM) that is unsupervised and language-independent.
The system learns the patterns and out-of-vocabulary words from a parallel corpus, so there is no need to train it explicitly on a transliteration corpus.
arXiv Detail & Related papers (2020-01-12T17:30:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.