Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison
- URL: http://arxiv.org/abs/2404.02835v1
- Date: Wed, 3 Apr 2024 16:13:29 GMT
- Title: Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison
- Authors: Maxime Bouthors, Josep Crego, Francois Yvon
- Abstract summary: We study the effect of varying retrieval methods for several translation architectures.
Our experiments show that the choice of the retrieval technique impacts the translation scores, with variance across architectures.
We also discuss the effects of increasing the number and diversity of examples, which are mostly positive across the board.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-Augmented Neural Machine Translation (RAMT) architectures retrieve examples from memory to guide the generation process. While most works in this trend explore new ways to exploit the retrieved examples, the upstream retrieval step is mostly unexplored. In this paper, we study the effect of varying retrieval methods for several translation architectures, to better understand the interplay between these two processes. We conduct experiments in two language pairs in a multi-domain setting and consider several downstream architectures based on a standard autoregressive model, an edit-based model, and a large language model with in-context learning. Our experiments show that the choice of the retrieval technique impacts the translation scores, with variance across architectures. We also discuss the effects of increasing the number and diversity of examples, which are mostly positive across the board.
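To make the upstream retrieval step concrete, here is a minimal sketch of retrieving the top-k examples from a translation memory by source-side similarity. It is an illustration under stated assumptions, not the method evaluated in the paper: the similarity function (`difflib.SequenceMatcher` over whitespace tokens), the names `fuzzy_score`, `retrieve`, and `MEMORY`, and the toy sentence pairs are all hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_score(a: str, b: str) -> float:
    """Token-level similarity in [0, 1].

    Assumption: difflib's ratio stands in for whatever fuzzy-match
    metric a real system would use.
    """
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def retrieve(
    source: str, memory: list[tuple[str, str]], k: int = 2
) -> list[tuple[float, str, str]]:
    """Return the k memory entries whose source side best matches `source`."""
    scored = [(fuzzy_score(source, src), src, tgt) for src, tgt in memory]
    # Highest similarity first; ties keep their original memory order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical toy memory of (source, target) sentence pairs.
MEMORY = [
    ("the cat sat on the mat", "le chat était assis sur le tapis"),
    ("the dog sat on the mat", "le chien était assis sur le tapis"),
    ("stock prices fell sharply", "les cours des actions ont fortement chuté"),
]

if __name__ == "__main__":
    for score, src, tgt in retrieve("the cat sat on the rug", MEMORY):
        print(f"{score:.2f}\t{src}\t{tgt}")
```

The retrieved pairs would then be passed to a downstream model, for example concatenated to the input of an autoregressive model or placed in the prompt of a large language model. Swapping `fuzzy_score` for another similarity function is the kind of variation the abstract refers to as changing the retrieval technique.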
Related papers
- Binary Code Similarity Detection via Graph Contrastive Learning on Intermediate Representations [52.34030226129628]
Binary Code Similarity Detection (BCSD) plays a crucial role in numerous fields, including vulnerability detection, malware analysis, and code reuse identification.
In this paper, we propose IRBinDiff, which mitigates compilation differences by leveraging LLVM-IR with higher-level semantic abstraction.
Our extensive experiments, conducted under varied compilation settings, demonstrate that IRBinDiff outperforms other leading BCSD methods in both One-to-one comparison and One-to-many search scenarios.
arXiv Detail & Related papers (2024-10-24T09:09:20Z)
- Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting Pre-trained Language Models [22.977852629450346]
We propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models.
In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture.
Our experiment results show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters.
arXiv Detail & Related papers (2023-10-24T23:29:06Z)
- End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition [2.839471733237535]
We analyze several architectures and optimizations on the underrepresented, short-scale Romanian language dataset called Wild LRRo.
We obtain state-of-the-art results using our proposed method, namely cross-lingual domain adaptation and unlabeled videos.
We also assess the performance of adding a layer inspired by the neural inhibition mechanism.
arXiv Detail & Related papers (2023-10-07T15:36:58Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions [15.845071122977158]
We build on the idea of Retrieval Augmented Translation (RAT) where top-k in-domain fuzzy matches are found for the source sentence.
We propose a novel architecture to control interactions between a source sentence and the top-k fuzzy target-language matches.
arXiv Detail & Related papers (2022-10-10T23:33:15Z)
- Examining Scaling and Transfer of Language Model Architectures for Machine Translation [51.69212730675345]
Language models (LMs) process sequences in a single stack of layers, and encoder-decoder models (EncDec) utilize separate layer stacks for input and output processing.
In machine translation, EncDec has long been the favoured approach, but with few studies investigating the performance of LMs.
arXiv Detail & Related papers (2022-02-01T16:20:15Z)
- Comparing Feature-Engineering and Feature-Learning Approaches for Multilingual Translationese Classification [11.364204162881482]
We compare the traditional feature-engineering-based approach to the feature-learning-based one.
We investigate how well the hand-crafted features explain the variance in the neural models' predictions.
arXiv Detail & Related papers (2021-09-15T22:34:48Z)
- Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
- Ranking Enhanced Dialogue Generation [77.8321855074999]
How to effectively utilize the dialogue history is a crucial problem in multi-turn dialogue generation.
Previous works usually employ various neural network architectures to model the history.
This paper proposes a Ranking Enhanced Dialogue generation framework.
arXiv Detail & Related papers (2020-08-13T01:49:56Z)
- DARTS-ASR: Differentiable Architecture Search for Multilingual Speech Recognition and Adaptation [64.44349061520671]
In this paper, we propose an ASR approach with efficient gradient-based architecture search, DARTS-ASR.
In order to examine the generalizability of DARTS-ASR, we apply our approach not only on many languages to perform monolingual ASR, but also on a multilingual ASR setting.
arXiv Detail & Related papers (2020-05-13T11:32:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.