Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation
- URL: http://arxiv.org/abs/2503.05010v1
- Date: Thu, 06 Mar 2025 22:23:07 GMT
- Title: Leveraging Domain Knowledge at Inference Time for LLM Translation: Retrieval versus Generation
- Authors: Bryan Li, Jiaming Luo, Eleftheria Briakou, Colin Cherry,
- Abstract summary: Large language models (LLMs) have been increasingly adopted for machine translation (MT)<n>Our work studies domain-adapted MT with LLMs through a careful prompting setup.<n>We find that demonstrations consistently outperform terminology, and retrieval consistently outperforms generation.
- Score: 36.41708236431343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While large language models (LLMs) have been increasingly adopted for machine translation (MT), their performance for specialist domains such as medicine and law remains an open challenge. Prior work has shown that LLMs can be domain-adapted at test-time by retrieving targeted few-shot demonstrations or terminologies for inclusion in the prompt. Meanwhile, for general-purpose LLM MT, recent studies have found some success in generating similarly useful domain knowledge from an LLM itself, prior to translation. Our work studies domain-adapted MT with LLMs through a careful prompting setup, finding that demonstrations consistently outperform terminology, and retrieval consistently outperforms generation. We find that generating demonstrations with weaker models can close the gap with larger model's zero-shot performance. Given the effectiveness of demonstrations, we perform detailed analyses to understand their value. We find that domain-specificity is particularly important, and that the popular multi-domain benchmark is testing adaptation to a particular writing style more so than to a specific domain.
Related papers
- Exploring How LLMs Capture and Represent Domain-Specific Knowledge [16.84031546207366]
We study whether Large Language Models (LLMs) inherently capture domain-specific nuances in natural language.
Our experiments probe the domain sensitivity of LLMs by examining their ability to distinguish queries from different domains.
We reveal latent domain-related trajectories that indicate the model's internal recognition of query domains.
arXiv Detail & Related papers (2025-04-23T16:46:06Z) - Large Language Model for Multi-Domain Translation: Benchmarking and Domain CoT Fine-tuning [55.107329995417786]
Large language models (LLMs) have demonstrated impressive general understanding and generation abilities.
We establish a benchmark for multi-domain translation, featuring 25 German$Leftrightarrow$English and 22 Chinese$Leftrightarrow$English test sets.
We propose a domain Chain of Thought (CoT) fine-tuning technique that utilizes the intrinsic multi-domain intelligence of LLMs to improve translation performance.
arXiv Detail & Related papers (2024-10-03T16:15:04Z) - Exploring Language Model Generalization in Low-Resource Extractive QA [57.14068405860034]
We investigate Extractive Question Answering (EQA) with Large Language Models (LLMs) under domain drift.<n>We devise a series of experiments to explain the performance gap empirically.
arXiv Detail & Related papers (2024-09-27T05:06:43Z) - LAMPO: Large Language Models as Preference Machines for Few-shot Ordinal Classification [34.9210323553677]
We introduce LAMPO, a novel paradigm that leverages Large Language Models (LLMs) for solving few-shot multi-class ordinal classification tasks.
Extensive experiments on seven public datasets demonstrate LAMPO's remarkably competitive performance across a diverse spectrum of applications.
arXiv Detail & Related papers (2024-08-06T15:55:05Z) - Aligning Language Models with Demonstrated Feedback [58.834937450242975]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z) - Analyzing the Role of Semantic Representations in the Era of Large Language Models [104.18157036880287]
We investigate the role of semantic representations in the era of large language models (LLMs)
We propose an AMR-driven chain-of-thought prompting method, which we call AMRCoT.
We find that it is difficult to predict which input examples AMR may help or hurt on, but errors tend to arise with multi-word expressions.
arXiv Detail & Related papers (2024-05-02T17:32:59Z) - BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
arXiv Detail & Related papers (2024-03-27T08:57:21Z) - Linguistic Intelligence in Large Language Models for Telecommunications [5.06945923921948]
Large Language Models (LLMs) have emerged as a significant advancement in the field of Natural Language Processing (NLP)
This study seeks to evaluate the knowledge and understanding capabilities of LLMs within the telecommunications domain.
Our evaluation reveals that zero-shot LLMs can achieve performance levels comparable to the current state-of-the-art fine-tuned models.
arXiv Detail & Related papers (2024-02-24T14:01:07Z) - Fine-tuning Large Language Models for Domain-specific Machine Translation [7.977136709446714]
Large language models (LLMs) have shown great potential in domain-specific machine translation (MT)<n>This paper focuses on enhancing the domain-specific MT capability of LLMs by providing high-quality training datasets and proposing a novel fine-tuning framework denoted by DragFT.<n>The results on three domain-specific datasets show that DragFT achieves a significant performance boost and shows superior performance compared to advanced models such as GPT-3.5 and GPT-4o.
arXiv Detail & Related papers (2024-02-23T02:24:15Z) - AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations [52.43593893122206]
Alignedcot is an in-context learning technique for invoking Large Language Models.
It achieves consistent and correct step-wise prompts in zero-shot scenarios.
We conduct experiments on mathematical reasoning and commonsense reasoning.
arXiv Detail & Related papers (2023-11-22T17:24:21Z) - Empower Large Language Model to Perform Better on Industrial
Domain-Specific Question Answering [36.31193273252256]
Large Language Model (LLM) has gained popularity and achieved remarkable results in open-domain tasks.
But its performance in real industrial domain-specific scenarios is average due to its lack of specific domain knowledge.
We provide a benchmark Question Answering (QA) dataset named MSQA, centered around Microsoft products and IT technical problems encountered by customers.
arXiv Detail & Related papers (2023-05-19T09:23:25Z) - Large Language Models Are Latent Variable Models: Explaining and Finding
Good Demonstrations for In-Context Learning [104.58874584354787]
In recent years, pre-trained large language models (LLMs) have demonstrated remarkable efficiency in achieving an inference-time few-shot learning capability known as in-context learning.
This study aims to examine the in-context learning phenomenon through a Bayesian lens, viewing real-world LLMs as latent variable models.
arXiv Detail & Related papers (2023-01-27T18:59:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.