$k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
- URL: http://arxiv.org/abs/2302.10879v1
- Date: Tue, 21 Feb 2023 18:54:21 GMT
- Title: $k$NN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
- Authors: Yangsibo Huang, Daogao Liu, Zexuan Zhong, Weijia Shi, Yin Tat Lee
- Abstract summary: $k$NN-Adapter is a method to adapt large language models to a new domain.
Experiments on four different domains demonstrate that $k$NN-Adapter significantly improves perplexity.
- Score: 18.969047541720123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fine-tuning a language model on a new domain is standard practice for domain
adaptation. However, it can be infeasible when it comes to modern large-scale
language models such as GPT-3, which can only be accessed through APIs, making
it difficult to access the internal parameters of the model. In this paper, we
propose $k$NN-Adapter, a method to effectively adapt these black-box large
language models (LLMs) to a new domain. The $k$NN-Adapter builds on top of the
retrieval-augmented language model, and adaptively learns to interpolate the
output of the language model with retrieval results from a datastore consisting
of the target domain data. Our experiments on four different domains
demonstrate that $k$NN-Adapter significantly improves perplexity, and works
particularly well in settings with limited access to LLMs. Additionally, we
show that $k$NN-Adapter is more effective than fine-tuning when the amount of
training data is limited. We also release a dataset to encourage further study.
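As a rough illustration of the retrieval-and-interpolation step that $k$NN-Adapter builds on (a $k$NN-LM-style mixture, not the authors' released code), the sketch below assumes the black-box LM's API returns next-token probabilities and that we control an encoder for building datastore keys; the fixed `lambda_` stands in for the interpolation weight that $k$NN-Adapter learns adaptively.

```python
import numpy as np

def knn_adapter_next_token_probs(p_lm, query_vec, datastore_keys, datastore_next_tokens,
                                 vocab_size, k=8, temperature=1.0, lambda_=0.25):
    """Interpolate black-box LM probabilities with a kNN distribution built from
    a target-domain datastore (kNN-LM-style sketch, illustrative only)."""
    # 1. Retrieve the k nearest stored target-domain contexts by L2 distance.
    dists = np.linalg.norm(datastore_keys - query_vec, axis=1)
    nn_idx = np.argsort(dists)[:k]

    # 2. Turn negative distances into a distribution over the retrieved next tokens.
    weights = np.exp(-dists[nn_idx] / temperature)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for w, tok in zip(weights, datastore_next_tokens[nn_idx]):
        p_knn[tok] += w

    # 3. Mix the two distributions: p = lambda * p_kNN + (1 - lambda) * p_LM.
    #    In kNN-Adapter this coefficient is learned rather than fixed;
    #    a constant scalar is used here purely for illustration.
    return lambda_ * p_knn + (1.0 - lambda_) * p_lm
```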
Related papers
- Generative Adapter: Contextualizing Language Models in Parameters with A Single Forward Pass [109.34230156454574]
Large language models (LMs) are typically adapted to new contexts via fine-tuning or prompting; fine-tuning incurs significant training cost, and prompting increases inference overhead.
We introduce $GenerativeAdapter$, an effective and efficient adaptation method that directly maps new contexts to low-rank LM adapters.
arXiv Detail & Related papers (2024-11-08T00:42:47Z)
- Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases [15.964849180459675]
$k$NM-LM is a retrieval-augmented language model that integrates domain knowledge into language models without fine-tuning.
Our approach is able to automatically adapt to different language models and domains.
arXiv Detail & Related papers (2023-08-18T05:25:55Z)
- $m^4Adapter$: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter [128.69723410769586]
Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time.
When an MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically.
We propose $m^4Adapter$, which combines domain and language knowledge using meta-learning with adapters.
arXiv Detail & Related papers (2022-10-21T12:25:05Z)
- Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The proposed approach consists of replacing the shared vocabulary with a small language-specific vocabulary and fine-tuning the new embeddings on the new language's parallel data.
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
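A minimal PyTorch-style sketch of that recipe, assuming an encoder-decoder NMT model in which every original parameter is frozen and only a newly attached language-specific embedding table is trained; the module and optimizer choices below are illustrative, not the paper's code.

```python
import torch
import torch.nn as nn

def add_language_specific_embeddings(model: nn.Module, new_vocab_size: int, d_model: int):
    """Freeze the original multilingual NMT model and attach a small trainable
    embedding table for the new language (illustrative sketch only)."""
    # Freezing all original parameters is what preserves performance
    # on the initial languages.
    for p in model.parameters():
        p.requires_grad = False

    # The new language-specific embeddings are the only trainable parameters,
    # to be fine-tuned on the new language's parallel data.
    model.new_lang_embeddings = nn.Embedding(new_vocab_size, d_model)

    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(trainable, lr=1e-4)
    return optimizer
```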
arXiv Detail & Related papers (2021-10-20T10:38:57Z)
- Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters [66.7986513246294]
We study the compositionality of language and domain adapters in the context of Machine Translation.
We find that in the partial-resource scenario a naive combination of domain-specific and language-specific adapters often results in 'catastrophic forgetting' of the missing languages.
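To make the composition concrete, here is a hypothetical sketch of the naive stacking referred to above: a bottleneck language adapter followed by a bottleneck domain adapter inside one Transformer layer. Names and sizes are made up for illustration.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Standard adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, d_model: int = 512, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

class LanguageThenDomain(nn.Module):
    """Naive composition: language adapter first, then domain adapter.
    The paper reports that this combination can forget languages that are
    missing from the domain-specific training data."""
    def __init__(self, lang_adapter: BottleneckAdapter, domain_adapter: BottleneckAdapter):
        super().__init__()
        self.lang_adapter = lang_adapter
        self.domain_adapter = domain_adapter

    def forward(self, hidden_states):
        return self.domain_adapter(self.lang_adapter(hidden_states))
```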
arXiv Detail & Related papers (2021-10-18T18:55:23Z)
- Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation [61.27321597981737]
$k$NN-MT has shown the promising capability of directly combining a pre-trained neural machine translation (NMT) model with domain-specific token-level $k$-nearest-neighbor retrieval.
We propose a novel framework that directly uses in-domain monolingual sentences in the target language to construct an effective datastore for $k$-nearest-neighbor retrieval.
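For intuition, a hedged sketch of what a token-level $k$NN-MT datastore looks like: keys are decoder hidden states, values are the target token that followed each state. How those states are obtained from in-domain monolingual target sentences is the paper's contribution; the `tokenizer` and `decoder` interfaces below are placeholders, and the sketch simply teacher-forces over target text.

```python
import numpy as np
import torch

@torch.no_grad()
def build_knn_datastore(decoder, tokenizer, target_sentences):
    """Collect (key, value) pairs for token-level kNN retrieval (sketch only).

    Assumed (placeholder) interfaces:
      tokenizer(sentence) -> list[int] of target token ids
      decoder(token_ids)  -> torch.Tensor of shape (T, d), one hidden state per position
    """
    keys, values = [], []
    for sentence in target_sentences:
        token_ids = tokenizer(sentence)
        hidden_states = decoder(token_ids)
        for t in range(len(token_ids) - 1):
            keys.append(hidden_states[t].cpu().numpy())  # context representation (key)
            values.append(token_ids[t + 1])              # next target token (value)
    return np.stack(keys), np.array(values)
```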
arXiv Detail & Related papers (2021-09-14T11:50:01Z)
- Fast Text-Only Domain Adaptation of RNN-Transducer Prediction Network [0.0]
We show that RNN-transducer models can be effectively adapted to new domains using only small amounts of textual data.
We show on multiple ASR evaluation tasks that this method can provide relative gains of 10-45% in target-task WER.
arXiv Detail & Related papers (2021-04-22T15:21:41Z)
- Domain Adaptation in Dialogue Systems using Transfer and Meta-Learning [12.64591916699374]
Current generative-based dialogue systems fail to adapt to new unseen domains when only a small amount of target data is available.
We propose a method that adapts to unseen domains by combining both transfer and meta-learning.
arXiv Detail & Related papers (2021-02-22T16:16:57Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
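A hedged sketch of one training update combining the three objectives named above; all loss functions and the back-translation helper are assumed callables supplied by the caller, not the paper's implementation.

```python
def semi_supervised_step(model, optimizer, lm_loss_fn, mt_loss_fn, back_translate,
                         mono_src, mono_tgt, parallel_batch):
    """One illustrative update over the three objectives (sketch only).

    Assumed callables:
      lm_loss_fn(model, sentences)        -> scalar language-modeling loss
      mt_loss_fn(model, sources, targets) -> scalar translation loss
      back_translate(targets)             -> synthetic source sentences
    """
    optimizer.zero_grad()

    # 1. Language modeling on monolingual text in both languages.
    loss = lm_loss_fn(model, mono_src) + lm_loss_fn(model, mono_tgt)

    # 2. Back-translation: pair in-domain monolingual target text with synthetic sources.
    synthetic_sources = back_translate(mono_tgt)
    loss = loss + mt_loss_fn(model, synthetic_sources, mono_tgt)

    # 3. Supervised translation on whatever in-domain parallel data is available.
    src, tgt = parallel_batch
    loss = loss + mt_loss_fn(model, src, tgt)

    loss.backward()
    optimizer.step()
    return loss.item()
```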
arXiv Detail & Related papers (2020-01-22T16:42:06Z)