Unsupervised Domain Adaption for Neural Information Retrieval
- URL: http://arxiv.org/abs/2310.09350v1
- Date: Fri, 13 Oct 2023 18:27:33 GMT
- Title: Unsupervised Domain Adaption for Neural Information Retrieval
- Authors: Carlos Dominguez, Jon Ander Campos, Eneko Agirre, Gorka Azkune
- Abstract summary: We compare synthetic annotation by query generation using Large Language Models or rule-based string manipulation.
We find that Large Language Models outperform rule-based methods in all scenarios by a large margin.
In addition, we explore several sizes of open Large Language Models to generate synthetic data and find that a medium-sized model suffices.
- Score: 18.97486314518283
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Neural information retrieval requires costly annotated data for each target
domain to be competitive. Synthetic annotation by query generation using Large
Language Models or rule-based string manipulation has been proposed as an
alternative, but their relative merits have not been analysed. In this paper,
we compare both methods head-to-head using the same neural IR architecture. We
focus on the BEIR benchmark, which includes test datasets from several domains
with no training data, and explore two scenarios: zero-shot, where the
supervised system is trained on a large out-of-domain dataset (MS-MARCO); and
unsupervised domain adaptation, where, in addition to MS-MARCO, the system is
fine-tuned on synthetic data from the target domain. Our results indicate that
Large Language Models outperform rule-based methods in all scenarios by a large
margin, and, more importantly, that unsupervised domain adaptation is effective
compared to applying a supervised IR system in a zero-shot fashion. In addition,
we explore several sizes of open Large Language Models to generate synthetic
data and find that a medium-sized model suffices. Code and models are publicly
available for reproducibility.
Related papers
- Fine-Tuning or Fine-Failing? Debunking Performance Myths in Large Language Models [0.8399688944263842]
Large Language Models (LLMs) have the capability to understand and generate human-like text from input queries.
This study extends this concept to the integration of LLMs within Retrieval-Augmented Generation (RAG) pipelines.
We evaluate the impact of fine-tuning on the LLMs' capacity for data extraction and contextual understanding.
arXiv Detail & Related papers (2024-06-17T04:35:17Z)
- Non-stationary Domain Generalization: Theory and Algorithm [11.781050299571692]
In this paper, we study domain generalization in a non-stationary environment.
We first examine the impact of environmental non-stationarity on model performance.
Then, we propose a novel algorithm based on adaptive invariant representation learning.
arXiv Detail & Related papers (2024-05-10T21:32:43Z)
- One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data.
Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation.
We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z)
- Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- Learning causal representations for robust domain adaptation [31.261956776418618]
In many real-world applications, target domain data may not always be available.
In this paper, we study the cases where at the training phase the target domain data is unavailable.
We propose a novel Causal AutoEncoder (CAE), which integrates deep autoencoder and causal structure learning into a unified model.
arXiv Detail & Related papers (2020-11-12T11:24:03Z)
- Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
- Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation [102.67010690592011]
Unsupervised domain adaptation (UDA) aims to leverage the knowledge learned from a labeled source dataset to solve similar tasks in a new unlabeled domain.
Prior UDA methods typically require access to the source data when learning to adapt the model.
This work tackles a practical setting where only a trained source model is available and how we can effectively utilize such a model without source data to solve UDA problems.
arXiv Detail & Related papers (2020-02-20T03:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.