Dense Retrieval Adaptation using Target Domain Description
- URL: http://arxiv.org/abs/2307.02740v1
- Date: Thu, 6 Jul 2023 02:59:47 GMT
- Title: Dense Retrieval Adaptation using Target Domain Description
- Authors: Helia Hashemi, Yong Zhuang, Sachith Sri Ram Kothur, Srivas Prasad, Edgar Meij, W. Bruce Croft
- Abstract summary: Domain adaptation is the process of adapting a retrieval model to a new domain whose data distribution is different from the source domain.
We introduce a novel automatic data construction pipeline that produces a synthetic document collection, query set, and pseudo relevance labels.
Experiments on five diverse target domains show that adapting dense retrieval models using the constructed synthetic data leads to effective retrieval performance on the target domain.
- Score: 18.120678619163037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In information retrieval (IR), domain adaptation is the process of adapting a
retrieval model to a new domain whose data distribution is different from the
source domain. Existing methods in this area focus on unsupervised domain
adaptation where they have access to the target document collection or
supervised (often few-shot) domain adaptation where they additionally have
access to (limited) labeled data in the target domain. There also exists
research on improving zero-shot performance of retrieval models with no
adaptation. This paper introduces a new category of domain adaptation in IR
that is as-yet unexplored. Here, similar to the zero-shot setting, we assume
the retrieval model does not have access to the target document collection. In
contrast, it does have access to a brief textual description that explains the
target domain. We define a taxonomy of domain attributes in retrieval tasks to
understand different properties of a source domain that can be adapted to a
target domain. We introduce a novel automatic data construction pipeline that
produces a synthetic document collection, query set, and pseudo relevance
labels, given a textual domain description. Extensive experiments on five
diverse target domains show that adapting dense retrieval models using the
constructed synthetic data leads to effective retrieval performance on the
target domain.
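The data flow described in the abstract (domain description in; synthetic documents, queries, and pseudo relevance labels out) can be illustrated with a toy sketch. This is not the authors' pipeline: the templated document generator, the query generator, and the token-overlap labeler are hypothetical stand-ins for whatever generative and scoring models the paper actually uses, and all names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SyntheticDataset:
    documents: dict[str, str]          # doc_id -> text
    queries: dict[str, str]            # query_id -> text
    labels: dict[str, dict[str, int]]  # query_id -> {doc_id: pseudo relevance}


def generate_documents(description: str, topics: list[str]) -> dict[str, str]:
    """Hypothetical stand-in for a text generator: one templated document per topic."""
    return {
        f"d{i}": f"{topic} in the context of {description}"
        for i, topic in enumerate(topics)
    }


def generate_query(document: str) -> str:
    """Hypothetical stand-in for a query generator: ask about the first two words."""
    return "what about " + " ".join(document.split()[:2])


def overlap(query: str, document: str) -> int:
    """Hypothetical pseudo-labeler: number of shared lowercase tokens."""
    return len(set(query.lower().split()) & set(document.lower().split()))


def build_synthetic_dataset(description: str, topics: list[str]) -> SyntheticDataset:
    # Step 1: synthetic document collection from the domain description.
    documents = generate_documents(description, topics)
    # Step 2: one synthetic query per synthetic document.
    queries = {
        f"q{i}": generate_query(text) for i, text in enumerate(documents.values())
    }
    # Step 3: pseudo relevance labels; the best-overlapping documents get label 1.
    labels: dict[str, dict[str, int]] = {}
    for qid, query in queries.items():
        scores = {did: overlap(query, text) for did, text in documents.items()}
        best = max(scores.values())
        labels[qid] = {did: 1 for did, s in scores.items() if s == best and s > 0}
    return SyntheticDataset(documents, queries, labels)
```

The resulting query, document, and label triples are the kind of input a dense retriever could be fine-tuned on; in a realistic setting each stand-in function would be replaced by a learned model.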
Related papers
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection [109.58348694132091]
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification [46.47734465505251]
Cross-domain text classification aims to adapt models to a target domain that lacks labeled data.
We propose a two-stage framework for cross-domain text classification.
arXiv Detail & Related papers (2023-04-18T06:21:40Z)
- Domain-Agnostic Prior for Transfer Semantic Segmentation [197.9378107222422]
Unsupervised domain adaptation (UDA) is an important topic in the computer vision community.
We present a mechanism that regularizes cross-domain representation learning with a domain-agnostic prior (DAP).
Our research reveals that UDA benefits much from better proxies, possibly from other data modalities.
arXiv Detail & Related papers (2022-04-06T09:13:25Z)
- Instance Relation Graph Guided Source-Free Domain Adaptive Object Detection [79.89082006155135]
Unsupervised Domain Adaptation (UDA) is an effective approach to tackle the issue of domain shift.
UDA methods try to align the source and target representations to improve the generalization on the target domain.
The Source-Free Domain Adaptation (SFDA) setting aims to alleviate these concerns by adapting a source-trained model for the target domain without requiring access to the source data.
arXiv Detail & Related papers (2022-03-29T17:50:43Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Domain Adaptation with Incomplete Target Domains [61.68950959231601]
We propose an Incomplete Data Imputation based Adversarial Network (IDIAN) model to address this new domain adaptation challenge.
In the proposed model, we design a data imputation module to fill the missing feature values based on the partial observations in the target domain.
We conduct experiments on both cross-domain benchmark tasks and a real world adaptation task with imperfect target domains.
arXiv Detail & Related papers (2020-12-03T00:07:40Z)
- Learning to Cluster under Domain Shift [20.00056591000625]
In this work we address the problem of transferring knowledge from a source to a target domain when both source and target data have no annotations.
Inspired by recent works on deep clustering, our approach leverages information from data gathered from multiple source domains.
We show that our method is able to automatically discover relevant semantic information even in the presence of few target samples.
arXiv Detail & Related papers (2020-08-11T12:03:01Z)
- Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels [78.95901454696158]
We propose a novel Cross-Domain Self-supervised learning approach for domain adaptation.
Our method significantly boosts accuracy in the new target domain with few source labels.
arXiv Detail & Related papers (2020-03-18T15:11:07Z)
- Enlarging Discriminative Power by Adding an Extra Class in Unsupervised Domain Adaptation [5.377369521932011]
We propose a way to strengthen discriminativeness: adding a new, artificial class and training the model on the data together with GAN-generated samples of the new class.
Our idea is highly generic so that it is compatible with many existing methods such as DANN, VADA, and DIRT-T.
arXiv Detail & Related papers (2020-02-19T07:58:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.