Using Language to Extend to Unseen Domains
- URL: http://arxiv.org/abs/2210.09520v6
- Date: Sat, 29 Apr 2023 18:00:13 GMT
- Title: Using Language to Extend to Unseen Domains
- Authors: Lisa Dunlap, Clara Mohri, Devin Guillory, Han Zhang, Trevor Darrell,
Joseph E. Gonzalez, Aditi Raghunathan, Anja Rohrbach
- Abstract summary: It is expensive to collect training data for every possible domain that a vision model may encounter when deployed.
We consider how simply verbalizing the training domain as well as domains we want to extend to but do not have data for can improve robustness.
Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the training domain to each unseen test domain.
- Score: 81.37175826824625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is expensive to collect training data for every possible domain that a
vision model may encounter when deployed. We instead consider how simply
verbalizing the training domain (e.g. "photos of birds") as well as domains we
want to extend to but do not have data for (e.g. "paintings of birds") can
improve robustness. Using a multimodal model with a joint image and language
embedding space, our method LADS learns a transformation of the image
embeddings from the training domain to each unseen test domain, while
preserving task-relevant information. Without using any images from the unseen
test domain, we show that over the extended domain containing both training and
unseen test domains, LADS outperforms standard fine-tuning and ensemble
approaches over a suite of four benchmarks targeting domain adaptation and
dataset bias.
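The abstract describes LADS as learning a map from training-domain image embeddings to each verbalized unseen domain inside a joint image-text space. As a rough illustration only, the sketch below stands in for that learned map with a simple directional shift along the source-to-target text-embedding direction; the function names, the shift heuristic, and the `alpha` parameter are my own assumptions, not the paper's actual trained objective or code.

```python
import numpy as np

def normalize(v):
    """L2-normalize, matching the unit-norm convention of CLIP-like embeddings."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def domain_shift(img_emb, src_text_emb, tgt_text_emb, alpha=1.0):
    """Move an image embedding along the source->target domain direction
    in the joint space. LADS *learns* this transformation with losses that
    preserve task-relevant information; this fixed shift is only a toy
    stand-in for the idea.
    """
    direction = normalize(tgt_text_emb - src_text_emb)
    return normalize(img_emb + alpha * direction)
```

With embeddings for "photos of birds" as the source text and "paintings of birds" as the target, the shifted image embedding ends up closer (by cosine similarity) to the target-domain description than the original embedding was.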
Related papers
- A Unified Data Augmentation Framework for Low-Resource Multi-Domain Dialogue Generation [52.0964459842176]
Current state-of-the-art dialogue systems heavily rely on extensive training datasets.
We propose a novel data Augmentation framework for Multi-Domain Dialogue Generation, referred to as AMD$^2$G.
The AMD$^2$G framework consists of a data augmentation process and a two-stage training approach: domain-agnostic training and domain adaptation training.
arXiv Detail & Related papers (2024-06-14T09:52:27Z)
- WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization [63.98650220772378]
We present WIDIn, Wording Images for Domain-Invariant representation, to disentangle discriminative visual representation.
We first estimate the language embedding with fine-grained alignment, which can be used to adaptively identify and then remove the domain-specific counterpart.
We show that WIDIn can be applied to both pretrained vision-language models like CLIP, and separately trained uni-modal models like MoCo and BERT.
arXiv Detail & Related papers (2024-05-28T17:46:27Z)
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection [109.58348694132091]
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- Domain-invariant Prototypes for Semantic Segmentation [30.932130453313537]
We present an easy-to-train framework that learns domain-invariant prototypes for domain adaptive semantic segmentation.
Our method involves only one-stage training and does not need to be trained on large-scale un-annotated target images.
arXiv Detail & Related papers (2022-08-12T02:21:05Z)
- Batch Normalization Embeddings for Deep Domain Generalization [50.51405390150066]
Domain generalization aims at training machine learning models to perform robustly across different and unseen domains.
We show a significant increase in classification accuracy over current state-of-the-art techniques on popular domain generalization benchmarks.
arXiv Detail & Related papers (2020-11-25T12:02:57Z)
- Crossing-Domain Generative Adversarial Networks for Unsupervised Multi-Domain Image-to-Image Translation [12.692904507625036]
We propose a general framework for unsupervised image-to-image translation across multiple domains.
Our proposed framework consists of a pair of encoders and a pair of GANs that learn high-level features across different domains and generate diverse, realistic samples.
arXiv Detail & Related papers (2020-08-27T01:54:07Z)
- Multi-Domain Spoken Language Understanding Using Domain- and Task-Aware Parameterization [78.93669377251396]
Spoken language understanding has been addressed as a supervised learning problem, where a set of training data is available for each domain.
One existing approach solves the problem by conducting multi-domain learning, using shared parameters for joint training across domains.
We propose to improve the parameterization of this method by using domain-specific and task-specific model parameters.
arXiv Detail & Related papers (2020-04-30T15:15:40Z)
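The last entry above proposes combining shared parameters with domain-specific and task-specific ones. As a purely illustrative toy (my simplification; the class name, linear form, and additive composition are assumptions, not the paper's architecture), such a parameterization can be pictured as a scorer whose effective weights sum a shared component with per-domain and per-task components:

```python
import numpy as np

class DomainTaskParams:
    """Toy additive parameterization: effective weights are the sum of
    shared, per-domain, and per-task vectors. Illustrative only."""

    def __init__(self, dim, domains, tasks, seed=0):
        rng = np.random.default_rng(seed)
        self.shared = rng.normal(size=dim)            # trained jointly across domains
        self.domain = {d: rng.normal(size=dim) for d in domains}  # domain-specific
        self.task = {t: rng.normal(size=dim) for t in tasks}      # task-specific

    def score(self, x, domain, task):
        w = self.shared + self.domain[domain] + self.task[task]
        return float(x @ w)
```

The design point is that knowledge common to all domains lives in the shared component, while the small per-domain and per-task vectors let predictions differ across, say, a hotel-booking and a taxi-booking domain without duplicating the whole model.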
This list is automatically generated from the titles and abstracts of the papers in this site.