Multi-Domain Learning From Insufficient Annotations
- URL: http://arxiv.org/abs/2305.02757v3
- Date: Fri, 28 Jul 2023 06:04:11 GMT
- Title: Multi-Domain Learning From Insufficient Annotations
- Authors: Rui He, Shengcai Liu, Jiahao Wu, Shan He, Ke Tang
- Abstract summary: Multi-domain learning refers to simultaneously constructing a model or a set of models on datasets collected from different domains.
In this paper, we introduce a novel method called multi-domain contrastive learning (MDCL) to alleviate the impact of insufficient annotations.
Experimental results across five datasets demonstrate that MDCL brings noticeable improvement over various SP models.
- Score: 26.83058974786833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-domain learning (MDL) refers to simultaneously constructing a model or
a set of models on datasets collected from different domains. Conventional
approaches emphasize domain-shared information extraction and domain-private
information preservation, following the shared-private framework (SP models),
which offers significant advantages over single-domain learning. However, the
limited availability of annotated data in each domain considerably hinders the
effectiveness of conventional supervised MDL approaches in real-world
applications. In this paper, we introduce a novel method called multi-domain
contrastive learning (MDCL) to alleviate the impact of insufficient annotations
by capturing both semantic and structural information from both labeled and
unlabeled data. Specifically, MDCL comprises two modules: inter-domain semantic
alignment and intra-domain contrast. The former aims to align annotated
instances of the same semantic category from distinct domains within a shared
hidden space, while the latter focuses on learning a cluster structure of
unlabeled instances in a private hidden space for each domain. MDCL is readily
compatible with many SP models, requiring no additional model parameters and
allowing for end-to-end training. Experimental results across five textual and
image multi-domain datasets demonstrate that MDCL brings noticeable improvement
over various SP models. Furthermore, MDCL can be employed in
multi-domain active learning (MDAL) to achieve a superior initialization,
eventually leading to better overall performance.
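Since the abstract specifies MDCL at the level of two training objectives, a minimal PyTorch sketch may help make them concrete. It is a reconstruction from the abstract alone, not the authors' code: the SupCon-style alignment term, the SimCLR-style intra-domain term, and the temperature `tau` are assumptions.
```python
# Hedged sketch of the two MDCL modules as contrastive losses. Assumed: an
# SP model exposing shared embeddings (all domains) and per-domain private
# embeddings; tau and both loss forms are guesses, not the paper's spec.
import torch
import torch.nn.functional as F

def inter_domain_alignment(z_shared, labels, tau=0.1):
    """Pull labeled instances of the same class together in the SHARED
    space, regardless of which domain they came from (SupCon-style)."""
    z = F.normalize(z_shared, dim=1)
    sim = z @ z.t() / tau
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    pos = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~eye
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(eye, float("-inf")), dim=1, keepdim=True)
    return -((log_prob * pos).sum(1) / pos.sum(1).clamp(min=1)).mean()

def intra_domain_contrast(z_view_a, z_view_b, tau=0.1):
    """Instance-level contrast between two augmented views of one domain's
    unlabeled batch, in that domain's PRIVATE space (SimCLR-style); this
    is what encourages a per-domain cluster structure."""
    z = F.normalize(torch.cat([z_view_a, z_view_b]), dim=1)
    n = z_view_a.size(0)
    sim = z @ z.t() / tau
    sim.fill_diagonal_(float("-inf"))   # a view is not its own positive
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```
Under this reading, the total objective adds a weighted sum of these two terms to the SP model's usual supervised loss, which is consistent with the claim that MDCL introduces no extra model parameters and trains end to end.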
Related papers
- Improving Intrusion Detection with Domain-Invariant Representation Learning in Latent Space [4.871119861180455]
We introduce a two-phase representation learning technique using multi-task learning.
We disentangle the latent space by minimizing the mutual information between the prior and latent space.
We assess the model's efficacy across multiple cybersecurity datasets.
arXiv Detail & Related papers (2023-12-28T17:24:13Z)
- Adapting Self-Supervised Representations to Multi-Domain Setups [47.03992469282679]
Current state-of-the-art self-supervised approaches are effective when trained on individual domains but show limited generalization to unseen domains.
We propose a general-purpose, lightweight Domain Disentanglement Module that can be plugged into any self-supervised encoder.
arXiv Detail & Related papers (2023-09-07T20:05:39Z)
- M2D2: A Massively Multi-domain Language Modeling Dataset [76.13062203588089]
We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs).
Using categories derived from Wikipedia and ArXiv, we organize the domains in each data source into 22 groups.
We show the benefits of adapting the LM along a domain hierarchy; adapting to smaller amounts of fine-grained domain-specific data can lead to larger in-domain performance gains.
arXiv Detail & Related papers (2022-10-13T21:34:52Z)
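The M2D2 entry above reports gains from adapting along a domain hierarchy; the sketch below shows the staged fine-tuning this implies. The loader names and the HF-style `labels` interface are assumed for illustration, not taken from the paper.
```python
# Hypothetical staged adaptation: general LM -> coarse parent domain ->
# fine-grained child domain. Loaders yielding token-id batches are assumed.
import torch

def adapt(model, loader, lr=5e-5, device="cuda"):
    """One adaptation stage of causal-LM fine-tuning."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train().to(device)
    for input_ids in loader:
        input_ids = input_ids.to(device)
        loss = model(input_ids=input_ids, labels=input_ids).loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# e.g. a Wikipedia "Science" parent and an ArXiv "Physics" child:
# model = adapt(model, science_loader)   # coarse domain first
# model = adapt(model, physics_loader)   # then the in-domain data
```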
- Multi-Modal Cross-Domain Alignment Network for Video Moment Retrieval [55.122020263319634]
Video moment retrieval (VMR) aims to localize the target moment from an untrimmed video according to a given language query.
In this paper, we focus on a novel task: cross-domain VMR, where fully-annotated datasets are available in one domain but the domain of interest only contains unannotated datasets.
We propose a novel Multi-Modal Cross-Domain Alignment network to transfer the annotation knowledge from the source domain to the target domain.
arXiv Detail & Related papers (2022-09-23T12:58:20Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains into a common latent space.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation [102.42638795864178]
We propose a principled meta-learning based approach to OCDA for semantic segmentation.
We cluster the target domain into multiple sub-target domains by image style, extracted in an unsupervised manner.
A meta-learner is thereafter deployed to learn to fuse sub-target domain-specific predictions, conditioned upon the style code.
We learn to update the model online via the model-agnostic meta-learning (MAML) algorithm, which further improves generalization.
arXiv Detail & Related papers (2020-12-15T13:21:54Z)
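The OCDA entry above hinges on a MAML-style online update; below is the generic inner/outer MAML step it builds on. This is a sketch only: the paper's style-code conditioning and prediction-fusion meta-learner are omitted, and all names are illustrative.
```python
# Generic MAML step (second-order, via create_graph). `tasks` yields
# ((x_support, y_support), (x_query, y_query)) pairs, e.g. one per
# sub-target domain.
import torch
from torch.func import functional_call

def maml_step(model, tasks, loss_fn, inner_lr=1e-2):
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for (xs, ys), (xq, yq) in tasks:
        # Inner loop: one gradient step on the support set, keeping the
        # graph so the outer update can differentiate through it.
        loss_s = loss_fn(functional_call(model, params, (xs,)), ys)
        grads = torch.autograd.grad(loss_s, tuple(params.values()),
                                    create_graph=True)
        fast = {name: p - inner_lr * g
                for (name, p), g in zip(params.items(), grads)}
        # Outer objective: how well the adapted weights do on the query set.
        meta_loss = meta_loss + loss_fn(functional_call(model, fast, (xq,)), yq)
    return meta_loss / len(tasks)

# meta_loss = maml_step(model, tasks, torch.nn.functional.cross_entropy)
# meta_loss.backward(); outer_optimizer.step()   # one meta-update
```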
- Multifaceted Context Representation using Dual Attention for Ontology Alignment [6.445605125467574]
Ontology alignment is an important research problem with applications in various fields such as data integration, data transfer, and data preparation.
We propose VeeAlign, a Deep Learning based model that uses a dual-attention mechanism to compute the contextualized representation of a concept in order to learn alignments.
We validate our approach on various datasets from different domains and in multilingual settings, and show its superior performance over SOTA methods.
arXiv Detail & Related papers (2020-10-16T18:28:38Z)
- Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation [56.694330303488435]
We propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework.
In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize information propagation among semantically adjacent representations.
Our approach outperforms existing methods by a remarkable margin.
arXiv Detail & Related papers (2020-07-17T07:52:44Z)
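The "Learning to Combine" entry above propagates information over a graph of class prototypes; a toy single-layer version of that idea is sketched below. The similarity-softmax adjacency and the learnable `weight` matrix are assumptions, not the paper's construction.
```python
# Toy message passing over class prototypes pooled from several source
# domains. prototypes: (K, d); weight: (d, d) learnable transform (assumed).
import torch
import torch.nn.functional as F

def propagate(prototypes, weight):
    z = F.normalize(prototypes, dim=1)
    adj = torch.softmax(z @ z.t(), dim=1)     # semantically adjacent nodes
                                              # receive the largest edge weights
    return F.relu(adj @ prototypes @ weight)  # aggregate, then transform
```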
- Unified Multi-Domain Learning and Data Imputation using Adversarial Autoencoder [5.933303832684138]
We present a novel framework that combines multi-domain learning (MDL), data imputation (DI), and multi-task learning (MTL).
The core of our method is an adversarial autoencoder that can: (1) learn to produce domain-invariant embeddings to reduce the difference between domains; (2) learn the data distribution for each domain and correctly perform data imputation on missing data.
arXiv Detail & Related papers (2020-03-15T19:55:07Z)
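For the adversarial-autoencoder entry above, a compressed sketch of its two stated abilities follows: a domain discriminator on the latent code pushes the encoder toward domain-invariant embeddings, and the decoder's reconstruction doubles as imputation of masked features. Layer sizes, the loss weighting, and the sign-flipped discriminator term (a stand-in for gradient reversal) are all assumptions.
```python
# Minimal adversarial-autoencoder sketch for MDL + data imputation.
import torch
import torch.nn as nn
import torch.nn.functional as F

enc = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 16))
dec = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))
disc = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 3))  # 3 domains

def autoencoder_step(x, domain_ids, mask):
    """mask is 1 where a feature is observed; missing entries come back
    imputed in `recon`. The encoder is rewarded for confusing `disc`."""
    z = enc(x * mask)                                # encode observed values
    recon = dec(z)
    rec_loss = ((recon - x).pow(2) * mask).sum() / mask.sum()
    confuse = -F.cross_entropy(disc(z), domain_ids)  # crude reversal stand-in
    return rec_loss + 0.1 * confuse, recon
# In practice disc is trained in an alternating step with the opposite sign.
```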
- Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN [117.80737222754306]
We present a novel universal object detector called Universal-RCNN.
We first generate a global semantic pool by integrating the high-level semantic representations of all categories.
An Intra-Domain Reasoning Module learns and propagates the sparse graph representation within one dataset, guided by a spatial-aware GCN.
arXiv Detail & Related papers (2020-02-18T07:57:45Z)