Stochastic Adversarial Networks for Multi-Domain Text Classification
- URL: http://arxiv.org/abs/2406.00044v1
- Date: Tue, 28 May 2024 00:02:38 GMT
- Title: Stochastic Adversarial Networks for Multi-Domain Text Classification
- Authors: Xu Wang, Yuan Wu
- Abstract summary: We introduce the Stochastic Adversarial Network (SAN), which models the parameters of the domain-specific feature extractor as a multivariate Gaussian distribution.
Our approach integrates domain label smoothing and robust pseudo-label regularization to fortify the stability of adversarial training.
The performance of our SAN, evaluated on two leading MDTC benchmarks, demonstrates its competitive edge against the current state-of-the-art methodologies.
- Score: 9.359945319927675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training has been instrumental in advancing multi-domain text classification (MDTC). Traditionally, MDTC methods employ a shared-private paradigm, with a shared feature extractor for domain-invariant knowledge and individual private feature extractors for domain-specific knowledge. Despite achieving state-of-the-art results, these methods grapple with the escalating model parameters due to the continuous addition of new domains. To address this challenge, we introduce the Stochastic Adversarial Network (SAN), which innovatively models the parameters of the domain-specific feature extractor as a multivariate Gaussian distribution, as opposed to a traditional weight vector. This design allows for the generation of numerous domain-specific feature extractors without a substantial increase in model parameters, maintaining the model's size on par with that of a single domain-specific extractor. Furthermore, our approach integrates domain label smoothing and robust pseudo-label regularization to fortify the stability of adversarial training and to refine feature discriminability, respectively. The performance of our SAN, evaluated on two leading MDTC benchmarks, demonstrates its competitive edge against the current state-of-the-art methodologies. The code is available at https://github.com/wangxu0820/SAN.
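The two ideas in the abstract, a domain-specific extractor whose weights are sampled from a Gaussian and smoothed domain labels for the discriminator, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The class `StochasticLinearExtractor`, the function `smooth_domain_label`, the single linear layer, the diagonal covariance, and the uniform smoothing with `epsilon` are all assumptions made for illustration.

```python
import math
import random


class StochasticLinearExtractor:
    """Hypothetical linear extractor whose weights follow a diagonal Gaussian.

    Assumption: the covariance is diagonal, so each weight has its own mean
    and log standard deviation. The paper only says "multivariate Gaussian".
    """

    def __init__(self, in_dim, out_dim, seed=0):
        self.rng = random.Random(seed)
        self.in_dim = in_dim
        self.out_dim = out_dim
        # Learnable quantities: one mean and one log-std per weight.
        self.mu = [[self.rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                   for _ in range(out_dim)]
        self.log_sigma = [[-2.0] * in_dim for _ in range(out_dim)]

    def sample_weights(self):
        # Reparameterized draw: w = mu + sigma * eps, with eps ~ N(0, 1).
        return [
            [m + math.exp(s) * self.rng.gauss(0.0, 1.0)
             for m, s in zip(mu_row, ls_row)]
            for mu_row, ls_row in zip(self.mu, self.log_sigma)
        ]

    def extract(self, x):
        # Each call draws a fresh weight matrix, i.e. a fresh extractor.
        weights = self.sample_weights()
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    def num_parameters(self):
        # Means plus log-stds; independent of how many extractors are drawn.
        return 2 * self.in_dim * self.out_dim


def smooth_domain_label(domain_index, num_domains, epsilon=0.1):
    """Standard label smoothing applied to a one-hot domain label.

    Assumption: epsilon is spread uniformly over all domains.
    """
    return [
        (1.0 - epsilon) * (1.0 if i == domain_index else 0.0)
        + epsilon / num_domains
        for i in range(num_domains)
    ]


if __name__ == "__main__":
    extractor = StochasticLinearExtractor(in_dim=4, out_dim=3, seed=1)
    print(extractor.extract([1.0, 0.5, -0.5, 2.0]))
    print(extractor.num_parameters())
    print(smooth_domain_label(2, num_domains=4))
```

In this sketch the stored parameter count is fixed at twice that of one deterministic layer, however many weight matrices are sampled, which mirrors the abstract's claim that the model size stays comparable to a single domain-specific extractor.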
Related papers
- Multisource Collaborative Domain Generalization for Cross-Scene Remote Sensing Image Classification [57.945437355714155]
Cross-scene image classification aims to transfer prior knowledge of ground materials to annotate regions with different distributions.
Existing approaches focus on single-source domain generalization to unseen target domains.
We propose a novel multi-source collaborative domain generalization framework (MS-CDG) based on homogeneity and heterogeneity characteristics of multi-source remote sensing data.
arXiv Detail & Related papers (2024-12-05T06:15:08Z)
- Boundless Across Domains: A New Paradigm of Adaptive Feature and Cross-Attention for Domain Generalization in Medical Image Segmentation [1.93061220186624]
Domain-invariant representation learning is a powerful method for domain generalization.
Previous approaches face challenges such as high computational demands, training instability, and limited effectiveness with high-dimensional data.
We propose an Adaptive Feature Blending (AFB) method that generates out-of-distribution samples while exploring the in-distribution space.
arXiv Detail & Related papers (2024-11-22T12:06:24Z)
- Feature-Space Semantic Invariance: Enhanced OOD Detection for Open-Set Domain Generalization [10.38552112657656]
We propose a unified framework for open-set domain generalization by introducing Feature-space Semantic Invariance (FSI).
FSI maintains semantic consistency across different domains within the feature space, enabling more accurate detection of OOD instances in unseen domains.
We also adopt a generative model to produce synthetic data with novel domain styles or class labels, enhancing model robustness.
arXiv Detail & Related papers (2024-11-11T21:51:45Z)
- Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that a straightforward width scaling of Transformer is a simpler and surprisingly more efficient approach in practice, and reaches the same performance level as SMoE.
arXiv Detail & Related papers (2024-07-01T09:45:22Z)
- Regularized Conditional Alignment for Multi-Domain Text Classification [6.629561563470492]
We propose a method called Regularized Conditional Alignment (RCA) to align the joint distributions of domains and classes.
We employ entropy minimization and virtual adversarial training to constrain the uncertainty of predictions pertaining to unlabeled data.
Empirical results on two benchmark datasets demonstrate that our RCA approach outperforms state-of-the-art MDTC techniques.
arXiv Detail & Related papers (2023-12-18T05:52:05Z)
- Adaptive Domain Generalization via Online Disagreement Minimization [17.215683606365445]
Domain Generalization aims to safely transfer a model to unseen target domains.
AdaODM adaptively modifies the source model at test time for different target domains.
Results show AdaODM stably improves the generalization capacity on unseen domains.
arXiv Detail & Related papers (2022-08-03T11:51:11Z)
- A Novel Mix-normalization Method for Generalizable Multi-source Person Re-identification [49.548815417844786]
Person re-identification (Re-ID) has achieved great success in the supervised scenario.
It is difficult to directly transfer the supervised model to arbitrary unseen domains due to the model overfitting to the seen source domains.
We propose MixNorm, which consists of domain-aware mix-normalization (DMN) and domain-aware center regularization (DCR).
arXiv Detail & Related papers (2022-01-24T18:09:38Z)
- META: Mimicking Embedding via oThers' Aggregation for Generalizable Person Re-identification [68.39849081353704]
Domain generalizable (DG) person re-identification (ReID) aims to test across unseen domains without access to the target domain data at training time.
This paper presents a new approach called Mimicking Embedding via oThers' Aggregation (META) for DG ReID.
arXiv Detail & Related papers (2021-12-16T08:06:50Z)
- Graphical Modeling for Multi-Source Domain Adaptation [56.05348879528149]
Multi-Source Domain Adaptation (MSDA) focuses on transferring the knowledge from multiple source domains to the target domain.
We propose two types of graphical models, i.e., Conditional Random Field for MSDA (CRF-MSDA) and Markov Random Field for MSDA (MRF-MSDA).
We evaluate these two models on four standard benchmark data sets of MSDA with distinct domain shift and data complexity.
arXiv Detail & Related papers (2021-04-27T09:04:22Z)
- Dual Distribution Alignment Network for Generalizable Person Re-Identification [174.36157174951603]
Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID).
We present a Dual Distribution Alignment Network (DDAN) which handles this challenge by selectively aligning distributions of multiple source domains.
We evaluate our DDAN on a large-scale Domain Generalization Re-ID (DG Re-ID) benchmark.
arXiv Detail & Related papers (2020-07-27T00:08:07Z)