Stochastic Adversarial Networks for Multi-Domain Text Classification
- URL: http://arxiv.org/abs/2406.00044v1
- Date: Tue, 28 May 2024 00:02:38 GMT
- Title: Stochastic Adversarial Networks for Multi-Domain Text Classification
- Authors: Xu Wang, Yuan Wu,
- Abstract summary: We introduce the Adversarial Network (SAN), which innovatively models the parameters of the domain-specific feature extractor.
Our approach integrates domain label smoothing and robust pseudo-label regularization to fortify the stability of adversarial training.
The performance of our SAN, evaluated on two leading MDTC benchmarks, demonstrates its competitive edge against the current state-of-the-art methodologies.
- Score: 9.359945319927675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training has been instrumental in advancing multi-domain text classification (MDTC). Traditionally, MDTC methods employ a shared-private paradigm, with a shared feature extractor for domain-invariant knowledge and individual private feature extractors for domain-specific knowledge. Despite achieving state-of-the-art results, these methods grapple with the escalating model parameters due to the continuous addition of new domains. To address this challenge, we introduce the Stochastic Adversarial Network (SAN), which innovatively models the parameters of the domain-specific feature extractor as a multivariate Gaussian distribution, as opposed to a traditional weight vector. This design allows for the generation of numerous domain-specific feature extractors without a substantial increase in model parameters, maintaining the model's size on par with that of a single domain-specific extractor. Furthermore, our approach integrates domain label smoothing and robust pseudo-label regularization to fortify the stability of adversarial training and to refine feature discriminability, respectively. The performance of our SAN, evaluated on two leading MDTC benchmarks, demonstrates its competitive edge against the current state-of-the-art methodologies. The code is available at https://github.com/wangxu0820/SAN.
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z) - Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that a straightforward width scaling of Transformer is a simpler and surprisingly more efficient approach in practice, and reaches the same performance level as SMoE.
arXiv Detail & Related papers (2024-07-01T09:45:22Z) - ProtoGMM: Multi-prototype Gaussian-Mixture-based Domain Adaptation Model for Semantic Segmentation [0.8213829427624407]
Domain adaptive semantic segmentation aims to generate accurate and dense predictions for an unlabeled target domain.
We propose the ProtoGMM model, which incorporates the GMM into contrastive losses to perform guided contrastive learning.
To achieve increased intra-class semantic similarity, decreased inter-class similarity, and domain alignment between the source and target domains, we employ multi-prototype contrastive learning.
arXiv Detail & Related papers (2024-06-27T14:50:50Z) - Uncertainty-guided Contrastive Learning for Single Source Domain Generalisation [15.907643838530655]
In this paper, we introduce a novel model referred to as Contrastive Uncertainty Domain Generalisation Network (CUDGNet)
The key idea is to augment the source capacity in both input and label spaces through the fictitious domain generator.
Our method also provides efficient uncertainty estimation at inference time from a single forward pass through the generator subnetwork.
arXiv Detail & Related papers (2024-03-12T10:47:45Z) - Regularized Conditional Alignment for Multi-Domain Text Classification [6.629561563470492]
We propose a method called Regularized Conditional Alignment (RCA) to align the joint distributions of domains and classes.
We employ entropy minimization and virtual adversarial training to constrain the uncertainty of predictions pertaining to unlabeled data.
Empirical results on two benchmark datasets demonstrate that our RCA approach outperforms state-of-the-art MDTC techniques.
arXiv Detail & Related papers (2023-12-18T05:52:05Z) - Adaptive Domain Generalization via Online Disagreement Minimization [17.215683606365445]
Domain Generalization aims to safely transfer a model to unseen target domains.
AdaODM adaptively modifies the source model at test time for different target domains.
Results show AdaODM stably improves the generalization capacity on unseen domains.
arXiv Detail & Related papers (2022-08-03T11:51:11Z) - A Novel Mix-normalization Method for Generalizable Multi-source Person
Re-identification [49.548815417844786]
Person re-identification (Re-ID) has achieved great success in the supervised scenario.
It is difficult to directly transfer the supervised model to arbitrary unseen domains due to the model overfitting to the seen source domains.
We propose MixNorm, which consists of domain-aware mix-normalization (DMN) and domain-ware center regularization (DCR)
arXiv Detail & Related papers (2022-01-24T18:09:38Z) - META: Mimicking Embedding via oThers' Aggregation for Generalizable
Person Re-identification [68.39849081353704]
Domain generalizable (DG) person re-identification (ReID) aims to test across unseen domains without access to the target domain data at training time.
This paper presents a new approach called Mimicking Embedding via oThers' Aggregation (META) for DG ReID.
arXiv Detail & Related papers (2021-12-16T08:06:50Z) - Graphical Modeling for Multi-Source Domain Adaptation [56.05348879528149]
Multi-Source Domain Adaptation (MSDA) focuses on transferring the knowledge from multiple source domains to the target domain.
We propose two types of graphical models,i.e. Conditional Random Field for MSDA (CRF-MSDA) and Markov Random Field for MSDA (MRF-MSDA)
We evaluate these two models on four standard benchmark data sets of MSDA with distinct domain shift and data complexity.
arXiv Detail & Related papers (2021-04-27T09:04:22Z) - Mixup Regularized Adversarial Networks for Multi-Domain Text
Classification [16.229317527580072]
Using the shared-private paradigm and adversarial training has significantly improved the performances of multi-domain text classification (MDTC) models.
However, there are two issues for the existing methods.
We propose a mixup regularized adversarial network (MRAN) to address these two issues.
arXiv Detail & Related papers (2021-01-31T15:24:05Z) - Dual Distribution Alignment Network for Generalizable Person
Re-Identification [174.36157174951603]
Domain generalization (DG) serves as a promising solution to handle person Re-Identification (Re-ID)
We present a Dual Distribution Alignment Network (DDAN) which handles this challenge by selectively aligning distributions of multiple source domains.
We evaluate our DDAN on a large-scale Domain Generalization Re-ID (DG Re-ID) benchmark.
arXiv Detail & Related papers (2020-07-27T00:08:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.