Single Domain Generalization with Model-aware Parametric Batch-wise Mixup
- URL: http://arxiv.org/abs/2502.16064v1
- Date: Sat, 22 Feb 2025 03:45:18 GMT
- Title: Single Domain Generalization with Model-aware Parametric Batch-wise Mixup
- Authors: Marzi Heidari, Yuhong Guo
- Abstract summary: Single Domain Generalization remains a formidable challenge in the field of machine learning. We propose a novel data augmentation approach named Model-aware Parametric Batch-wise Mixup. By exploiting inter-feature correlations, the parameterized mixup generator introduces additional versatility in combining features across a batch of instances.
- Score: 22.709796153794507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single Domain Generalization (SDG) remains a formidable challenge in the field of machine learning, particularly when models are deployed in environments that differ significantly from their training domains. In this paper, we propose a novel data augmentation approach, named Model-aware Parametric Batch-wise Mixup (MPBM), to tackle the challenge of SDG. MPBM deploys adversarial queries generated with stochastic gradient Langevin dynamics, and produces model-aware augmenting instances with a parametric batch-wise mixup generator network that is carefully designed through an innovative attention mechanism. By exploiting inter-feature correlations, the parameterized mixup generator introduces additional versatility in combining features across a batch of instances, thereby enhancing the capacity to generate highly adaptive and informative synthetic instances for specific queries. The synthetic data produced by this adaptable generator network, guided by informative queries, is expected to significantly enrich the representation space covered by the original training dataset and subsequently enhance the prediction model's generalizability across diverse and previously unseen domains. To prevent excessive deviation from the training data, we further incorporate a real-data alignment-based adversarial loss into the learning process of MPBM, regularizing any tendency toward undesirable expansion. We conduct extensive experiments on several benchmark datasets. The empirical results demonstrate that by augmenting the training set with informative synthetic data, our proposed MPBM method achieves state-of-the-art performance for single domain generalization.
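Reading only the abstract, the two moving parts (SGLD-driven adversarial queries and an attention-parameterized mixup over the batch) can be sketched roughly as follows. This is our reading, not the authors' code: all names, the energy choice, and the hyperparameters are illustrative placeholders.

```python
import torch
import torch.nn.functional as F

def sgld_queries(features, classifier, steps=10, step_size=0.01):
    # Perturb batch features with stochastic gradient Langevin dynamics so
    # they become informative, high-loss queries for the mixup generator.
    labels = classifier(features).argmax(dim=1)        # pseudo-labels
    q = features.clone().detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(classifier(q), labels)
        grad, = torch.autograd.grad(loss, q)
        noise = torch.randn_like(q) * (2 * step_size) ** 0.5
        # Gradient *ascent* on the loss plus Gaussian noise (one SGLD step).
        q = (q + step_size * grad + noise).detach().requires_grad_(True)
    return q.detach()

class BatchMixupGenerator(torch.nn.Module):
    # Parametric batch-wise mixup: each query attends over the whole batch,
    # and the softmax attention weights act as learned, query-specific
    # mixing coefficients.
    def __init__(self, dim, d_key=64):
        super().__init__()
        self.wq = torch.nn.Linear(dim, d_key)
        self.wk = torch.nn.Linear(dim, d_key)
        self.d_key = d_key

    def forward(self, queries, batch_features):
        scores = self.wq(queries) @ self.wk(batch_features).T / self.d_key ** 0.5
        mix = torch.softmax(scores, dim=1)             # rows sum to one
        return mix @ batch_features                    # convex combinations
```

Because the mixing weights come from a learned attention map rather than a fixed Beta draw, each synthetic instance is a query-specific convex combination of the entire batch instead of a two-sample interpolation.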
Related papers
- DPGIIL: Dirichlet Process-Deep Generative Model-Integrated Incremental Learning for Clustering in Transmissibility-based Online Structural Anomaly Detection [0.0]
This work proposes the Dirichlet process-deep generative model-integrated incremental learning (DPGIIL) approach for clustering. By introducing a Dirichlet process mixture model (DPMM) prior over the latent space of deep generative models (DGMs), DPGIIL automatically captures dissimilarities in the extracted latent representations, enabling both generative modeling and clustering. Two case studies show that the proposed method outperforms some state-of-the-art approaches in structural anomaly detection and clustering.
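The central modeling idea, a Dirichlet-process mixture over DGM latent codes, can be approximated with off-the-shelf tools. The sketch below uses scikit-learn's truncated DP mixture and a random projection as a stand-in encoder; it omits the paper's incremental variational scheme entirely.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 32))           # placeholder feature batch
latents = X @ rng.normal(size=(32, 8))   # stand-in for a DGM encoder

# Truncated Dirichlet-process mixture over the latent space: components whose
# weights collapse toward zero are effectively pruned, so the number of
# clusters adapts as new data (e.g. new anomaly types) arrives.
dpmm = BayesianGaussianMixture(
    n_components=20,                      # truncation level
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
).fit(latents)
clusters = dpmm.predict(latents)
```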
arXiv Detail & Related papers (2024-12-06T05:18:58Z)
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose MITA, a Meet-In-The-Middle approach that introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
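One way to read "mutual adaptation from opposing directions" is joint descent of a shared test-time energy in both the input batch and the model parameters. The sketch below uses prediction entropy as a stand-in energy; MITA's actual energy-based model and update rules may differ.

```python
import torch

def meet_in_the_middle(model, x, steps=5, lr_x=0.01, lr_theta=1e-4):
    # Mutual test-time adaptation sketch: the data and the model both
    # descend the same energy, meeting somewhere in the middle.
    x = x.clone().detach().requires_grad_(True)
    opt = torch.optim.SGD(model.parameters(), lr=lr_theta)
    for _ in range(steps):
        probs = model(x).softmax(dim=1)
        energy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
        opt.zero_grad()
        grad_x, = torch.autograd.grad(energy, x, retain_graph=True)
        energy.backward()                      # gradient w.r.t. model weights
        opt.step()                             # model moves toward the data
        x = (x - lr_x * grad_x).detach().requires_grad_(True)  # data moves toward the model
    return model, x.detach()
```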
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction [88.65168366064061]
We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference.
Our framework leads to a family of three novel objectives that are all simulation-free, and thus scalable.
We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
arXiv Detail & Related papers (2024-10-10T17:18:30Z)
- Toward the Identifiability of Comparative Deep Generative Models [7.5479347719819865]
We propose a theory of identifiability for comparative Deep Generative Models (DGMs).
We show that, while these models lack identifiability across a general class of mixing functions, they surprisingly become identifiable when the mixing function is piece-wise affine.
We also investigate the impact of model misspecification, and empirically show that previously proposed regularization techniques for fitting comparative DGMs help with identifiability when the number of latent variables is not known in advance.
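For intuition, a piece-wise affine mixing function is exactly what a leaky-ReLU network computes: on each activation region it reduces to an affine map. The toy generator below is our own example of such a mixing function, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=(100, 4))      # latent variables
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 10))

def piecewise_affine_mixing(z, alpha=0.2):
    # Leaky-ReLU between two linear maps: piece-wise affine in z, the class
    # of mixing functions covered by the identifiability result.
    h = z @ W1
    return np.where(h > 0, h, alpha * h) @ W2

x = piecewise_affine_mixing(z)     # observations generated from z
```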
arXiv Detail & Related papers (2024-01-29T06:10:54Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z)
- Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation [0.7614628596146602]
We propose a new approach that expands the model capacity without sacrificing the computational advantages of the VAE framework.
Our VAE model's decoder is composed of an infinite mixture of asymmetric Laplace distributions.
We apply the proposed model to synthetic data generation, and particularly, our model demonstrates superiority in easily adjusting the level of data privacy.
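The decoder's building block has a convenient closed form: maximizing an asymmetric Laplace likelihood is equivalent to minimizing a quantile (pinball) loss at the corresponding level. A minimal sketch, with a two-component mixture standing in for the paper's infinite mixture:

```python
import numpy as np

def asymmetric_laplace_logpdf(y, mu, sigma, tau):
    # Log-density of the asymmetric Laplace(mu, sigma, tau); the exponent is
    # the pinball loss rho_tau, so MLE here is quantile regression at level tau.
    u = (y - mu) / sigma
    pinball = u * (tau - (u < 0))
    return np.log(tau * (1 - tau) / sigma) - pinball

y = np.linspace(-3, 3, 7)
logp = np.logaddexp(  # equal-weight two-component mixture (illustrative)
    np.log(0.5) + asymmetric_laplace_logpdf(y, mu=-1.0, sigma=0.5, tau=0.3),
    np.log(0.5) + asymmetric_laplace_logpdf(y, mu=1.0, sigma=0.5, tau=0.7),
)
```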
arXiv Detail & Related papers (2023-02-22T11:26:50Z)
- DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
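A simplified version of one batch-wise diffusion step is shown below: every instance exchanges information with every other, with strengths given by a plain softmax over pairwise similarities. The paper derives closed-form optimal strengths; this stand-in only conveys the shape of the update.

```python
import numpy as np

def diffusion_step(z, step=0.5):
    # Explicit Euler step of batch-wise diffusion: attention-like pairwise
    # strengths, then a convex move toward the aggregated states.
    sim = z @ z.T / np.sqrt(z.shape[1])
    strength = np.exp(sim - sim.max(axis=1, keepdims=True))
    strength /= strength.sum(axis=1, keepdims=True)
    return (1 - step) * z + step * strength @ z

z = np.random.default_rng(0).normal(size=(16, 8))  # batch of instance states
for _ in range(3):
    z = diffusion_step(z)
```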
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
- Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models [32.52492468276371]
We propose a regularized deep generative model (Reg-DGM) to reduce the variance of generative modeling with limited data.
Reg-DGM uses a pre-trained model to optimize a weighted sum of a certain divergence and the expectation of an energy function.
Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data.
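The objective itself is compact: a base generative divergence plus a weighted expected energy computed from pre-trained features. In the hedged sketch below, a Mahalanobis-style distance is one plausible instantiation of the data-dependent energy; the paper's exact choice may differ.

```python
import torch

def pretrained_energy(x, feat_extractor, mu, cov_inv):
    # Energy from a frozen pre-trained feature extractor: Mahalanobis
    # distance of features from the (limited) real-data statistics.
    d = feat_extractor(x) - mu
    return (d @ cov_inv * d).sum(dim=1)

def reg_dgm_loss(base_divergence, fake, feat_extractor, mu, cov_inv, lam=0.1):
    # Weighted sum of the base divergence (e.g. the generator's adversarial
    # loss) and the expected energy under the model distribution.
    return base_divergence + lam * pretrained_energy(fake, feat_extractor, mu, cov_inv).mean()
```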
arXiv Detail & Related papers (2022-08-30T10:28:50Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
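HyperImpute's round-robin, column-wise scheme generalizes the classic iterative-imputation loop. scikit-learn's IterativeImputer shows that underlying loop with one fixed learner; HyperImpute instead searches per column over candidate models.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[rng.random(X.shape) < 0.2] = np.nan    # knock out 20% of entries

# Each round regresses one column on the current fill of the others,
# cycling until the imputations stabilize.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    max_iter=10,
    random_state=0,
)
X_filled = imputer.fit_transform(X)
```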
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference [55.35176938713946]
We develop a deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network.
We propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a downward generative model.
The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.
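The Weibull choice is what makes the stochastic layers reparameterizable: a Weibull draw has a differentiable inverse-CDF form, so gradients flow through the sampling step. A minimal sketch (ours, not the authors' code):

```python
import torch

def weibull_rsample(k, lam):
    # Inverse-CDF reparameterization: x = lam * (-log(1 - u))^(1/k) with
    # u ~ Uniform(0, 1), differentiable in shape k and scale lam.
    u = torch.rand_like(lam).clamp(1e-6, 1 - 1e-6)
    return lam * (-torch.log1p(-u)) ** (1.0 / k)

k = torch.tensor([2.0], requires_grad=True)
lam = torch.tensor([1.5], requires_grad=True)
weibull_rsample(k, lam).sum().backward()  # gradients reach k and lam
```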
arXiv Detail & Related papers (2020-06-15T22:22:56Z)