Generalizing Few Data to Unseen Domains Flexibly Based on Label Smoothing Integrated with Distributionally Robust Optimization
- URL: http://arxiv.org/abs/2408.05082v1
- Date: Fri, 9 Aug 2024 14:13:33 GMT
- Title: Generalizing Few Data to Unseen Domains Flexibly Based on Label Smoothing Integrated with Distributionally Robust Optimization
- Authors: Yangdi Wang, Zhi-Hai Zhang, Su Xiu Xu, Wenming Guo
- Abstract summary: Overfitting commonly occurs when applying deep neural networks (DNNs) on small-scale datasets.
Label smoothing (LS) is an effective regularization method that prevents overfitting by mixing one-hot labels with uniform label vectors.
We introduce distributionally robust optimization (DRO) into LS, enabling the existing data distribution to be shifted flexibly to unseen domains.
- Score: 0.9374652839580183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Overfitting commonly occurs when applying deep neural networks (DNNs) to small-scale datasets, where DNNs do not generalize well from existing data to unseen data. The main reason for overfitting is that small-scale datasets cannot reflect the situations of the real world. Label smoothing (LS) is an effective regularization method that prevents overfitting by mixing one-hot labels with uniform label vectors. However, LS focuses only on labels while ignoring the distribution of the existing data. In this paper, we introduce distributionally robust optimization (DRO) into LS, enabling the existing data distribution to be shifted flexibly to unseen domains when training DNNs. Specifically, we prove that the regularization of LS can be extended to a regularization term on the DNN parameters when DRO is integrated. This regularization term can be utilized to shift the existing data to unseen domains and generate new data. Furthermore, we propose an approximate gradient-iteration label smoothing algorithm (GI-LS) to realize these findings and train DNNs. We prove that the shift of the existing data does not influence the convergence of GI-LS. Since GI-LS incorporates a series of hyperparameters, we further consider using Bayesian optimization (BO) to find relatively optimal combinations of these hyperparameters. Taking small-scale anomaly classification tasks as a case study, we evaluate GI-LS, and the results clearly demonstrate its superior performance.
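As background (a minimal sketch, not the paper's GI-LS algorithm), the snippet below implements the standard label-smoothing mixture the abstract describes; the function name and the value of eps are illustrative.

```python
import numpy as np

def smooth_labels(y, num_classes, eps=0.1):
    """Standard label smoothing: mix one-hot targets with a uniform vector."""
    one_hot = np.eye(num_classes)[y]                     # (N, K) one-hot rows
    uniform = np.full((1, num_classes), 1.0 / num_classes)
    return (1.0 - eps) * one_hot + eps * uniform         # convex mixture

# Example: three samples, four classes; each row sums to 1,
# with 0.925 on the true class and 0.025 elsewhere.
print(smooth_labels(np.array([0, 2, 3]), num_classes=4, eps=0.1))
```

The paper's contribution replaces this fixed uniform mixing with a DRO-driven shift of the data distribution itself, which the sketch above does not attempt to reproduce.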
Related papers
- HyperSORT: Self-Organising Robust Training with hyper-networks [1.1327019820428537]
HyperSORT is a framework that uses a hyper-network to predict UNet parameters from latent vectors representing both image and annotation variability.
We validate our method on two public 3D abdominal CT datasets.
Latent-space clusters yield UNet parameters that perform the segmentation task in accordance with the underlying learned systematic bias.
arXiv Detail & Related papers (2025-06-26T16:12:34Z)
- SIDDA: SInkhorn Dynamic Domain Adaptation for Image Classification with Equivariant Neural Networks [37.69303106863453]
SIDDA is an out-of-the-box DA training algorithm built upon the Sinkhorn divergence.
We find that SIDDA enhances the generalization capabilities of NNs.
We also study the efficacy of SIDDA on ENNs with respect to the varying group orders of the dihedral group $D_N$.
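For readers unfamiliar with the Sinkhorn divergence, here is a minimal NumPy sketch of the underlying entropy-regularized optimal-transport cost between two feature batches; SIDDA's actual training loss (a debiased Sinkhorn divergence) is not reproduced here, and all names and values are illustrative.

```python
import numpy as np

def sinkhorn_cost(x, y, reg=1.0, n_iter=200):
    """Entropy-regularized OT cost between two batches with uniform weights,
    computed via the Sinkhorn-Knopp scaling iteration."""
    n, m = len(x), len(y)
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    K = np.exp(-C / reg)             # Gibbs kernel; larger reg is more stable
    a, b = np.ones(n) / n, np.ones(m) / m               # uniform marginals
    v = np.ones(m) / m
    for _ in range(n_iter):                             # alternating scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                     # transport plan
    return (P * C).sum()

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, size=(32, 4))   # e.g. source-domain features
tgt = rng.normal(1.0, 1.0, size=(32, 4))   # e.g. target-domain features
print(sinkhorn_cost(src, tgt))             # decreases as the domains align
```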
arXiv Detail & Related papers (2025-01-23T19:29:34Z)
- Generalized EXTRA stochastic gradient Langevin dynamics [11.382163777108385]
Langevin algorithms are popular Markov Chain Monte Carlo methods for Bayesian learning.
Variants such as stochastic gradient Langevin dynamics (SGLD) allow iterative learning based on randomly sampled mini-batches.
When data is decentralized across a network of agents subject to communication and privacy constraints, standard SGLD algorithms cannot be applied.
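As background, a minimal sketch of the standard mini-batch SGLD update (not the paper's generalized EXTRA variant); the step size and the toy example are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(theta, grad_log_post, step_size):
    """One SGLD update: a gradient step on a stochastic estimate of
    grad log p(theta | data) plus properly scaled Gaussian noise."""
    noise = rng.normal(size=theta.shape)
    return theta + 0.5 * step_size * grad_log_post + np.sqrt(step_size) * noise

# Example: sample from a standard normal, where grad log p(x) = -x.
x = np.array([3.0])
for _ in range(1000):
    x = sgld_step(x, grad_log_post=-x, step_size=0.01)
print(x)   # after burn-in, x behaves like a draw from N(0, 1)
```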
arXiv Detail & Related papers (2024-12-02T21:57:30Z)
- Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries via trained neural networks which approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
- Improving Pseudo-labelling and Enhancing Robustness for Semi-Supervised Domain Generalization [7.9776163947539755]
We study the problem of Semi-Supervised Domain Generalization (SSDG), which is crucial for real-world applications like automated healthcare.
We propose a new SSDG approach that utilizes novel uncertainty-guided pseudo-labelling with model averaging.
Our uncertainty-guided pseudo-labelling (UPL) uses model uncertainty to improve pseudo-labelling selection, addressing poor model calibration under multi-source unlabelled data.
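A minimal sketch of the general idea behind uncertainty-guided pseudo-label selection, using predictive entropy as the uncertainty score; the paper's exact UPL criterion and model-averaging scheme may differ, and the threshold is illustrative.

```python
import numpy as np

def select_pseudo_labels(probs, entropy_threshold=0.5):
    """Keep pseudo-labels only where predictive uncertainty is low.

    probs: (N, K) softmax outputs on unlabelled data, e.g. averaged over
    several checkpoints to mitigate poor calibration.
    """
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    keep = entropy < entropy_threshold
    return np.where(keep)[0], probs[keep].argmax(axis=1)

probs = np.array([[0.97, 0.02, 0.01],    # confident  -> retained
                  [0.40, 0.35, 0.25]])   # uncertain  -> discarded
idx, labels = select_pseudo_labels(probs)
print(idx, labels)   # [0] [0]
```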
arXiv Detail & Related papers (2024-01-25T05:55:44Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- LD-GAN: Low-Dimensional Generative Adversarial Network for Spectral Image Generation with Variance Regularization [72.4394510913927]
Deep learning methods are state-of-the-art for spectral image (SI) computational tasks.
GANs enable diverse augmentation by learning and sampling from the data distribution.
GAN-based SI generation is challenging since the high-dimensional nature of this kind of data hinders the convergence of GAN training, yielding suboptimal generation.
We propose a statistical regularization to control the low-dimensional representation variance for the autoencoder training and to achieve high diversity of samples generated with the GAN.
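A minimal sketch of one plausible form of such a penalty: keep the per-dimension variance of the autoencoder's latent codes near a target value, added on top of the reconstruction loss. The paper's exact regularizer may differ; the names and values here are illustrative.

```python
import numpy as np

def variance_regularizer(z, target_var=1.0, lam=0.1):
    """Penalize deviation of each latent dimension's variance from a target,
    keeping the low-dimensional representation well spread for GAN sampling."""
    var = z.var(axis=0)                           # variance per latent dim
    return lam * np.mean((var - target_var) ** 2)

z = np.random.default_rng(0).normal(size=(64, 16))   # a batch of latent codes
print(variance_regularizer(z))   # near 0 for well-spread codes
```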
arXiv Detail & Related papers (2023-04-29T00:25:02Z)
- Distributed Semi-supervised Fuzzy Regression with Interpolation Consistency Regularization [38.16335448831723]
We propose a distributed semi-supervised fuzzy regression (DSFR) model with fuzzy if-then rules and interpolation consistency regularization (ICR).
Experimental results on both artificial and real-world datasets show that the proposed DSFR model achieves much better performance than the state-of-the-art DSSL algorithm.
arXiv Detail & Related papers (2022-09-18T04:46:51Z)
- Integrating Random Effects in Deep Neural Networks [4.860671253873579]
We propose to use the mixed models framework to handle correlated data in deep neural networks.
By treating the effects underlying the correlation structure as random effects, mixed models are able to avoid overfitted parameter estimates.
Our approach, which we call LMMNN, is demonstrated to improve performance over natural competitors in various correlation scenarios.
arXiv Detail & Related papers (2022-06-07T14:02:24Z)
- Shift-Robust GNNs: Overcoming the Limitations of Localized Graph Training Data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
arXiv Detail & Related papers (2021-08-02T18:00:38Z)
- ScRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data [11.511172015076532]
Clustering single-cell RNA sequence (scRNA-seq) data poses statistical and computational challenges.
Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations.
We propose a modified RAE framework (called the scRAE) for effective clustering of the single-cell RNA sequencing data.
arXiv Detail & Related papers (2021-07-16T05:13:31Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
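A minimal sketch of the rank-R idea for a matrix (order-2) input: the hidden-unit weight is constrained to a sum of R rank-1 outer products, so the input is never vectorized and only (I + J) * R parameters are needed. The paper's full model handles higher-order tensors and adds nonlinearities; names here are illustrative.

```python
import numpy as np

def rank_r_response(X, A, B):
    """Pre-activation of one hidden unit with a CP (rank-R) weight matrix.

    X: (I, J) matrix-valued input.  A: (I, R), B: (J, R) factor matrices;
    the implied weight is W = sum_r outer(A[:, r], B[:, r]), and the
    response is <W, X> = sum_r A[:, r].T @ X @ B[:, r].
    """
    return np.einsum('ir,ij,jr->', A, X, B)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 5))                     # e.g. a small spectral patch
A, B = rng.normal(size=(8, 3)), rng.normal(size=(5, 3))   # R = 3 factors
print(rank_r_response(X, A, B))                 # scalar pre-activation
```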
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets.
However, labeling large-scale data can be costly and error-prone, making it difficult to guarantee annotation quality.
We propose Temporal Calibrated Regularization (TCR), which utilizes the original labels together with the predictions from the previous epoch.
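A minimal sketch of one plausible way to use both signals: blend the (possibly noisy) one-hot labels with the previous epoch's predictions as soft training targets. TCR's exact weighting and calibration may differ; alpha is illustrative.

```python
import numpy as np

def blended_targets(one_hot_labels, prev_epoch_probs, alpha=0.7):
    """Soft targets mixing given labels with last epoch's softmax outputs;
    smaller alpha leans on the network's own beliefs, damping label noise."""
    return alpha * one_hot_labels + (1.0 - alpha) * prev_epoch_probs

y = np.array([[0.0, 1.0, 0.0]])    # possibly mislabelled sample
p = np.array([[0.8, 0.1, 0.1]])    # previous-epoch prediction
print(blended_targets(y, p))       # [[0.24 0.73 0.03]]
```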
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
- Robust Self-Supervised Convolutional Neural Network for Subspace Clustering and Classification [0.10152838128195464]
This paper proposes a robust formulation of the self-supervised convolutional subspace clustering network ($S^2$ConvSCN).
In a truly unsupervised training environment, Robust $S2$ConvSCN outperforms its baseline version by a significant amount for both seen and unseen data on four well-known datasets.
arXiv Detail & Related papers (2020-04-03T16:07:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.