X-model: Improving Data Efficiency in Deep Learning with A Minimax Model
- URL: http://arxiv.org/abs/2110.04572v1
- Date: Sat, 9 Oct 2021 13:56:48 GMT
- Title: X-model: Improving Data Efficiency in Deep Learning with A Minimax Model
- Authors: Ximei Wang, Xinyang Chen, Jianmin Wang, Mingsheng Long
- Abstract summary: We aim at improving data efficiency for both classification and regression setups in deep learning.
To combine the best of both worlds, we propose a novel X-model.
X-model plays a minimax game between the feature extractor and task-specific heads.
- Score: 78.55482897452417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To mitigate the burden of data labeling, we aim at improving data efficiency
for both classification and regression setups in deep learning. However, the
current focus is on classification problems, while little attention has been
paid to deep regression, which usually requires more human effort for labeling.
Further, due to the intrinsic difference between categorical and continuous
label space, the common intuitions for classification, e.g., cluster
assumptions or pseudo labeling strategies, cannot be naturally adapted into
deep regression. To this end, we first delved into the existing data-efficient
methods in deep learning and found that they either encourage invariance to
data stochasticity (e.g., consistency regularization under different
augmentations) or model stochasticity (e.g., difference penalty for predictions
of models with different dropout). To combine the best of both worlds, we
propose a novel X-model that simultaneously encourages invariance to data
stochasticity and model stochasticity. Further, the X-model plays a minimax
game between the feature extractor and task-specific heads to further enhance
the invariance to model stochasticity. Extensive experiments verify the
superiority of the X-model across various tasks, from a single-value prediction
task of age estimation to a dense-value prediction task of keypoint
localization, on both a 2D synthetic and a 3D realistic dataset, as well as on
a multi-category object recognition task.
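To make the two invariance terms concrete, here is a minimal sketch of how such a consistency objective could look; the architecture, input sizes, loss form, and the alternating minimax schedule in the comments are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class XModelSketch(nn.Module):
    def __init__(self, in_dim=32, feat_dim=128, out_dim=1):
        super().__init__()
        self.extractor = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        # Two task-specific heads; independent dropout masks make their
        # disagreement a proxy for model stochasticity.
        self.head_a = nn.Sequential(nn.Dropout(0.5), nn.Linear(feat_dim, out_dim))
        self.head_b = nn.Sequential(nn.Dropout(0.5), nn.Linear(feat_dim, out_dim))

    def forward(self, x):
        f = self.extractor(x)
        return self.head_a(f), self.head_b(f)

def invariance_losses(model, x_weak, x_strong):
    ya_w, yb_w = model(x_weak)
    ya_s, _ = model(x_strong)
    # Invariance to data stochasticity: agree across augmented views.
    data_loss = F.mse_loss(ya_w, ya_s)
    # Invariance to model stochasticity: the two heads should agree.
    model_loss = F.mse_loss(ya_w, yb_w)
    return data_loss, model_loss

# Minimax game (sketch): alternate updates so the heads maximize their
# disagreement (step on -model_loss) while the extractor minimizes it
# (step on model_loss), strengthening invariance to model stochasticity.
```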
Related papers
- Source-Free Test-Time Adaptation For Online Surface-Defect Detection [29.69030283193086]
We propose a novel test-time adaptation surface-defect detection approach.
It adapts pre-trained models to new domains and classes during inference.
Experiments demonstrate it outperforms state-of-the-art techniques.
arXiv Detail & Related papers (2024-08-18T14:24:05Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to perform well only on similar data, while underperforming on real-world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixing up.
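A minimal sketch of that interpolation step; the function names and the smoothing coefficient `alpha` are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def se_soft_labels(logits, labels, num_classes, alpha=0.1):
    # Instance-specific label smoothing (illustrative): blend the model's own
    # predicted distribution with the one-hot label of each sample.
    one_hot = F.one_hot(labels, num_classes).float()
    probs = logits.softmax(dim=-1).detach()
    return (1.0 - alpha) * one_hot + alpha * probs

def mixup(x, soft_y, lam=0.7):
    # Standard mixup applied to inputs and the soft labels produced above.
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * soft_y + (1.0 - lam) * soft_y[perm]
    return x_mix, y_mix
```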
arXiv Detail & Related papers (2023-05-22T23:43:23Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
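As a rough illustration of the idea (a simplifying assumption, not the paper's actual objective): treat the softmax inputs as latent samples and regularize them toward a chosen prior, here a standard Gaussian summarized by batch statistics.

```python
import torch
import torch.nn.functional as F

def variational_classification_loss(z, logits, labels, beta=0.1):
    # Cross-entropy on the usual softmax outputs.
    ce = F.cross_entropy(logits, labels)
    # Assumption: summarize the latent z by batch mean/variance and penalize
    # KL(N(mu, var) || N(0, I)) to induce the chosen (Gaussian) latent prior.
    mu, var = z.mean(dim=0), z.var(dim=0) + 1e-6
    kl = 0.5 * (var + mu.pow(2) - 1.0 - var.log()).sum()
    return ce + beta * kl
```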
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks [31.67508478764597]
We propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME).
Our consistency loss significantly improves uncertainty estimates and allows higher quality pseudo-labels to be assigned greater importance under heteroscedastic regression.
Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels.
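For background, the standard heteroscedastic regression loss that this line of work builds on; applying it to pseudo-labels (as sketched in the comment) is an illustrative reading of the summary, not UCVME's exact formulation.

```python
import torch

def heteroscedastic_nll(y_pred, log_var, y_target):
    # Classic aleatoric-uncertainty regression loss: residuals are
    # down-weighted where the predicted variance is high.
    return (0.5 * torch.exp(-log_var) * (y_pred - y_target) ** 2
            + 0.5 * log_var).mean()

# Applied to pseudo-labels, the same weighting assigns low-uncertainty
# pseudo-labels greater importance, echoing the summary above.
```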
arXiv Detail & Related papers (2023-02-15T10:40:51Z)
- Entropy optimized semi-supervised decomposed vector-quantized variational autoencoder model based on transfer learning for multiclass text classification and generation [3.9318191265352196]
We propose a semi-supervised discrete latent variable model for multi-class text classification and text generation.
The proposed model employs the concept of transfer learning for training a quantized transformer model.
Experimental results indicate that the proposed model surpasses state-of-the-art models by a remarkable margin.
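For context, a generic vector-quantization step of the kind such models rely on; this is the standard VQ-VAE bottleneck with a straight-through estimator, not this paper's specific architecture.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    # Generic VQ-VAE quantization (codebook/commitment losses omitted for brevity).
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):
        # Pick the nearest codebook entry for each input vector.
        dists = torch.cdist(z, self.codebook.weight)   # (batch, num_codes)
        idx = dists.argmin(dim=-1)
        zq = self.codebook(idx)
        # Straight-through estimator: copy gradients from zq back to z.
        zq = z + (zq - z).detach()
        return zq, idx
```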
arXiv Detail & Related papers (2021-11-10T07:07:54Z)
- Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
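For reference, the standard pinball loss used to fit conditional quantile models, plus one simple aggregation rule; the convex combination shown is just one possible strategy, not the paper's full method.

```python
import torch

def pinball_loss(pred, target, q):
    # Standard quantile (pinball) loss for quantile level q in (0, 1).
    diff = target - pred
    return torch.maximum(q * diff, (q - 1.0) * diff).mean()

def aggregate_quantiles(preds, weights):
    # One simple aggregation rule: a convex combination of the member models'
    # conditional quantile predictions (assumes weights sum to 1).
    return sum(w * p for w, p in zip(weights, preds))
```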
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
- Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models that is effective and stable in the large sample spaces encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
arXiv Detail & Related papers (2020-07-12T15:31:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.