X-model: Improving Data Efficiency in Deep Learning with A Minimax Model
- URL: http://arxiv.org/abs/2110.04572v1
- Date: Sat, 9 Oct 2021 13:56:48 GMT
- Title: X-model: Improving Data Efficiency in Deep Learning with A Minimax Model
- Authors: Ximei Wang, Xinyang Chen, Jianmin Wang, Mingsheng Long
- Abstract summary: We aim at improving data efficiency for both classification and regression setups in deep learning.
To combine the strengths of both worlds, we propose a novel X-model.
The X-model plays a minimax game between the feature extractor and task-specific heads.
- Score: 78.55482897452417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To mitigate the burden of data labeling, we aim at improving data efficiency
for both classification and regression setups in deep learning. However, the
current focus is on classification problems, while little attention has been paid
to deep regression, which usually requires more human effort for labeling.
Further, due to the intrinsic difference between categorical and continuous
label spaces, common intuitions for classification, e.g., cluster
assumptions or pseudo-labeling strategies, cannot be naturally adapted to
deep regression. To this end, we first delved into the existing data-efficient
methods in deep learning and found that they either encourage invariance to
data stochasticity (e.g., consistency regularization under different
augmentations) or model stochasticity (e.g., difference penalty for predictions
of models with different dropout). To combine the strengths of both worlds, we propose
a novel X-model by simultaneously encouraging invariance to data
stochasticity and model stochasticity. Moreover, the X-model plays a minimax
game between the feature extractor and task-specific heads to further enhance
the invariance to model stochasticity. Extensive experiments verify the
superiority of the X-model across various tasks: a single-value prediction
task (age estimation), a dense-value prediction task (keypoint localization)
on both a 2D synthetic and a 3D realistic dataset, and a multi-category
object recognition task.
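As a concrete illustration of the two invariance terms, here is a minimal PyTorch sketch, assuming a two-head architecture with different dropout rates and a gradient-reversal layer for the minimax game; the module sizes, loss weighting, and function names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; sign-flipped gradient in the
    backward pass, which turns a shared loss into a minimax game."""
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -grad

class XModel(nn.Module):
    def __init__(self, in_dim=128, feat_dim=64, out_dim=1):
        super().__init__()
        self.extractor = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        # Two task heads with different dropout rates: one source of
        # model stochasticity.
        self.head_a = nn.Sequential(nn.Dropout(0.1), nn.Linear(feat_dim, out_dim))
        self.head_b = nn.Sequential(nn.Dropout(0.5), nn.Linear(feat_dim, out_dim))

    def forward(self, x):
        feat = self.extractor(x)
        return self.head_a(feat), self.head_b(feat), feat

def invariance_loss(model, x_view1, x_view2):
    ya1, yb1, feat = model(x_view1)
    ya2, yb2, _ = model(x_view2)
    # Invariance to data stochasticity: predictions should agree across
    # two random augmentations of the same inputs.
    data_term = F.mse_loss((ya1 + yb1) / 2, (ya2 + yb2) / 2)
    # Invariance to model stochasticity as a minimax game: through the
    # reversal layer, minimizing -disagreement drives the heads to
    # maximize their disagreement while the extractor minimizes it.
    rev = GradReverse.apply(feat)
    disagreement = F.mse_loss(model.head_a(rev), model.head_b(rev))
    return data_term - disagreement

# Toy usage with two augmented views of the same (random) batch.
model = XModel()
x1, x2 = torch.randn(8, 128), torch.randn(8, 128)
invariance_loss(model, x1, x2).backward()
```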
Related papers
- Efficient and Generalizable Certified Unlearning: A Hessian-free Recollection Approach [8.875278412741695]
Machine unlearning strives to uphold the data owners' right to be forgotten by enabling models to selectively forget specific data.
We develop an algorithm that achieves near-instantaneous unlearning as it only requires a vector addition operation.
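A loose schematic of unlearning-by-vector-addition under stated assumptions: the per-sample "recollection" vectors below use a crude gradient-based placeholder, not the paper's Hessian-free construction.

```python
import numpy as np

def precompute_recollection(per_sample_grads, lr, num_steps):
    # Crude stand-in: approximate each sample's cumulative contribution
    # to training as lr * steps * its average gradient.
    return {i: lr * num_steps * g for i, g in per_sample_grads.items()}

def unlearn(theta, recollection, sample_id):
    # Forgetting one sample is a single O(d) vector addition; no retraining.
    return theta + recollection[sample_id]

theta = np.zeros(4)
rec = precompute_recollection({0: np.array([0.1, -0.2, 0.0, 0.3])},
                              lr=0.01, num_steps=100)
theta_after = unlearn(theta, rec, sample_id=0)
```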
arXiv Detail & Related papers (2024-04-02T07:54:18Z)
- For Better or For Worse? Learning Minimum Variance Features With Label Augmentation [7.183341902583164]
In this work, we analyze the role played by the label augmentation aspect of data augmentation methods.
We first prove that linear models on binary classification data trained with label augmentation learn only the minimum variance features in the data.
We then use our techniques to show that even for nonlinear models and general data distributions, the label smoothing and Mixup losses are lower bounded by a function of the model output variance.
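For reference, a minimal sketch of the two label-augmentation losses the analysis covers, in their standard formulations (not the paper's variance bounds):

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, targets, eps=0.1):
    # Cross-entropy against a smoothed target distribution.
    n = logits.size(1)
    logp = F.log_softmax(logits, dim=1)
    smooth = torch.full_like(logp, eps / n)
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - eps + eps / n)
    return -(smooth * logp).sum(dim=1).mean()

def mixup_batch(x, y_onehot, alpha=0.2):
    # Convex combination of a batch with a shuffled copy of itself,
    # applied to inputs and labels alike.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return (lam * x + (1 - lam) * x[perm],
            lam * y_onehot + (1 - lam) * y_onehot[perm])
```

Recent PyTorch releases also expose the first loss directly as F.cross_entropy(logits, targets, label_smoothing=eps).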
arXiv Detail & Related papers (2024-02-10T01:36:39Z)
- Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z)
- Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks [75.42002070547267]
We propose a self-evolution learning (SE) based mixup approach for data augmentation in text classification.
We introduce a novel instance-specific label smoothing approach, which linearly interpolates the model's output and the one-hot labels of the original samples to generate new soft labels for mixing up.
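A minimal sketch of that instance-specific smoothing, assuming a mixing weight psi (a hypothetical hyperparameter name): each sample's soft target interpolates its one-hot label with the model's own prediction before the usual mixup step.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def instance_specific_soft_labels(model, x, targets, num_classes, psi=0.5):
    # Interpolate the one-hot label with the model's current prediction,
    # giving each instance its own smoothed target for label mixing.
    probs = F.softmax(model(x), dim=1)
    onehot = F.one_hot(targets, num_classes).float()
    return psi * onehot + (1.0 - psi) * probs
```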
arXiv Detail & Related papers (2023-05-22T23:43:23Z)
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution instead of the implicit one assumed by a standard softmax layer.
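A hedged sketch of the idea, treating the softmax layer's input as a sampled latent variable and assuming a Gaussian posterior with a standard-normal prior (our assumption, not necessarily the paper's choice of latent distribution):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalClassifier(nn.Module):
    def __init__(self, in_dim, z_dim, num_classes):
        super().__init__()
        self.mu = nn.Linear(in_dim, z_dim)
        self.logvar = nn.Linear(in_dim, z_dim)
        self.cls = nn.Linear(z_dim, num_classes)

    def forward(self, h):
        # Treat the softmax layer's input z as a latent variable and
        # sample it with the reparameterization trick.
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.cls(z), mu, logvar

def elbo_loss(logits, mu, logvar, targets, beta=1.0):
    nll = F.cross_entropy(logits, targets)  # E_q[-log p(y|z)] term
    # KL(q(z|x) || N(0, I)): pulls z toward the chosen prior.
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
    return nll + beta * kl
```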
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
- Semi-Supervised Deep Regression with Uncertainty Consistency and Variational Model Ensembling via Bayesian Neural Networks [31.67508478764597]
We propose a novel approach to semi-supervised regression, namely Uncertainty-Consistent Variational Model Ensembling (UCVME).
Our consistency loss significantly improves uncertainty estimates and allows higher quality pseudo-labels to be assigned greater importance under heteroscedastic regression.
Experiments show that our method outperforms state-of-the-art alternatives on different tasks and can be competitive with supervised methods that use full labels.
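A minimal sketch of the two ingredients, assuming each ensemble member outputs a per-sample mean and log-variance: the heteroscedastic loss downweights high-variance pseudo-labels, and the consistency term aligns the members' uncertainty estimates.

```python
import torch

def heteroscedastic_nll(mean, logvar, target):
    # Aleatoric-uncertainty regression loss: pseudo-labels with small
    # predicted variance automatically receive greater importance.
    return (0.5 * torch.exp(-logvar) * (target - mean) ** 2
            + 0.5 * logvar).mean()

def uncertainty_consistency(mean_a, logvar_a, mean_b, logvar_b):
    # Align both the predictions and the uncertainty estimates of two
    # ensemble members.
    return (((mean_a - mean_b) ** 2).mean()
            + ((logvar_a - logvar_b) ** 2).mean())
```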
arXiv Detail & Related papers (2023-02-15T10:40:51Z)
- Flexible Model Aggregation for Quantile Regression [92.63075261170302]
Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions.
We investigate methods for aggregating any number of conditional quantile models.
All of the models we consider in this paper can be fit using modern deep learning toolkits.
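For context, a minimal sketch of the pinball loss that fits conditional quantile models, plus the simplest aggregation, a weighted average of several models' quantile predictions (the weighting scheme here is an assumption, not the paper's method):

```python
import torch

def pinball_loss(pred, target, tau):
    # Standard quantile loss: asymmetric penalty around the tau-quantile.
    err = target - pred
    return torch.maximum(tau * err, (tau - 1.0) * err).mean()

def aggregate_quantiles(preds, weights):
    # preds: list of per-model quantile predictions of shape (batch,).
    # The weights could themselves be tuned with the same pinball loss.
    w = torch.tensor(weights)
    return (torch.stack(preds) * w[:, None]).sum(0) / w.sum()
```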
arXiv Detail & Related papers (2021-02-26T23:21:16Z)
- Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation [51.091890311312085]
We propose a new training scheme for auto-regressive sequence generative models, which is effective and stable when operating on the large sample spaces encountered in text generation.
Our method stably outperforms Maximum Likelihood Estimation and other state-of-the-art sequence generative models in terms of both quality and diversity.
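A loose, hedged sketch of density-ratio-weighted likelihood training: a discriminator's real-vs-generated probability yields the ratio p_data/p_model ≈ D/(1-D), which can reweight the per-sequence MLE term; all names below are illustrative, not the paper's exact scheme.

```python
import torch

def ratio_weighted_nll(seq_log_probs, disc_real_probs, eps=1e-6):
    # seq_log_probs: model log-likelihood of each sequence.
    # disc_real_probs: a discriminator's probability that each sequence
    # came from real data; D/(1-D) approximates p_data/p_model.
    ratio = disc_real_probs / (1.0 - disc_real_probs + eps)
    return -(ratio.detach() * seq_log_probs).mean()
```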
arXiv Detail & Related papers (2020-07-12T15:31:24Z)