Mixup Without Hesitation
- URL: http://arxiv.org/abs/2101.04342v1
- Date: Tue, 12 Jan 2021 08:11:08 GMT
- Title: Mixup Without Hesitation
- Authors: Hao Yu, Huanyu Wang, Jianxin Wu
- Abstract summary: We propose mixup Without hesitation (mWh), a concise, effective, and easy-to-use training algorithm.
mWh strikes a good balance between exploration and exploitation by gradually replacing mixup with basic data augmentation.
Our code is open-source and available at https://github.com/yuhao318/mwh.
- Score: 38.801366276601414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mixup linearly interpolates pairs of examples to form new samples, which is
easy to implement and has been shown to be effective in image classification
tasks. However, there are two drawbacks in mixup: one is that more training
epochs are needed to obtain a well-trained model; the other is that mixup
requires tuning a hyper-parameter to gain appropriate capacity, which is a
difficult task. In this paper, we find that mixup constantly explores the
representation space, and inspired by the exploration-exploitation dilemma in
reinforcement learning, we propose mixup Without hesitation (mWh), a concise,
effective, and easy-to-use training algorithm. We show that mWh strikes a good
balance between exploration and exploitation by gradually replacing mixup with
basic data augmentation. It can achieve a strong baseline with less training
time than the original mixup and without searching for the optimal hyper-parameter,
i.e., mWh acts as mixup without hesitation. mWh can also transfer to CutMix,
and gain consistent improvement on other machine learning and computer vision
tasks such as object detection. Our code is open-source and available at
https://github.com/yuhao318/mwh
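As a rough, hedged illustration of the abstract, the sketch below implements standard mixup interpolation and a hypothetical mWh-style schedule in which the probability of applying mixup decays over training, so mixup is gradually replaced by basic data augmentation; the linear decay and all hyper-parameter values are assumptions for illustration, not the authors' exact algorithm.
```python
import numpy as np
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=1.0):
    """Standard mixup: linearly interpolate a batch with a shuffled copy of itself."""
    lam = np.random.beta(alpha, alpha)
    index = torch.randperm(x.size(0))
    return lam * x + (1.0 - lam) * x[index], y, y[index], lam

def mwh_style_batch(x, y, epoch, total_epochs, alpha=1.0):
    """Hypothetical mWh-style schedule (illustration only): apply mixup with a
    probability that decays over training, so mixup is gradually replaced by
    the basic data augmentation already applied to the batch."""
    p_mixup = 1.0 - epoch / total_epochs          # assumed linear decay from 1 to 0
    if np.random.rand() < p_mixup:
        return mixup_batch(x, y, alpha)           # exploration: mixed samples
    return x, y, y, 1.0                           # exploitation: unmixed batch

def mixed_loss(logits, y_a, y_b, lam):
    """Mixup loss: interpolate the cross-entropy against the two original labels."""
    return lam * F.cross_entropy(logits, y_a) + (1.0 - lam) * F.cross_entropy(logits, y_b)
```
A training loop would call mwh_style_batch once per iteration and feed the returned labels and lam into mixed_loss; when mixup is skipped, lam is 1 and the loss reduces to ordinary cross-entropy.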
Related papers
- When does mixup promote local linearity in learned representations? [61.079020647847024]
We study the role of Mixup in promoting linearity in the learned network representations.
We investigate these properties of Mixup on vision datasets such as CIFAR-10, CIFAR-100 and SVHN.
arXiv Detail & Related papers (2022-10-28T21:27:33Z)
- Harnessing Hard Mixed Samples with Decoupled Regularizer [69.98746081734441]
Mixup is an efficient data augmentation approach that improves the generalization of neural networks by smoothing the decision boundary with mixed data.
In this paper, we propose an efficient mixup objective function with a decoupled regularizer, named Decoupled Mixup (DM).
DM can adaptively utilize hard mixed samples to mine discriminative features without losing the original smoothness of mixup.
arXiv Detail & Related papers (2022-03-21T07:12:18Z)
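The exact DM objective is not given in the summary above; the sketch below is only one hedged reading of a "decoupled regularizer": the usual lam-weighted mixup cross-entropy plus an extra, unweighted term on both source labels, so hard mixed samples still contribute gradient for each class. The extra term and its weight are assumptions, not the paper's formulation.
```python
import torch
import torch.nn.functional as F

def decoupled_mixup_loss(logits, y_a, y_b, lam, reg_weight=0.1):
    """Hypothetical decoupled-regularizer sketch (not the paper's exact DM loss):
    the standard mixup term keeps the original smoothness, while the decoupled
    term pushes the model toward both source labels regardless of lam."""
    mixup_term = lam * F.cross_entropy(logits, y_a) + (1.0 - lam) * F.cross_entropy(logits, y_b)
    decoupled_term = F.cross_entropy(logits, y_a) + F.cross_entropy(logits, y_b)
    return mixup_term + reg_weight * decoupled_term
```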
- SMILE: Self-Distilled MIxup for Efficient Transfer LEarning [42.59451803498095]
In this work, we propose SMILE - Self-Distilled Mixup for EffIcient Transfer LEarning.
With mixed images as inputs, SMILE regularizes the outputs of CNN feature extractors to learn from the mixed feature vectors of inputs.
The triple regularizer balances the mixup effects in both feature and label spaces while bounding the linearity in-between samples for pre-training tasks.
arXiv Detail & Related papers (2021-03-25T16:02:21Z)
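Based only on the summary above, the sketch below shows one hedged way to regularize a CNN feature extractor so that the feature of a mixed image stays close to the mixture of the two original features; the mean-squared penalty is an assumption, and SMILE's triple regularizer also involves the label space, which is not shown here.
```python
import torch
import torch.nn.functional as F

def feature_mixup_regularizer(encoder, x_a, x_b, lam):
    """Hypothetical SMILE-style feature-space term (illustration only):
    encourage encoder(mixed image) ~= lam * encoder(x_a) + (1 - lam) * encoder(x_b)."""
    mixed_x = lam * x_a + (1.0 - lam) * x_b
    with torch.no_grad():                       # treat the original features as fixed targets
        target = lam * encoder(x_a) + (1.0 - lam) * encoder(x_b)
    return F.mse_loss(encoder(mixed_x), target)
```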
- AutoMix: Unveiling the Power of Mixup [34.623943038648164]
We present a flexible, general Automatic Mixup framework which utilizes discriminative features to learn a sample mixing policy adaptively.
We regard mixup as a pretext task and split it into two sub-problems: mixed samples generation and mixup classification.
Experiments on six popular classification benchmarks show that AutoMix consistently outperforms other leading mixup methods.
arXiv Detail & Related papers (2021-03-24T07:21:53Z)
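The summary above only says that AutoMix learns a sample mixing policy from discriminative features and splits mixup into mixed-sample generation and mixup classification; the tiny mask generator below is a hypothetical sketch of the generation half, not the paper's architecture: it maps the two feature maps and lam to a per-pixel mixing mask that is then upsampled onto the images.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixPolicySketch(nn.Module):
    """Hypothetical mixing-policy head (illustration only): predicts a soft
    per-pixel mask from the concatenated feature maps of the two images."""
    def __init__(self, feat_channels):
        super().__init__()
        self.head = nn.Conv2d(2 * feat_channels + 1, 1, kernel_size=1)

    def forward(self, feat_a, feat_b, lam):
        lam_map = torch.full_like(feat_a[:, :1], lam)    # broadcast lam as an extra channel
        mask = torch.sigmoid(self.head(torch.cat([feat_a, feat_b, lam_map], dim=1)))
        return mask                                       # (N, 1, h, w), values in [0, 1]

def generate_mixed_images(x_a, x_b, mask):
    """Upsample the mask to image resolution and mix the two inputs with it."""
    mask = F.interpolate(mask, size=x_a.shape[-2:], mode="bilinear", align_corners=False)
    return mask * x_a + (1.0 - mask) * x_b
```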
- MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks [97.08677678499075]
We introduce MixMo, a new framework for learning multi-input multi-output deep subnetworks.
We show that binary mixing in features - particularly with patches from CutMix - enhances results by making subnetworks stronger and more diverse.
In addition to being easy to implement and adding no cost at inference, our models outperform much costlier data augmented deep ensembles.
arXiv Detail & Related papers (2021-03-10T15:31:02Z)
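As a hedged sketch of the idea in the summary above, the code below mixes the feature maps of two inputs with a binary CutMix-style patch mask inside a shared network that has one classification head per input; the layer split, mask sampling, and head design are assumptions, not the exact MixMo framework.
```python
import numpy as np
import torch

def binary_patch_mask(height, width, lam):
    """CutMix-style binary mask: a rectangle covering roughly (1 - lam) of the area."""
    cut_h, cut_w = int(height * np.sqrt(1.0 - lam)), int(width * np.sqrt(1.0 - lam))
    top = np.random.randint(0, height - cut_h + 1)
    left = np.random.randint(0, width - cut_w + 1)
    mask = torch.ones(1, 1, height, width)
    mask[..., top:top + cut_h, left:left + cut_w] = 0.0
    return mask

def mixmo_style_forward(encoder, core, head_a, head_b, x_a, x_b, lam):
    """Hypothetical MixMo-style pass (illustration only): encode both inputs,
    mix their feature maps with a binary patch mask, then decode two outputs."""
    feat_a, feat_b = encoder(x_a), encoder(x_b)
    mask = binary_patch_mask(feat_a.shape[-2], feat_a.shape[-1], lam).to(feat_a.device)
    shared = core(mask * feat_a + (1.0 - mask) * feat_b)
    return head_a(shared), head_b(shared)        # one prediction per input/label
```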
- Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks [75.69896269357005]
Mixup is a recent data augmentation technique that linearly interpolates input examples and the corresponding labels.
In this paper, we explore how to apply mixup to natural language processing tasks.
We incorporate mixup into a transformer-based pre-trained architecture, named "mixup-transformer", for a wide range of NLP tasks.
arXiv Detail & Related papers (2020-10-05T23:37:30Z)
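The summary above says mixup is applied to input examples and labels within a transformer-based pre-trained model; the sketch below interpolates fixed-size sentence representations taken from the encoder together with the labels. The choice of interpolating pooled representations rather than raw tokens is an assumption for illustration.
```python
import torch
import torch.nn.functional as F

def mixup_transformer_loss(encoder, classifier, x_a, x_b, y_a, y_b, lam):
    """Hypothetical mixup-transformer step (illustration only): interpolate the
    pooled sentence representations of two examples and their labels, then
    classify the mixed representation."""
    rep_a, rep_b = encoder(x_a), encoder(x_b)           # (N, hidden) pooled representations
    logits = classifier(lam * rep_a + (1.0 - lam) * rep_b)
    return lam * F.cross_entropy(logits, y_a) + (1.0 - lam) * F.cross_entropy(logits, y_b)
```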
- Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup [19.680580983094323]
Puzzle Mix is a mixup method for explicitly utilizing the saliency information and the underlying statistics of the natural examples.
Our experiments show Puzzle Mix achieves state-of-the-art generalization and adversarial robustness results.
arXiv Detail & Related papers (2020-09-15T10:10:23Z)
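Puzzle Mix, per the summary above, mixes images using saliency and local statistics via an optimization procedure; the sketch below is a much-simplified, hedged stand-in that only captures the saliency part: given precomputed saliency maps, it builds a block-wise mask that keeps the more salient blocks of each image, without the transport and optimization machinery of the actual method.
```python
import torch
import torch.nn.functional as F

def saliency_block_mask(sal_a, sal_b, block=4):
    """Block-wise mask (illustration only): for each block, keep image A where its
    precomputed saliency beats image B's, otherwise keep image B.
    sal_a, sal_b are (N, 1, H, W) with H and W divisible by block."""
    pooled_a = F.avg_pool2d(sal_a, block)
    pooled_b = F.avg_pool2d(sal_b, block)
    mask = (pooled_a >= pooled_b).float()
    return F.interpolate(mask, scale_factor=block, mode="nearest")

def simplified_saliency_mix(x_a, x_b, sal_a, sal_b, block=4):
    """Hypothetical, simplified saliency-guided mixing (not the actual Puzzle Mix
    optimization): combine the two images with the block-wise saliency mask."""
    mask = saliency_block_mask(sal_a, sal_b, block)
    return mask * x_a + (1.0 - mask) * x_b
```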
- XMixup: Efficient Transfer Learning with Auxiliary Samples by Cross-domain Mixup [60.07531696857743]
Cross-domain Mixup (XMixup) improves the multitask paradigm for deep transfer learning.
XMixup selects the auxiliary samples from the source dataset and augments training samples via the simple mixup strategy.
Experiment results show that XMixup improves the accuracy by 1.9% on average.
arXiv Detail & Related papers (2020-07-20T16:42:29Z)
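Based on the summary above, XMixup augments target-task training by mixing each target sample with an auxiliary sample drawn from the source dataset; the sketch below mixes a target batch with a batch of already selected auxiliary source images and interpolates the losses of a target head and a source head. How the auxiliary samples are selected per class, and the two-head setup, are assumptions for illustration.
```python
import numpy as np
import torch
import torch.nn.functional as F

def xmixup_style_loss(backbone, target_head, source_head,
                      x_target, y_target, x_aux, y_aux, alpha=1.0):
    """Hypothetical XMixup-style step (illustration only): mix target images with
    auxiliary source images and interpolate the two tasks' classification losses."""
    lam = np.random.beta(alpha, alpha)
    features = backbone(lam * x_target + (1.0 - lam) * x_aux)
    return (lam * F.cross_entropy(target_head(features), y_target)
            + (1.0 - lam) * F.cross_entropy(source_head(features), y_aux))
```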
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.