Attentive CutMix: An Enhanced Data Augmentation Approach for Deep
Learning Based Image Classification
- URL: http://arxiv.org/abs/2003.13048v2
- Date: Sun, 5 Apr 2020 13:35:20 GMT
- Authors: Devesh Walawalkar, Zhiqiang Shen, Zechun Liu, Marios Savvides
- Abstract summary: We propose Attentive CutMix, a naturally enhanced augmentation strategy based on CutMix.
In each training iteration, we choose the most descriptive regions based on the intermediate attention maps from a feature extractor.
Our proposed method is simple yet effective, easy to implement and can boost the baseline significantly.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural networks (CNNs) are capable of learning robust
representations with different regularization methods and activations, as
convolutional layers are spatially correlated. Based on this property, a large
variety of regional dropout strategies have been proposed, such as Cutout,
DropBlock, and CutMix. These methods encourage the network to generalize
better by partially occluding the discriminative parts of objects. However, all
of them perform this operation randomly, without capturing the most important
region(s) within an object. In this paper, we propose Attentive CutMix, a
naturally enhanced augmentation strategy based on CutMix. In each training
iteration, we choose the most descriptive regions based on the intermediate
attention maps from a feature extractor, which enables searching for the most
discriminative parts in an image. Our proposed method is simple yet effective,
easy to implement and can boost the baseline significantly. Extensive
experiments on CIFAR-10/100, ImageNet datasets with various CNN architectures
(in a unified setting) demonstrate the effectiveness of our proposed method,
which consistently outperforms the baseline CutMix and other methods by a
significant margin.
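The attention-guided patch replacement described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the attention maps are passed in directly (the paper derives them from the intermediate feature maps of a pretrained feature extractor), and the grid size and top-k patch count are illustrative defaults.

```python
import numpy as np

def attentive_cutmix(images, labels, attn_maps, num_classes, top_k=6, grid=7):
    """Sketch of Attentive CutMix: paste the top-k most-attended grid
    patches of a donor image onto each target image, and mix the labels
    by the replaced-area fraction.

    images: (B, H, W, C) array; attn_maps: (B, grid, grid) attention
    scores (supplied here for illustration)."""
    b, h, w, _ = images.shape
    ph, pw = h // grid, w // grid
    perm = np.random.permutation(b)          # pair each image with a donor
    mixed = images.copy()
    onehot = np.eye(num_classes)[labels].astype(float)
    lam = top_k / (grid * grid)              # fraction of area replaced
    for i in range(b):
        donor = images[perm[i]]
        # indices of the donor's top-k attention cells
        flat = attn_maps[perm[i]].reshape(-1)
        for cell in np.argsort(flat)[-top_k:]:
            r, c = divmod(int(cell), grid)
            mixed[i, r * ph:(r + 1) * ph, c * pw:(c + 1) * pw] = \
                donor[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw]
    mixed_labels = (1 - lam) * onehot + lam * onehot[perm]
    return mixed, mixed_labels
```

In a training loop, `mixed` would be fed to the network and the loss computed against the soft `mixed_labels`, exactly as in standard CutMix; the only change is that patch locations come from attention rather than a random box.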
Related papers
- NubbleDrop: A Simple Way to Improve Matching Strategy for Prompted One-Shot Segmentation [2.2559617939136505]
We propose a simple and training-free method to enhance the validity and robustness of the matching strategy.
The core concept involves randomly dropping feature channels (setting them to zero) during the matching process.
This technique mimics discarding pathological nubbles, and it can be seamlessly applied to other similarity computing scenarios.
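The channel-dropping idea in the summary above can be sketched in a few lines. The function name, signature, and drop probability below are illustrative assumptions, not taken from the paper's code; the core operation is simply zeroing a random subset of feature channels before computing similarity.

```python
import numpy as np

def nubble_drop_similarity(feat_a, feat_b, drop_prob=0.3, rng=None):
    """Sketch of random channel dropping for a matching step: zero the
    same random subset of channels in both descriptors, then compute
    cosine similarity on the surviving channels."""
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(feat_a.shape[-1]) >= drop_prob   # channel keep-mask
    a, b = feat_a * keep, feat_b * keep
    denom = np.linalg.norm(a) * np.linalg.norm(b) + 1e-8
    return float(a @ b) / denom
```

Because the method only masks channels at similarity time, it is training-free and can be dropped into any descriptor-matching pipeline.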
arXiv Detail & Related papers (2024-05-19T08:00:38Z)
- Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Mask image model (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
- Interpretable Small Training Set Image Segmentation Network Originated from Multi-Grid Variational Model [5.283735137946097]
Deep learning (DL) methods have been proposed and widely used for image segmentation.
DL methods usually require a large amount of manually segmented data as training data and suffer from poor interpretability.
In this paper, we replace the hand-crafted regularity term in the Mumford-Shah (MS) model with a data-adaptive, generalized learnable regularity term.
arXiv Detail & Related papers (2023-06-25T02:34:34Z)
- Decoupled Mixup for Generalized Visual Recognition [71.13734761715472]
We propose a novel "Decoupled-Mixup" method to train CNN models for visual recognition.
Our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions to train CNN models.
Experiment results show the high generalization performance of our method on testing data that are composed of unseen contexts.
arXiv Detail & Related papers (2022-10-26T15:21:39Z)
- Adaptive Convolutional Dictionary Network for CT Metal Artifact Reduction [62.691996239590125]
We propose an adaptive convolutional dictionary network (ACDNet) for metal artifact reduction.
Our ACDNet can automatically learn the prior for artifact-free CT images via training data and adaptively adjust the representation kernels for each input CT image.
Our method inherits the clear interpretability of model-based methods and maintains the powerful representation ability of learning-based methods.
arXiv Detail & Related papers (2022-05-16T06:49:36Z)
- New SAR target recognition based on YOLO and very deep multi-canonical correlation analysis [0.1503974529275767]
This paper proposes a robust feature extraction method for SAR image target classification by adaptively fusing effective features from different CNN layers.
Experiments on the MSTAR dataset demonstrate that the proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-28T18:10:26Z)
- Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup [19.680580983094323]
Puzzle Mix is a mixup method for explicitly utilizing the saliency information and the underlying statistics of the natural examples.
Our experiments show that Puzzle Mix achieves state-of-the-art generalization and adversarial robustness results.
arXiv Detail & Related papers (2020-09-15T10:10:23Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Deep Unfolding Network for Image Super-Resolution [159.50726840791697]
This paper proposes an end-to-end trainable unfolding network which leverages both learning-based methods and model-based methods.
The proposed network inherits the flexibility of model-based methods to super-resolve blurry, noisy images for different scale factors via a single model.
arXiv Detail & Related papers (2020-03-23T17:55:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.