SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for
Multi-label Image Classification
- URL: http://arxiv.org/abs/2311.15200v1
- Date: Sun, 26 Nov 2023 05:45:27 GMT
- Title: SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for
Multi-label Image Classification
- Authors: Lei Wang and Yibing Zhan and Leilei Ma and Dapeng Tao and Liang Ding
and Chen Gong
- Abstract summary: We introduce a simple but effective augmentation strategy for multi-label image classification, namely SpliceMix.
The "splice" in our method is two-fold: 1) Each mixed image is a splice of several downsampled images in the form of a grid, where the semantics of the images participating in mixing are blended without object deficiency, alleviating co-occurrence bias; 2) We splice mixed images and the original mini-batch to form a new SpliceMixed mini-batch, which allows images at different scales to contribute to training jointly.
- Score: 46.8141860303439
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Mix-style data augmentation methods (e.g., Mixup and CutMix) have
shown promising performance in various visual tasks. However, these methods are
primarily designed for single-label images, ignoring the considerable
discrepancies between single- and multi-label images, i.e., a multi-label image
involves multiple co-occurring categories and widely varying object scales. On the other
hand, previous multi-label image classification (MLIC) methods tend to design
elaborate models, bringing expensive computation. In this paper, we introduce a
simple but effective augmentation strategy for multi-label image
classification, namely SpliceMix. The "splice" in our method is two-fold: 1)
Each mixed image is a splice of several downsampled images in the form of a
grid, where the semantics of the images participating in mixing are blended
without object deficiency, alleviating co-occurrence bias; 2) We splice mixed
images and the original mini-batch to form a new SpliceMixed mini-batch, which
allows images at different scales to contribute to training jointly. Furthermore,
such splicing in our SpliceMixed mini-batch enables interactions between mixed
images and original regular images. We also offer a simple and non-parametric
extension based on consistency learning (SpliceMix-CL) to show the flexible
extensibility of our SpliceMix. Extensive experiments on various tasks
demonstrate that only using SpliceMix with a baseline model (e.g., ResNet)
achieves better performance than state-of-the-art methods. Moreover, the
generalizability of our SpliceMix is further validated by the improvements in
current MLIC methods when combined with our SpliceMix. The code is available at
https://github.com/zuiran/SpliceMix.
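The two-fold splicing described in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the authors' released code: the function name `splicemix_batch`, the NumPy `(B, H, W, C)` layout, average-pooling downsampling, and producing a single mixed image per call are all assumptions made for the sketch.

```python
import numpy as np

def splicemix_batch(images, labels, grid=(2, 2), rng=None):
    """Hypothetical sketch of SpliceMix-style grid mixing.

    images: (B, H, W, C) float array; labels: (B, K) multi-hot array.
    Returns the SpliceMixed mini-batch: the original images spliced
    together with one grid-mixed image whose label is the union
    (element-wise max) of the labels of the images it contains.
    """
    rng = np.random.default_rng() if rng is None else rng
    B, H, W, C = images.shape
    gr, gc = grid
    n_cells = gr * gc
    assert H % gr == 0 and W % gc == 0, "image size must divide the grid"

    # Sample which images participate in the mix (one per grid cell).
    idx = rng.choice(B, size=n_cells, replace=False)

    # Downsample a full image to one grid cell via average pooling,
    # so whole objects are kept rather than cropped out (unlike CutMix).
    ch, cw = H // gr, W // gc
    def downsample(img):
        return img.reshape(ch, gr, cw, gc, C).mean(axis=(1, 3))

    mixed = np.zeros((H, W, C), dtype=images.dtype)
    mixed_label = np.zeros(labels.shape[1], dtype=labels.dtype)
    for k, i in enumerate(idx):
        r, c = divmod(k, gc)
        mixed[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw] = downsample(images[i])
        mixed_label = np.maximum(mixed_label, labels[i])  # union of labels

    # Splice the mixed image onto the original mini-batch, so regular
    # and grid-mixed (smaller-scale) images are trained on together.
    out_images = np.concatenate([images, mixed[None]], axis=0)
    out_labels = np.concatenate([labels, mixed_label[None]], axis=0)
    return out_images, out_labels
```

Because every image is shrunk into its cell rather than cropped, no object is lost, and the mixed label is an exact union rather than an interpolated soft label.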
Related papers
- SUMix: Mixup with Semantic and Uncertain Information [41.99721365685618]
Mixup data augmentation approaches have been applied for various tasks of deep learning.
We propose a novel approach named SUMix to learn the mixing ratio as well as the uncertainty for the mixed samples during the training process.
arXiv Detail & Related papers (2024-07-10T16:25:26Z) - SMMix: Self-Motivated Image Mixing for Vision Transformers [65.809376136455]
CutMix is a vital augmentation strategy for the performance and generalization ability of vision transformers (ViTs), but its mixed images and corresponding labels can be inconsistent.
Existing CutMix variants tackle this problem by generating more consistent mixed images or more precise mixed labels.
We propose an efficient and effective Self-Motivated image Mixing method (SMMix) which motivates both image and label enhancement by the model under training itself.
arXiv Detail & Related papers (2022-12-26T00:19:39Z) - CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping [97.05377757299672]
We present a simple method, CropMix, for producing a rich input distribution from the original dataset distribution.
CropMix can be seamlessly applied to virtually any training recipe and neural network architecture performing classification tasks.
We show that CropMix is of benefit to both contrastive learning and masked image modeling towards more powerful representations.
arXiv Detail & Related papers (2022-05-31T16:57:28Z) - AlignMix: Improving representation by interpolating aligned features [19.465347590276114]
We introduce AlignMix, where we geometrically align two images in the feature space.
We show that an autoencoder can still improve representation learning under mixup, without ever seeing the decoded images.
arXiv Detail & Related papers (2021-03-29T07:03:18Z) - MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks [97.08677678499075]
We introduce MixMo, a new framework for learning multi-input multi-output deep subnetworks.
We show that binary mixing in features - particularly with patches from CutMix - enhances results by making subnetworks stronger and more diverse.
In addition to being easy to implement and adding no cost at inference, our models outperform much costlier data augmented deep ensembles.
arXiv Detail & Related papers (2021-03-10T15:31:02Z) - ResizeMix: Mixing Data with Preserved Object Information and True Labels [57.00554495298033]
We study the importance of saliency information for mixing data, and find that it is not strictly necessary for improving augmentation performance.
We propose a more effective but very easily implemented method, namely ResizeMix.
arXiv Detail & Related papers (2020-12-21T03:43:13Z) - SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained
Data [124.95585891086894]
The proposed method is called Semantically Proportional Mixing (SnapMix).
It exploits class activation map (CAM) to lessen the label noise in augmenting fine-grained data.
Our method consistently outperforms existing mixing-based approaches.
arXiv Detail & Related papers (2020-12-09T03:37:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.