Use the Detection Transformer as a Data Augmenter
- URL: http://arxiv.org/abs/2304.04554v2
- Date: Wed, 26 Apr 2023 02:44:40 GMT
- Title: Use the Detection Transformer as a Data Augmenter
- Authors: Luping Wang, Bin Liu
- Abstract summary: DeMix builds on CutMix, a simple yet highly effective data augmentation technique.
CutMix improves model performance by cutting and pasting a patch from one image onto another, yielding a new image.
DeMix elaborately selects a semantically rich patch, located by a pre-trained DETR.
- Score: 13.15197086963704
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Detection Transformer (DETR) is a Transformer architecture based object
detection model. In this paper, we demonstrate that it can also be used as a
data augmenter. We term our approach DETR-assisted CutMix, or DeMix for
short. DeMix builds on CutMix, a simple yet highly effective data augmentation
technique that has gained popularity in recent years. CutMix improves model
performance by cutting and pasting a patch from one image onto another,
yielding a new image. The corresponding label for this new example is specified
as the weighted average of the original labels, where the weight is
proportional to the area of the patch. CutMix selects a random patch to be cut.
In contrast, DeMix elaborately selects a semantically rich patch, located by a
pre-trained DETR. The label of the new image is specified in the same way as in
CutMix. Experimental results on benchmark datasets for image classification
demonstrate that DeMix significantly outperforms prior data augmentation
methods, including CutMix. Our code is available at
https://github.com/ZJLAB-AMMI/DeMix.
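The patch-paste and area-weighted label mixing described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `box` argument stands in for the DETR-predicted bounding box in DeMix (in plain CutMix it would be sampled at random), and one-hot label vectors are assumed.

```python
import numpy as np

def cutmix_style(image_a, label_a, image_b, label_b, box):
    """Paste a patch of image_b onto image_a; mix labels by patch area.

    `box` is (x1, y1, x2, y2). In plain CutMix it is random; in DeMix it
    would come from a pre-trained DETR detection (assumed here).
    `label_a` and `label_b` are one-hot vectors.
    """
    x1, y1, x2, y2 = box
    mixed = image_a.copy()
    mixed[y1:y2, x1:x2] = image_b[y1:y2, x1:x2]
    h, w = image_a.shape[:2]
    # weight is proportional to the fraction of area replaced by the patch
    lam = (x2 - x1) * (y2 - y1) / (h * w)
    mixed_label = (1 - lam) * label_a + lam * label_b
    return mixed, mixed_label
```

For example, pasting a 4x4 patch into an 8x8 image gives lam = 16/64 = 0.25, so the mixed label is 0.75 of the base image's label plus 0.25 of the patch's label.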
Related papers
- SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for Multi-label Image Classification [46.8141860303439]
We introduce a simple but effective augmentation strategy for multi-label image classification, namely SpliceMix.
The "splice" in our method is two-fold: 1) Each mixed image is a splice of several downsampled images arranged in a grid, blending the semantics of the mixed images without object deficiencies and alleviating co-occurrence bias; 2) Mixed images are spliced with the original mini-batch to form a new SpliceMixed mini-batch, which lets images at different scales contribute to training together.
arXiv Detail & Related papers (2023-11-26T05:45:27Z)
- MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer [17.012278767127967]
We propose MaskMix and Progressive Attention Labeling in image and label space.
From the perspective of image space, we design MaskMix, which mixes two images based on a patch-like grid mask.
From the perspective of label space, we design PAL, which utilizes a progressive factor to dynamically re-weight the attention weights of the mixed attention label.
arXiv Detail & Related papers (2023-04-24T12:38:09Z)
- SMMix: Self-Motivated Image Mixing for Vision Transformers [65.809376136455]
CutMix is a vital augmentation strategy that shapes the performance and generalization ability of vision transformers (ViTs).
Existing CutMix variants tackle its limitations by generating more consistent mixed images or more precise mixed labels.
We propose an efficient and effective Self-Motivated image Mixing method (SMMix) which motivates both image and label enhancement by the model under training itself.
arXiv Detail & Related papers (2022-12-26T00:19:39Z)
- OAMixer: Object-aware Mixing Layer for Vision Transformers [73.10651373341933]
We propose OAMixer, which calibrates the patch mixing layers of patch-based models based on the object labels.
By learning an object-centric representation, we demonstrate that OAMixer improves the classification accuracy and background robustness of various patch-based models.
arXiv Detail & Related papers (2022-12-13T14:14:48Z)
- CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping [97.05377757299672]
We present a simple method, CropMix, for producing a rich input distribution from the original dataset distribution.
CropMix can be seamlessly applied to virtually any training recipe and neural network architecture performing classification tasks.
We show that CropMix is of benefit to both contrastive learning and masked image modeling towards more powerful representations.
arXiv Detail & Related papers (2022-05-31T16:57:28Z)
- ResizeMix: Mixing Data with Preserved Object Information and True Labels [57.00554495298033]
We study the importance of saliency information for mixing data, and find that it is not necessary for improving augmentation performance.
We propose a more effective but very easily implemented method, namely ResizeMix.
arXiv Detail & Related papers (2020-12-21T03:43:13Z)
- SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data [124.95585891086894]
The proposed method is called Semantically Proportional Mixing (SnapMix).
It exploits class activation map (CAM) to lessen the label noise in augmenting fine-grained data.
Our method consistently outperforms existing mixed-based approaches.
arXiv Detail & Related papers (2020-12-09T03:37:30Z)
- FMix: Enhancing Mixed Sample Data Augmentation [5.820517596386667]
Mixed Sample Data Augmentation (MSDA) has received increasing attention in recent years.
We show that MixUp distorts learned functions in a way that CutMix does not.
We propose FMix, an MSDA that uses random binary masks obtained by applying a threshold to low frequency images.
arXiv Detail & Related papers (2020-02-27T11:46:33Z)
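FMix's masks, described above as random binary masks obtained by thresholding low-frequency images, can be sketched roughly as below. This is an illustrative sketch, not the paper's exact recipe: the `decay` exponent and the thresholding rule (keeping exactly a fraction `lam` of pixels) are assumptions for demonstration.

```python
import numpy as np

def fmix_mask(h, w, lam, decay=3.0, seed=None):
    """Binary mask from thresholded low-frequency noise (FMix-style sketch).

    Sample complex Gaussian noise in the Fourier domain, attenuate high
    frequencies with a power-law decay (assumed exponent), invert the FFT,
    then threshold so a fraction `lam` of pixels equals 1.
    """
    rng = np.random.default_rng(seed)
    # radial frequency grid
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    freq = np.sqrt(fy ** 2 + fx ** 2)
    freq[0, 0] = 1.0 / max(h, w)  # avoid division by zero at the DC term
    spectrum = rng.standard_normal((h, w)) + 1j * rng.standard_normal((h, w))
    spectrum /= freq ** decay  # low-pass: damp high frequencies
    gray = np.real(np.fft.ifft2(spectrum))
    # threshold at the k-th largest value so ~lam * h * w pixels are 1
    k = int(round(lam * h * w))
    if k <= 0:
        return np.zeros((h, w), dtype=np.float32)
    thresh = np.sort(gray.ravel())[::-1][k - 1]
    return (gray >= thresh).astype(np.float32)
```

Because the noise is low-frequency, the resulting 1-regions form large connected blobs rather than scattered pixels, which is what distinguishes such masks from per-pixel random dropout.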
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.