CropMix: Sampling a Rich Input Distribution via Multi-Scale Cropping
- URL: http://arxiv.org/abs/2205.15955v1
- Date: Tue, 31 May 2022 16:57:28 GMT
- Authors: Junlin Han, Lars Petersson, Hongdong Li, Ian Reid
- Abstract summary: We present a simple method, CropMix, for producing a rich input distribution from the original dataset distribution.
CropMix can be seamlessly applied to virtually any training recipe and neural network architecture performing classification tasks.
We show that CropMix is of benefit to both contrastive learning and masked image modeling towards more powerful representations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a simple method, CropMix, for producing a rich
input distribution from the original dataset distribution. Unlike single random
cropping, which may inadvertently capture only limited or irrelevant
information (e.g., pure background or unrelated objects), we crop an image
multiple times using distinct crop scales, thereby ensuring that
multi-scale information is captured. The new input distribution, which serves
as training data for a number of vision tasks, is then formed by simply
mixing the multiple cropped views. We first demonstrate that CropMix can be
seamlessly applied to virtually any training recipe and neural network
architecture performing classification tasks. CropMix is shown to improve the
performance of image classifiers across the board on several benchmark tasks
without sacrificing computational simplicity or efficiency. Moreover, we show
that CropMix benefits both contrastive learning and masked image modeling,
yielding more powerful representations that achieve better results when
transferred to downstream tasks. Code is available at GitHub.
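The core idea — crop one image several times at distinct scales, then mix the views — can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' released code; the crop-scale pair, the nearest-neighbour resize, and the Dirichlet mixing weights are all assumptions:

```python
import numpy as np

def random_crop(img, scale, rng):
    """Crop a region covering roughly `scale` of the image area,
    then resize it back to full size via nearest-neighbour sampling."""
    h, w = img.shape[:2]
    ch = max(1, int(h * scale ** 0.5))
    cw = max(1, int(w * scale ** 0.5))
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    crop = img[y:y + ch, x:x + cw]
    # nearest-neighbour resize back to (h, w) so views can be mixed pixel-wise
    ys = np.arange(h) * ch // h
    xs = np.arange(w) * cw // w
    return crop[np.ix_(ys, xs)]

def cropmix(img, scales=(0.2, 0.8), seed=0):
    """Mix views cropped at distinct scales into one training input."""
    rng = np.random.default_rng(seed)
    views = [random_crop(img, s, rng) for s in scales]
    # random convex combination of the views (assumed mixing rule)
    weights = rng.dirichlet(np.ones(len(views)))
    return sum(w * v for w, v in zip(weights, views))
```

In practice the class label would be kept (or mixed) alongside the pixels; only the pixel-mixing step is shown here.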
Related papers
- SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for
Multi-label Image Classification [46.8141860303439]
We introduce a simple but effective augmentation strategy for multi-label image classification, namely SpliceMix.
The "splice" in our method is two-fold: 1) each mixed image is a splice of several downsampled images arranged in a grid, so the semantics of the images being mixed are blended without object deficiencies, alleviating co-occurrence bias; 2) we splice the mixed images and the original mini-batch into a new SpliceMixed mini-batch, which allows images at different scales to contribute to training together.
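The grid-splicing step can be illustrated with a small NumPy sketch. The 2x2 grid and the nearest-neighbour downsampling are assumptions, and the second step (concatenating mixed images with the original mini-batch) is omitted:

```python
import numpy as np

def splice_grid(images, grid=(2, 2)):
    """Splice several downsampled images into one grid-shaped mixed image."""
    gh, gw = grid
    h, w = images[0].shape[:2]
    ch, cw = h // gh, w // gw
    out = np.zeros_like(images[0])
    for k, img in enumerate(images[:gh * gw]):
        # nearest-neighbour downsample to one grid cell
        small = img[::gh, ::gw][:ch, :cw]
        r, c = divmod(k, gw)
        out[r * ch:(r + 1) * ch, c * cw:(c + 1) * cw] = small
    return out
```

Because each source image is downsampled whole rather than cropped, every object survives into the mixed image, which is the property the summary above emphasizes.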
arXiv Detail & Related papers (2023-11-26T05:45:27Z)
- Improving GAN Training via Feature Space Shrinkage [69.98365478398593]
We propose AdaptiveMix, which shrinks regions of training data in the image representation space of the discriminator.
Considering it is intractable to directly bound feature space, we propose to construct hard samples and narrow down the feature distance between hard and easy samples.
The evaluation results demonstrate that our AdaptiveMix can facilitate the training of GANs and effectively improve the image quality of generated samples.
arXiv Detail & Related papers (2023-03-02T20:22:24Z)
- Enhanced Performance of Pre-Trained Networks by Matched Augmentation Distributions [10.74023489125222]
We propose a simple solution to address the train-test distributional shift.
We combine results for multiple random crops for a test image.
This not only matches the train time augmentation but also provides the full coverage of the input image.
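Averaging predictions over several random test-time crops can be sketched as follows. The `predict_fn` callable is a hypothetical stand-in for the trained model, and the crop count and size are assumptions:

```python
import numpy as np

def multi_crop_predict(img, predict_fn, crop_size, n_crops=5, seed=0):
    """Average a model's predictions over several random crops of a test image,
    matching the random-crop augmentation used at train time."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    probs = []
    for _ in range(n_crops):
        y = rng.integers(0, h - crop_size + 1)
        x = rng.integers(0, w - crop_size + 1)
        probs.append(predict_fn(img[y:y + crop_size, x:x + crop_size]))
    return np.mean(probs, axis=0)
```

With enough crops, the union of crop windows covers the whole image, which is the "full coverage" property mentioned above.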
arXiv Detail & Related papers (2022-01-19T22:33:00Z)
- Sample selection for efficient image annotation [14.695979686066066]
Supervised object detection has proven successful on many benchmark datasets, achieving human-level performance.
We propose an efficient image selection approach that samples the most informative images from the unlabeled dataset.
Our method can reduce the manual annotation workload by up to 80% compared to a fully manual labeling setting, and performs better than random sampling.
arXiv Detail & Related papers (2021-05-10T21:25:10Z)
- ResizeMix: Mixing Data with Preserved Object Information and True Labels [57.00554495298033]
We study the importance of saliency information for mixing data, and find that saliency information is not strictly necessary for improving augmentation performance.
We propose a more effective but very easily implemented method, namely ResizeMix.
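The ResizeMix idea of mixing by resizing a whole source image and pasting it into a target, so that object information and the true label are preserved, might be sketched as below. The patch-scale parameter `tau` and the nearest-neighbour resize are assumptions, not details from the summary:

```python
import numpy as np

def resizemix(source, target, tau=0.5, seed=0):
    """Resize the whole source image and paste it into a random patch of the
    target; the label mixing weight is the pasted-area fraction."""
    rng = np.random.default_rng(seed)
    h, w = target.shape[:2]
    ph, pw = max(1, int(h * tau)), max(1, int(w * tau))
    # nearest-neighbour resize of the full source image to the patch size,
    # so no part of the source object is cut away
    ys = np.arange(ph) * source.shape[0] // ph
    xs = np.arange(pw) * source.shape[1] // pw
    patch = source[np.ix_(ys, xs)]
    y = rng.integers(0, h - ph + 1)
    x = rng.integers(0, w - pw + 1)
    mixed = target.copy()
    mixed[y:y + ph, x:x + pw] = patch
    lam = (ph * pw) / (h * w)  # weight of the source label in the mixed label
    return mixed, lam
```

Unlike saliency-guided cutting, the resize step guarantees the pasted region always contains the source object, so its label contribution `lam` is never spurious.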
arXiv Detail & Related papers (2020-12-21T03:43:13Z)
- SnapMix: Semantically Proportional Mixing for Augmenting Fine-grained Data [124.95585891086894]
The proposed method is called Semantically Proportional Mixing (SnapMix).
It exploits class activation map (CAM) to lessen the label noise in augmenting fine-grained data.
Our method consistently outperforms existing mixed-based approaches.
arXiv Detail & Related papers (2020-12-09T03:37:30Z)
- Data-driven Meta-set Based Fine-Grained Visual Classification [61.083706396575295]
We propose a data-driven meta-set based approach to deal with noisy web images for fine-grained recognition.
Specifically, guided by a small amount of clean meta-set, we train a selection net in a meta-learning manner to distinguish in- and out-of-distribution noisy images.
arXiv Detail & Related papers (2020-08-06T03:04:16Z)
- Variational Clustering: Leveraging Variational Autoencoders for Image Clustering [8.465172258675763]
Variational Autoencoders (VAEs) naturally lend themselves to learning data distributions in a latent space.
We propose a method based on VAEs where we use a Gaussian Mixture prior to help cluster the images accurately.
Our method simultaneously learns a prior that captures the latent distribution of the images and a posterior to help discriminate well between data points.
arXiv Detail & Related papers (2020-05-10T09:34:48Z)