Thumbnail: A Novel Data Augmentation for Convolutional Neural Network
- URL: http://arxiv.org/abs/2103.05342v1
- Date: Tue, 9 Mar 2021 10:45:55 GMT
- Title: Thumbnail: A Novel Data Augmentation for Convolutional Neural Network
- Authors: Tianshu Xie, Xuan Cheng, Minghui Liu, Jiali Deng, Xiaomin Wang, Ming Liu
- Abstract summary: We generate an image by reducing an image to a certain size, called the thumbnail, and pasting it at a random position in the original image.
The generated image retains most of the original image information but also has the global information in the thumbnail.
We find that the thumbnail idea can be integrated naturally with Mixed Sample Data Augmentation.
- Score: 6.066543113636522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a new data augmentation strategy named Thumbnail,
which aims to strengthen the network's capture of global features. We generate an
image by reducing an image to a certain size, called the thumbnail, and pasting
it at a random position in the original image. The generated image not only
retains most of the original image information but also carries the global
information in the thumbnail. Furthermore, we find that the thumbnail idea can be
integrated naturally with Mixed Sample Data Augmentation, so we paste the
thumbnail into another image whose ground truth label is also mixed in with a
certain weight, which yields strong results on various computer vision tasks.
Extensive experiments show that Thumbnail outperforms state-of-the-art
augmentation strategies across classification, fine-grained image
classification, and object detection. On ImageNet classification, a ResNet50
trained with our method achieves 79.21% accuracy, a 2.89% improvement over the
baseline.
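The two operations described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the thumbnail scale of 0.25, the nearest-neighbour downscaling, and the area-proportional label weight are all assumptions, since the abstract only says the image is reduced "to a certain size" and labels are mixed "with a certain weight".

```python
import numpy as np

def thumbnail_augment(image, scale=0.25, rng=None):
    """Paste a shrunken copy of `image` at a random position in itself.

    `scale` is an illustrative choice, not the paper's exact setting.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    th, tw = max(1, int(h * scale)), max(1, int(w * scale))
    # Nearest-neighbour downscaling via index sampling keeps the sketch
    # dependency-free; a real pipeline would use an interpolating resize.
    rows = np.arange(th) * h // th
    cols = np.arange(tw) * w // tw
    thumb = image[rows][:, cols]
    y = rng.integers(0, h - th + 1)
    x = rng.integers(0, w - tw + 1)
    out = image.copy()
    out[y:y + th, x:x + tw] = thumb
    return out

def thumbnail_mix(image_a, label_a, image_b, label_b, scale=0.25, rng=None):
    """Mixed-sample variant: paste the thumbnail of image_a into image_b.

    The one-hot labels are mixed by the pasted-area fraction -- an assumed
    weighting, since the abstract only states "a certain weight".
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image_b.shape[:2]
    th, tw = max(1, int(h * scale)), max(1, int(w * scale))
    rows = np.arange(th) * image_a.shape[0] // th
    cols = np.arange(tw) * image_a.shape[1] // tw
    thumb = image_a[rows][:, cols]
    y = rng.integers(0, h - th + 1)
    x = rng.integers(0, w - tw + 1)
    out = image_b.copy()
    out[y:y + th, x:x + tw] = thumb
    lam = (th * tw) / (h * w)          # fraction of image_b covered
    label = lam * np.asarray(label_a) + (1 - lam) * np.asarray(label_b)
    return out, label
```

In use, `thumbnail_augment` would replace an ordinary transform in the training loader, while `thumbnail_mix` pairs two samples per batch, much like Mixup or CutMix pair theirs.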
Related papers
- xT: Nested Tokenization for Larger Context in Large Images [79.37673340393475]
xT is a framework for vision transformers which aggregates global context with local details.
We are able to increase accuracy by up to 8.6% on challenging classification tasks.
arXiv Detail & Related papers (2024-03-04T10:29:58Z)
- Spatial-Semantic Collaborative Cropping for User Generated Content [32.490403964193014]
A large amount of User Generated Content (UGC) is uploaded to the Internet daily and displayed to people worldwide.
Previous methods merely consider the aesthetics of the cropped images while ignoring the content integrity, which is crucial for cropping.
We propose a Spatial-Semantic Collaborative cropping network (S2CNet) for arbitrary user generated content accompanied by a new cropping benchmark.
arXiv Detail & Related papers (2024-01-16T03:25:12Z)
- Raw Image Reconstruction with Learned Compact Metadata [61.62454853089346]
We propose a novel framework to learn a compact representation in the latent space serving as the metadata in an end-to-end manner.
We show how the proposed raw image compression scheme can adaptively allocate more bits to image regions that are important from a global perspective.
arXiv Detail & Related papers (2023-02-25T05:29:45Z)
- StyleAugment: Learning Texture De-biased Representations by Style Augmentation without Pre-defined Textures [7.81768535871051]
Recent powerful vision classifiers are biased towards textures, while shape information is largely overlooked by the models.
A simple approach, Stylized ImageNet, augments training images with artistic style transfer and can reduce this texture bias.
However, the Stylized ImageNet approach has two drawbacks, in fidelity and diversity.
We propose StyleAugment, which augments images with styles drawn from the mini-batch.
arXiv Detail & Related papers (2021-08-24T07:17:02Z)
- Efficient Classification of Very Large Images with Tiny Objects [15.822654320750054]
We present an end-to-end CNN model termed Zoom-In network for classification of large images with tiny objects.
We evaluate our method on two large-image datasets and one gigapixel dataset.
arXiv Detail & Related papers (2021-06-04T20:13:04Z)
- Contrastive Learning with Stronger Augmentations [63.42057690741711]
We propose a general framework called Contrastive Learning with Stronger Augmentations (A) to complement current contrastive learning approaches.
Here, the distribution divergence between the weakly and strongly augmented images over the representation bank is used to supervise the retrieval of strongly augmented queries.
Experiments show that information from the strongly augmented images can significantly boost performance.
arXiv Detail & Related papers (2021-04-15T18:40:04Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- RTIC: Residual Learning for Text and Image Composition using Graph Convolutional Network [19.017377597937617]
We study the compositional learning of images and texts for image retrieval.
We introduce a novel method that combines the graph convolutional network (GCN) with existing composition methods.
arXiv Detail & Related papers (2021-04-07T09:41:52Z)
- Gigapixel Histopathological Image Analysis using Attention-based Neural Networks [7.1715252990097325]
We propose a CNN structure consisting of a compressing path and a learning path.
Our method integrates both global and local information, is flexible with regard to the size of the input images and only requires weak image-level labels.
arXiv Detail & Related papers (2021-01-25T10:18:52Z)
- Focus Longer to See Better: Recursively Refined Attention for Fine-Grained Image Classification [148.4492675737644]
Deep neural networks have made great strides in the coarse-grained image classification task.
In this paper, we try to focus on these marginal differences to extract more representative features.
Our network repetitively focuses on parts of images to spot small discriminative parts among the classes.
arXiv Detail & Related papers (2020-05-22T03:14:18Z)
- A U-Net Based Discriminator for Generative Adversarial Networks [86.67102929147592]
We propose an alternative U-Net based discriminator architecture for generative adversarial networks (GANs).
The proposed architecture provides detailed per-pixel feedback to the generator while maintaining the global coherence of synthesized images.
The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics.
arXiv Detail & Related papers (2020-02-28T11:16:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.