On Feature Normalization and Data Augmentation
- URL: http://arxiv.org/abs/2002.11102v3
- Date: Tue, 30 Mar 2021 18:00:00 GMT
- Title: On Feature Normalization and Data Augmentation
- Authors: Boyi Li and Felix Wu and Ser-Nam Lim and Serge Belongie and Kilian Q. Weinberger
- Abstract summary: Moment Exchange is an implicit data augmentation method that encourages recognition models to make use of the moment information as well.
We replace the moments of the learned features of one training image by those of another, and also interpolate the target labels.
As our approach is fast, operates entirely in feature space, and mixes different signals than prior methods, one can effectively combine it with existing augmentation approaches.
- Score: 55.115583969831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The moments (a.k.a., mean and standard deviation) of latent features are
often removed as noise when training image recognition models, to increase
stability and reduce training time. However, in the field of image generation,
the moments play a much more central role. Studies have shown that the moments
extracted from instance normalization and positional normalization can roughly
capture style and shape information of an image. Instead of being discarded,
these moments are instrumental to the generation process. In this paper we
propose Moment Exchange, an implicit data augmentation method that encourages
the model to utilize the moment information also for recognition models.
Specifically, we replace the moments of the learned features of one training
image by those of another, and also interpolate the target labels -- forcing
the model to extract training signal from the moments in addition to the
normalized features. As our approach is fast, operates entirely in feature
space, and mixes different signals than prior methods, one can effectively
combine it with existing augmentation approaches. We demonstrate its efficacy
across several recognition benchmark data sets where it improves the
generalization capability of highly competitive baseline networks with
remarkable consistency.
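The exchange described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes per-channel (instance-normalization-style) moments, represents a feature map as a plain list of channels, and the names `moments`, `moment_exchange`, `interpolate_labels`, and the weight `lam` are hypothetical choices made only for this example.

```python
import math

def moments(channel, eps=1e-5):
    """Return the mean and (eps-stabilized) standard deviation of one channel."""
    mu = sum(channel) / len(channel)
    var = sum((v - mu) ** 2 for v in channel) / len(channel)
    return mu, math.sqrt(var + eps)

def moment_exchange(feat_a, feat_b, eps=1e-5):
    """Normalize each channel of feat_a, then re-inject the moments of feat_b.

    feat_a and feat_b are lists of channels; each channel is a flat list of
    floats (an assumed layout, chosen to keep the sketch dependency-free).
    """
    mixed = []
    for ch_a, ch_b in zip(feat_a, feat_b):
        mu_a, sd_a = moments(ch_a, eps)
        mu_b, sd_b = moments(ch_b, eps)
        mixed.append([sd_b * (v - mu_a) / sd_a + mu_b for v in ch_a])
    return mixed

def interpolate_labels(y_a, y_b, lam):
    """Blend two one-hot targets; lam weights the source of the normalized features."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]

# Toy example: one channel with four spatial positions per image.
feat_a = [[1.0, 2.0, 3.0, 4.0]]
feat_b = [[10.0, 20.0, 30.0, 40.0]]
mixed = moment_exchange(feat_a, feat_b)
target = interpolate_labels([1.0, 0.0], [0.0, 1.0], lam=0.9)
```

After the exchange, `mixed` keeps the normalized pattern of image A but carries the mean and standard deviation of image B, and `target` assigns most of the label weight to A. The value `lam=0.9` is only an illustrative setting.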
Related papers
- Enhancing Consistency-Based Image Generation via Adversarialy-Trained Classification and Energy-Based Discrimination [13.238373528922194]
We propose a novel technique for post-processing Consistency-based generated images, enhancing their perceptual quality.
Our approach utilizes a joint classifier-discriminator model, in which both portions are trained adversarially.
By employing example-specific projected gradient under the guidance of this joint machine, we refine synthesized images and achieve improved FID scores on the ImageNet 64x64 dataset.
arXiv Detail & Related papers (2024-05-25T14:53:52Z)
- EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training [79.96741042766524]
We reformulate the training curriculum as a soft-selection function.
We show that exposing the contents of natural images can be readily achieved by the intensity of data augmentation.
The resulting method, EfficientTrain++, is simple, general, yet surprisingly effective.
arXiv Detail & Related papers (2024-05-14T17:00:43Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- Active Generation for Image Classification [50.18107721267218]
We propose to address the efficiency of image generation by focusing on the specific needs and characteristics of the model.
With a central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
- Data-efficient Event Camera Pre-training via Disentangled Masked Modeling [20.987277885575963]
We present a new data-efficient voxel-based self-supervised learning method for event cameras.
Our method overcomes the limitations of previous methods, which either sacrifice temporal information or directly employ paired image data.
It exhibits excellent generalization performance and demonstrates significant improvements across various tasks with fewer parameters and lower computational costs.
arXiv Detail & Related papers (2024-03-01T10:02:25Z)
- Training on Thin Air: Improve Image Classification with Generated Data [28.96941414724037]
Diffusion Inversion is a simple yet effective method to generate diverse, high-quality training data for image classification.
Our approach captures the original data distribution and ensures data coverage by inverting images to the latent space of Stable Diffusion.
We identify three key components that allow our generated images to successfully supplant the original dataset.
arXiv Detail & Related papers (2023-05-24T16:33:02Z)
- Learning Discriminative Shrinkage Deep Networks for Image Deconvolution [122.79108159874426]
We propose an effective non-blind deconvolution approach by learning discriminative shrinkage functions to implicitly model these terms.
Experimental results show that the proposed method performs favorably against the state-of-the-art ones in terms of efficiency and accuracy.
arXiv Detail & Related papers (2021-11-27T12:12:57Z)
- Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training.
arXiv Detail & Related papers (2020-09-18T17:52:34Z)
- Two-Level Adversarial Visual-Semantic Coupling for Generalized Zero-shot Learning [21.89909688056478]
We propose a new two-level joint idea to augment the generative network with an inference network during training.
This provides strong cross-modal interaction for effective transfer of knowledge between visual and semantic domains.
We evaluate our approach on four benchmark datasets against several state-of-the-art methods, and show its performance.
arXiv Detail & Related papers (2020-07-15T15:34:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.