Does Data Augmentation Benefit from Split BatchNorms
- URL: http://arxiv.org/abs/2010.07810v1
- Date: Thu, 15 Oct 2020 15:00:43 GMT
- Title: Does Data Augmentation Benefit from Split BatchNorms
- Authors: Amil Merchant, Barret Zoph, Ekin Dogus Cubuk
- Abstract summary: State-of-the-art data augmentation strongly distorts training images, leading to a disparity between examples seen during training and inference.
We propose an auxiliary BatchNorm for the potentially out-of-distribution, strongly augmented images.
We find that this method significantly improves performance on common image classification benchmarks such as CIFAR-10, CIFAR-100, and ImageNet.
- Score: 29.134017115737507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation has emerged as a powerful technique for improving the
performance of deep neural networks and led to state-of-the-art results in
computer vision. However, state-of-the-art data augmentation strongly distorts
training images, leading to a disparity between examples seen during training
and inference. In this work, we explore a recently proposed training paradigm
in order to correct for this disparity: using an auxiliary BatchNorm for the
potentially out-of-distribution, strongly augmented images. Our experiments
then focus on how to define the BatchNorm parameters that are used at
evaluation. To eliminate the train-test disparity, we experiment with using the
batch statistics defined by clean training images only, yet surprisingly find
that this does not yield improvements in model performance. Instead, we
investigate using BatchNorm parameters defined by weak augmentations and find
that this method significantly improves performance on common image
classification benchmarks such as CIFAR-10, CIFAR-100, and ImageNet. We then
explore a fundamental trade-off between accuracy and robustness that arises from
using different BatchNorm parameters, providing greater insight into the
benefits of data augmentation on model performance.
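To make the training paradigm concrete, below is a minimal PyTorch sketch of the auxiliary ("split") BatchNorm idea: one BatchNorm accumulates statistics from clean or weakly augmented batches, the other from strongly augmented ones. The module name `SplitBatchNorm2d` and the `strong` routing flag are illustrative assumptions, not the authors' released code.
```python
import torch
import torch.nn as nn

class SplitBatchNorm2d(nn.Module):
    """Minimal sketch of an auxiliary ("split") BatchNorm.

    Clean or weakly augmented batches are normalized by `self.main`,
    while strongly augmented (potentially out-of-distribution) batches
    are routed through `self.aux`, so each branch accumulates running
    statistics that match its own input distribution.
    """

    def __init__(self, num_features: int):
        super().__init__()
        self.main = nn.BatchNorm2d(num_features)  # clean / weak augmentations
        self.aux = nn.BatchNorm2d(num_features)   # strong augmentations

    def forward(self, x: torch.Tensor, strong: bool = False) -> torch.Tensor:
        # Route the batch to the BatchNorm whose statistics match it.
        return self.aux(x) if strong else self.main(x)

# At evaluation time only one set of running statistics is used; the paper's
# finding is that statistics accumulated from *weakly* augmented batches
# (self.main above) outperform statistics from clean images alone.
bn = SplitBatchNorm2d(64)
weak_batch = torch.randn(8, 64, 32, 32)
strong_batch = torch.randn(8, 64, 32, 32)
_ = bn(weak_batch)                  # updates main statistics
_ = bn(strong_batch, strong=True)   # updates auxiliary statistics
bn.eval()
out = bn(weak_batch)                # inference uses main running statistics
```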
Related papers
- Transformer-based Clipped Contrastive Quantization Learning for
Unsupervised Image Retrieval [15.982022297570108]
Unsupervised image retrieval aims to learn important visual characteristics without any given labels, in order to retrieve images similar to a given query image.
In this paper, we propose a TransClippedCLR model that encodes the global context of an image using a Transformer and captures local context through patch-based processing.
Results using the proposed clipped contrastive learning are greatly improved on all datasets compared to the same backbone network with vanilla contrastive learning.
arXiv Detail & Related papers (2024-01-27T09:39:11Z) - Image edge enhancement for effective image classification [7.470763273994321]
We propose an edge enhancement method to improve both the accuracy and training speed of neural networks.
Our approach extracts high-frequency features, such as edges, from the images in the available dataset and fuses them with the original images (see the high-pass fusion sketch after this list).
arXiv Detail & Related papers (2024-01-13T10:01:34Z) - DGNet: Dynamic Gradient-Guided Network for Water-Related Optics Image
Enhancement [77.0360085530701]
Underwater image enhancement (UIE) is a challenging task due to the complex degradation caused by underwater environments.
Previous methods often idealize the degradation process, and neglect the impact of medium noise and object motion on the distribution of image features.
Our approach utilizes predicted images to dynamically update pseudo-labels, adding a dynamic gradient to optimize the network's gradient space.
arXiv Detail & Related papers (2023-12-12T06:07:21Z) - Improving Adversarial Robustness of Masked Autoencoders via Test-time
Frequency-domain Prompting [133.55037976429088]
We investigate the adversarial robustness of vision transformers equipped with BERT pretraining (e.g., BEiT, MAE).
A surprising observation is that MAE has significantly worse adversarial robustness than other BERT pretraining methods.
We propose a simple yet effective way to boost the adversarial robustness of MAE.
arXiv Detail & Related papers (2023-08-20T16:27:17Z) - Learning to Mask and Permute Visual Tokens for Vision Transformer
Pre-Training [59.923672191632065]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT).
MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies.
Our results demonstrate that MaPeT achieves competitive performance on ImageNet.
arXiv Detail & Related papers (2023-06-12T18:12:19Z) - Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuned model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z) - Dynamic Test-Time Augmentation via Differentiable Functions [3.686808512438363]
DynTTA is an image enhancement method that generates recognition-friendly images without retraining the recognition model.
DynTTA is based on differentiable data augmentation techniques and generates a blended image from many augmented images to improve the recognition accuracy under distribution shifts.
arXiv Detail & Related papers (2022-12-09T06:06:47Z) - MetaAugment: Sample-Aware Data Augmentation Policy Learning [20.988767360529362]
We learn a sample-aware data augmentation policy efficiently by formulating it as a sample reweighting problem.
An augmentation policy network takes a transformation and the corresponding augmented image as inputs, and outputs a weight to adjust the augmented image loss computed by a task network.
At the training stage, the task network minimizes the weighted losses of augmented training images, while the policy network minimizes the loss of the task network on a validation set via meta-learning (see the reweighting sketch after this list).
arXiv Detail & Related papers (2020-12-22T15:19:27Z) - Differentiable Augmentation for Data-Efficient GAN Training [48.920992130257595]
We propose DiffAugment, a simple method that improves the data efficiency of GANs by imposing various types of differentiable augmentations on both real and fake samples (see the augmentation sketch after this list).
Our method can generate high-fidelity images using only 100 images without pre-training, while being on par with existing transfer learning algorithms.
arXiv Detail & Related papers (2020-06-18T17:59:01Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of extrapolation-based training schemes can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.