Selective Output Smoothing Regularization: Regularize Neural Networks by
Softening Output Distributions
- URL: http://arxiv.org/abs/2103.15383v1
- Date: Mon, 29 Mar 2021 07:21:06 GMT
- Title: Selective Output Smoothing Regularization: Regularize Neural Networks by
Softening Output Distributions
- Authors: Xuan Cheng, Tianshu Xie, Xiaomin Wang, Qifeng Weng, Minghui Liu, Jiali
Deng, Ming Liu
- Abstract summary: We propose Selective Output Smoothing Regularization, a novel regularization method for training Convolutional Neural Networks (CNNs).
Inspired by the diverse effects that different samples have on training, Selective Output Smoothing Regularization improves performance by encouraging the model to produce equal logits on incorrect classes.
This plug-and-play regularization method can be conveniently incorporated into almost any CNN-based project without extra hassle.
- Score: 5.725228891050467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose Selective Output Smoothing Regularization, a novel
regularization method for training Convolutional Neural Networks (CNNs).
Inspired by the diverse effects that different samples have on training, Selective
Output Smoothing Regularization improves performance by encouraging the
model to produce equal logits on incorrect classes when dealing with samples
that the model classifies correctly and over-confidently. This plug-and-play
regularization method can be conveniently incorporated into almost any
CNN-based project without extra hassle. Extensive experiments have shown that
Selective Output Smoothing Regularization consistently achieves significant
improvements on image classification benchmarks such as CIFAR-100, Tiny
ImageNet, ImageNet, and CUB-200-2011. In particular, our method obtains
77.30$\%$ accuracy on ImageNet with ResNet-50, a 1.1$\%$ gain over the
baseline (76.2$\%$). We also empirically demonstrate that our method yields
further improvements when combined with other widely used regularization
techniques. On Pascal detection, using the SOSR-trained ImageNet classifier
as the pretrained model leads to better detection performance. Moreover, we
demonstrate the effectiveness of our method on small-sample-size and
imbalanced-dataset problems.
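A minimal PyTorch sketch of how such a selective smoothing term could be implemented is given below: samples that the model already classifies correctly and with high confidence receive an extra penalty that pushes their non-target logits toward a common value. The confidence threshold, the variance form of the penalty, and the weight lam are illustrative assumptions, not necessarily the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def sosr_loss(logits, targets, conf_threshold=0.9, lam=0.1):
        """Cross-entropy plus a selective output smoothing penalty (sketch)."""
        ce = F.cross_entropy(logits, targets)

        probs = F.softmax(logits, dim=1)
        conf, preds = probs.max(dim=1)
        # Select only samples that are classified correctly and over-confidently.
        selected = (preds == targets) & (conf > conf_threshold)
        if not selected.any():
            return ce

        sel_logits = logits[selected]
        sel_targets = targets[selected]
        # Keep the logits of the incorrect classes by masking out the target class.
        target_onehot = F.one_hot(sel_targets, num_classes=sel_logits.size(1)).bool()
        wrong_logits = sel_logits[~target_onehot].view(sel_logits.size(0), -1)
        # Penalize deviation from the per-sample mean of the wrong-class logits,
        # which encourages equal logits on the incorrect classes.
        smooth_penalty = wrong_logits.var(dim=1, unbiased=False).mean()

        return ce + lam * smooth_penalty

In a training loop this would stand in for the plain cross-entropy call, e.g. loss = sosr_loss(model(images), labels).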
Related papers
- SGM-PINN: Sampling Graphical Models for Faster Training of Physics-Informed Neural Networks [4.262342157729123]
SGM-PINN is a graph-based importance sampling framework to improve the training efficacy of Physics-Informed Neural Networks (PINNs)
Experiments demonstrate the advantages of the proposed framework, achieving $3\times$ faster convergence compared to prior state-of-the-art sampling methods.
arXiv Detail & Related papers (2024-07-10T04:31:50Z) - Changing the Training Data Distribution to Reduce Simplicity Bias Improves In-distribution Generalization [12.472871440252105]
We show that sharpness-aware minimization (SAM) learns different features more uniformly, particularly in early epochs.
We propose a method that (i) clusters examples based on the network output early in training, (ii) identifies a cluster of examples with similar network output, and (iii) upsamples the remaining examples only once to alleviate the simplicity bias.
arXiv Detail & Related papers (2024-04-27T03:30:50Z) - Improving Generalization via Meta-Learning on Hard Samples [8.96835934244022]
We show that using hard-to-classify instances in the validation set has both a theoretical connection to, and strong empirical evidence of, improved generalization.
We provide an efficient algorithm for training this meta-optimized model, as well as a simple train-twice procedure for careful comparative study.
arXiv Detail & Related papers (2024-03-18T20:33:44Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed test sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Neural Priming for Sample-Efficient Adaptation [92.14357804106787]
We propose Neural Priming, a technique for adapting large pretrained models to distribution shifts and downstream tasks.
Neural Priming can be performed at test time, even for pretraining datasets as large as LAION-2B.
arXiv Detail & Related papers (2023-06-16T21:53:16Z) - SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling [1.7767466724342065]
We show that using adaptive sampling in the building blocks of a deep neural network can improve its efficiency.
In particular, we propose SSBNet which is built by inserting sampling layers repeatedly into existing networks like ResNet.
Experiment results show that the proposed SSBNet can achieve competitive image classification and object detection performance on ImageNet and other datasets.
arXiv Detail & Related papers (2022-07-23T13:01:55Z) - MEMO: Test Time Robustness via Adaptation and Augmentation [131.28104376280197]
We study the problem of test time robustification, i.e., using the test input to improve model robustness.
Recent prior works have proposed methods for test-time adaptation; however, each introduces additional assumptions.
We propose a simple approach that can be used in any test setting where the model is probabilistic and adaptable.
arXiv Detail & Related papers (2021-10-18T17:55:11Z) - Distribution Mismatch Correction for Improved Robustness in Deep Neural
Networks [86.42889611784855]
Normalization methods increase vulnerability to noise and input corruptions.
We propose an unsupervised non-parametric distribution correction method that adapts the activation distribution of each layer.
In our experiments, we empirically show that the proposed method effectively reduces the impact of intense image corruptions.
arXiv Detail & Related papers (2021-10-05T11:36:25Z) - To be Critical: Self-Calibrated Weakly Supervised Learning for Salient
Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can enable models to achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z) - Learning to Learn Parameterized Classification Networks for Scalable
Input Images [76.44375136492827]
Convolutional Neural Networks (CNNs) do not have a predictable recognition behavior with respect to changes in input resolution.
We employ meta learners to generate convolutional weights of main networks for various input scales.
We further apply on-the-fly knowledge distillation across model predictions at different input resolutions.
arXiv Detail & Related papers (2020-07-13T04:27:25Z) - Deep Residual Flow for Out of Distribution Detection [27.218308616245164]
We present a novel approach that improves upon the state-of-the-art by leveraging an expressive density model based on normalizing flows.
We demonstrate the effectiveness of our method in ResNet and DenseNet architectures trained on various image datasets.
arXiv Detail & Related papers (2020-01-15T16:38:47Z)