Neural Priming for Sample-Efficient Adaptation
- URL: http://arxiv.org/abs/2306.10191v3
- Date: Tue, 5 Dec 2023 00:24:56 GMT
- Title: Neural Priming for Sample-Efficient Adaptation
- Authors: Matthew Wallingford, Vivek Ramanujan, Alex Fang, Aditya Kusupati,
Roozbeh Mottaghi, Aniruddha Kembhavi, Ludwig Schmidt, Ali Farhadi
- Abstract summary: We propose Neural Priming, a technique for adapting large pretrained models to distribution shifts and downstream tasks.
Neural Priming can be performed at test time, even for pretraining datasets as large as LAION-2B.
- Score: 92.14357804106787
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Neural Priming, a technique for adapting large pretrained models
to distribution shifts and downstream tasks given few or no labeled examples.
Presented with class names or unlabeled test samples, Neural Priming enables
the model to recall and condition its parameters on relevant data seen
throughout pretraining, thereby priming it for the test distribution. Neural
Priming can be performed at test time, even for pretraining datasets as large
as LAION-2B. Performing lightweight updates on the recalled data significantly
improves accuracy across a variety of distribution shift and transfer learning
benchmarks. Concretely, in the zero-shot setting, we see a 2.45% improvement in
accuracy on ImageNet and 3.81% accuracy improvement on average across standard
transfer learning benchmarks. Further, using Neural Priming at inference to
adapt to distribution shift, we see a 1.41% accuracy improvement on ImageNetV2.
These results demonstrate the effectiveness of Neural Priming in addressing the
challenge of limited labeled data and changing distributions. Code is available
at github.com/RAIVNLab/neural-priming.
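To make the recall-then-update loop concrete, the following minimal sketch (not the authors' implementation) assumes precomputed embeddings for a pretraining pool and for the class names, retrieves the nearest pretraining examples per class, and performs a lightweight update by mixing them into the zero-shot prototypes; random vectors stand in for real CLIP-style features.

```python
# Minimal sketch of the Neural Priming idea: recall pretraining examples that
# are close to the target classes in embedding space, then do a lightweight
# update (here: averaging recalled embeddings into class prototypes). Random
# vectors stand in for precomputed image/text embeddings; this is an
# illustration, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
d, n_pool, n_classes, k = 128, 10_000, 10, 50

pool = rng.normal(size=(n_pool, d))               # cached pretraining image embeddings
pool /= np.linalg.norm(pool, axis=1, keepdims=True)
class_text = rng.normal(size=(n_classes, d))      # text embeddings of the class names
class_text /= np.linalg.norm(class_text, axis=1, keepdims=True)

# 1) Recall: for each class, retrieve the k most similar pretraining examples.
sims = class_text @ pool.T                        # (n_classes, n_pool) cosine similarities
recalled = np.argsort(-sims, axis=1)[:, :k]       # indices of the recalled subset

# 2) Prime: lightweight update on the recalled data. Here we mix the zero-shot
#    text prototype with the mean of the recalled image embeddings.
alpha = 0.5
primed = np.stack([
    alpha * class_text[c] + (1 - alpha) * pool[recalled[c]].mean(axis=0)
    for c in range(n_classes)
])
primed /= np.linalg.norm(primed, axis=1, keepdims=True)

# 3) Classify test embeddings against the primed prototypes.
test = rng.normal(size=(5, d))
test /= np.linalg.norm(test, axis=1, keepdims=True)
pred = np.argmax(test @ primed.T, axis=1)
print(pred)
```

In the actual method the recalled subset is used to update the model itself; averaging into prototypes is only the simplest stand-in for that lightweight update.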
Related papers
- KAKURENBO: Adaptively Hiding Samples in Deep Neural Network Training [2.8804804517897935]
We propose a method for hiding the least-important samples during the training of deep neural networks.
We adaptively find samples to exclude in a given epoch based on their contribution to the overall learning process.
Our method can reduce total training time by up to 22% while impacting accuracy by only 0.4% compared to the baseline.
arXiv Detail & Related papers (2023-10-16T06:19:29Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranks 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z) - Learning Sample Difficulty from Pre-trained Models for Reliable
Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z) - Bayesian Neural Network Versus Ex-Post Calibration For Prediction
Uncertainty [0.2343856409260935]
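One way to read the sample-difficulty entry above, sketched under explicit assumptions: a pre-trained model's confidence on each sample serves as a difficulty score, and that score weights an entropy regularizer added to the downstream cross-entropy loss.

```python
# Sketch: difficulty-aware entropy regularization. A stand-in "pre-trained"
# model's confidence defines per-sample difficulty; harder samples get a larger
# entropy penalty on the downstream model's predictions. Illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, c = 8, 5
labels = rng.integers(0, c, size=n)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

pretrained_probs = softmax(rng.normal(size=(n, c)))   # stand-in pre-trained model
downstream_probs = softmax(rng.normal(size=(n, c)))   # stand-in downstream model

# Difficulty: 1 - confidence of the pre-trained model on the true label.
difficulty = 1.0 - pretrained_probs[np.arange(n), labels]

ce = -np.log(downstream_probs[np.arange(n), labels] + 1e-12)
entropy = -(downstream_probs * np.log(downstream_probs + 1e-12)).sum(axis=1)

lam = 0.1
loss = (ce + lam * difficulty * entropy).mean()
print("regularized loss:", loss)
```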
- Bayesian Neural Network Versus Ex-Post Calibration For Prediction Uncertainty [0.2343856409260935]
Probabilistic predictions from neural networks account for predictive uncertainty during classification.
In practice, most models are trained as non-probabilistic neural networks, which by default do not capture this inherent uncertainty.
A plausible alternative to the calibration approach is to use Bayesian neural networks, which directly model a predictive distribution.
arXiv Detail & Related papers (2022-09-29T07:22:19Z) - Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration [62.4971588282174]
- Neural Clamping: Joint Input Perturbation and Temperature Scaling for Neural Network Calibration [62.4971588282174]
We propose a new post-processing calibration method called Neural Clamping.
Our empirical results show that Neural Clamping significantly outperforms state-of-the-art post-processing calibration methods.
arXiv Detail & Related papers (2022-09-23T14:18:39Z) - Confidence-Guided Data Augmentation for Deep Semi-Supervised Training [0.9968241071319184]
- Confidence-Guided Data Augmentation for Deep Semi-Supervised Training [0.9968241071319184]
We propose a new data augmentation technique for semi-supervised learning settings that emphasizes learning from the most challenging regions of the feature space.
We perform experiments on two benchmark RGB datasets, CIFAR-100 and STL-10, and show that the proposed scheme improves classification performance in terms of accuracy and robustness.
arXiv Detail & Related papers (2022-09-16T21:23:19Z) - Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
- Sample-dependent Adaptive Temperature Scaling for Improved Calibration [95.7477042886242]
A post-hoc approach to compensating for neural networks' miscalibrated confidence is temperature scaling.
We propose to predict a different temperature value for each input, allowing us to adjust the mismatch between confidence and accuracy.
We test our method on the ResNet50 and WideResNet28-10 architectures using the CIFAR10/100 and Tiny-ImageNet datasets.
arXiv Detail & Related papers (2022-07-13T14:13:49Z) - Insta-RS: Instance-wise Randomized Smoothing for Improved Robustness and
Accuracy [9.50143683501477]
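A hedged sketch of sample-dependent temperature scaling: each input receives its own temperature rather than a single global T. The per-sample temperature below is a hand-set heuristic of the logit margin, standing in for the learned temperature predictor described in the paper.

```python
# Sketch of sample-dependent temperature scaling: each input gets its own
# temperature instead of a single global T. The per-sample temperature is a
# simple heuristic of the top-two logit gap (an assumption for illustration).
import numpy as np

rng = np.random.default_rng(0)
n, c = 6, 10
logits = rng.normal(scale=3.0, size=(n, c))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

top2 = np.sort(logits, axis=1)[:, -2:]
margin = top2[:, 1] - top2[:, 0]                  # confidence proxy
T = (1.0 + 0.5 * margin)[:, None]                 # heuristic per-sample temperature

p_global = softmax(logits / 1.5)                  # global-temperature baseline
p_adaptive = softmax(logits / T)                  # sample-dependent scaling

print("max prob (global T):  ", np.round(p_global.max(axis=1), 3))
print("max prob (adaptive T):", np.round(p_adaptive.max(axis=1), 3))
```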
- Insta-RS: Instance-wise Randomized Smoothing for Improved Robustness and Accuracy [9.50143683501477]
Insta-RS is a multiple-start search algorithm that assigns customized Gaussian variances to test examples.
Insta-RS Train is a novel two-stage training algorithm that adaptively adjusts and customizes the noise level of each training example.
We show that our method significantly enhances the average certified radius (ACR) as well as the clean data accuracy.
arXiv Detail & Related papers (2021-03-07T19:46:07Z) - Exploring the Uncertainty Properties of Neural Networks' Implicit Priors
in the Infinite-Width Limit [47.324627920761685]
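Randomized smoothing certifies an L2 radius around each input, and Insta-RS assigns a customized noise level per test example. The sketch below is a simplified illustration: a toy linear base classifier, a plain Monte Carlo point estimate of the top-class probability, and a small grid of candidate sigmas instead of the paper's multiple-start search and proper confidence bounds.

```python
# Sketch of instance-wise randomized smoothing: for each test point, try a few
# candidate Gaussian noise levels (sigma) and keep the one giving the largest
# certified L2 radius R = sigma * Phi^{-1}(pA), where pA estimates the smoothed
# classifier's top-class probability. Toy classifier, illustration only.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d, c, n_mc = 20, 4, 500
W = rng.normal(size=(c, d))                       # stand-in base classifier
x = rng.normal(size=d)

def certify(x, sigma):
    noise = rng.normal(scale=sigma, size=(n_mc, d))
    preds = np.argmax((x + noise) @ W.T, axis=1)
    counts = np.bincount(preds, minlength=c)
    top = counts.argmax()
    p_a = min(counts[top] / n_mc, 1 - 1e-3)       # crude point estimate, clipped
    radius = sigma * norm.ppf(p_a) if p_a > 0.5 else 0.0
    return top, radius

candidates = [0.12, 0.25, 0.5, 1.0]
results = [(certify(x, s), s) for s in candidates]
(top, radius), sigma = max(results, key=lambda r: r[0][1])
print(f"best sigma={sigma}, predicted class={top}, certified radius={radius:.3f}")
```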
- Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely-wide NNs as a Gaussian process.
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
arXiv Detail & Related papers (2020-10-14T18:41:54Z) - Evaluating Prediction-Time Batch Normalization for Robustness under
Covariate Shift [81.74795324629712]
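The infinite-width correspondence referenced above gives the prior over functions of a wide fully connected ReLU network in closed form via the standard arc-cosine kernel recursion. The sketch below computes that NNGP covariance on toy inputs; it follows the well-known formulas rather than the paper's code.

```python
# Sketch: the NNGP kernel of an infinitely wide, fully connected ReLU network,
# computed with the standard arc-cosine recursion. This is the Gaussian-process
# prior over functions that the entry above refers to.
import numpy as np

def relu_nngp_kernel(X, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    """Return the depth-layer NNGP covariance matrix for inputs X of shape (n, d)."""
    d = X.shape[1]
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d            # layer-0 covariance
    for _ in range(depth):
        diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
        outer = np.outer(diag, diag)
        cos_theta = np.clip(K / outer, -1.0, 1.0)
        theta = np.arccos(cos_theta)
        # E[relu(u) relu(v)] for jointly Gaussian (u, v) with covariance K:
        expect = (outer / (2 * np.pi)) * (np.sin(theta) + (np.pi - theta) * cos_theta)
        K = sigma_b2 + sigma_w2 * expect
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
print(np.round(relu_nngp_kernel(X), 3))
```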
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We study a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.