If your data distribution shifts, use self-learning
- URL: http://arxiv.org/abs/2104.12928v4
- Date: Thu, 7 Dec 2023 17:58:04 GMT
- Title: If your data distribution shifts, use self-learning
- Authors: Evgenia Rusak, Steffen Schneider, George Pachitariu, Luisa Eck, Peter
Gehler, Oliver Bringmann, Wieland Brendel, Matthias Bethge
- Abstract summary: Self-learning techniques like entropy minimization and pseudo-labeling are simple and effective at improving the performance of a deployed computer vision model under systematic domain shifts.
We conduct a wide range of large-scale experiments and show consistent improvements irrespective of the model architecture.
- Score: 24.23584770840611
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We demonstrate that self-learning techniques like entropy minimization and
pseudo-labeling are simple and effective at improving performance of a deployed
computer vision model under systematic domain shifts. We conduct a wide range
of large-scale experiments and show consistent improvements irrespective of the
model architecture, the pre-training technique or the type of distribution
shift. At the same time, self-learning is simple to use in practice because it
does not require knowledge or access to the original training data or scheme,
is robust to hyperparameter choices, is straightforward to implement and
requires only a few adaptation epochs. This makes self-learning techniques
highly attractive for any practitioner who applies machine learning algorithms
in the real world. We present state-of-the-art adaptation results on CIFAR10-C
(8.5% error), ImageNet-C (22.0% mCE), ImageNet-R (17.4% error) and ImageNet-A
(14.8% error), theoretically study the dynamics of self-supervised adaptation
methods and propose a new classification dataset (ImageNet-D) which is
challenging even with adaptation.
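To make the adaptation recipe concrete, here is a minimal sketch of self-learning on unlabeled target-domain data with entropy minimization or hard pseudo-labeling, assuming a pretrained PyTorch classifier and a data loader over target-domain batches; the function name, loader interface, and hyperparameters (learning rate, epochs, confidence threshold) are illustrative assumptions, not the paper's reference implementation.

```python
# Minimal self-learning adaptation sketch (illustrative, not the authors' code).
# Assumes `model` is a pretrained classifier and `target_loader` yields
# (images, _) batches from the shifted target domain; labels are ignored.
import torch
import torch.nn.functional as F

def adapt_by_self_learning(model, target_loader, epochs=2, lr=1e-4,
                           use_pseudo_labels=False, conf_threshold=0.9):
    model.train()  # normalization layers use target-batch statistics
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)

    for _ in range(epochs):                      # "only a few adaptation epochs"
        for images, _ in target_loader:
            logits = model(images)
            probs = F.softmax(logits, dim=1)

            if use_pseudo_labels:
                # Hard pseudo-labeling: train on the model's confident predictions.
                conf, pseudo = probs.max(dim=1)
                keep = conf > conf_threshold
                if not keep.any():
                    continue
                loss = F.cross_entropy(logits[keep], pseudo[keep])
            else:
                # Entropy minimization: sharpen the predictive distribution.
                loss = -(probs * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

Variants in the literature update only the affine parameters of the normalization layers or robustify the pseudo-label loss; the loop above is only meant to illustrate the basic idea under the stated assumptions.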
Related papers
- Self-Supervised Learning in Deep Networks: A Pathway to Robust Few-Shot Classification [0.0]
We first pre-train the model with self-supervision so that it learns general feature representations from a large amount of unlabeled data.
We then fine-tune it on the few-shot dataset Mini-ImageNet to improve its accuracy and generalization ability under limited data.
arXiv Detail & Related papers (2024-11-19T01:01:56Z)
- Efficiency for Free: Ideal Data Are Transportable Representations [12.358393766570732]
We investigate the efficiency properties of data from both optimization and generalization perspectives.
We propose the Representation Learning Accelerator, which promotes the formation and utilization of efficient data.
arXiv Detail & Related papers (2024-05-23T15:06:02Z)
- A Simple and Efficient Baseline for Data Attribution on Images [107.12337511216228]
Current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions.
In this work, we focus on a minimalist baseline, utilizing the feature space of a backbone pretrained via self-supervised learning to perform data attribution.
Our method is model-agnostic and scales easily to large datasets.
arXiv Detail & Related papers (2023-11-03T17:29:46Z)
- No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets [0.0]
We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets.
In particular, we select (semi-)optimal learning rate and weight decay pairs via the norm of the model parameters.
We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
arXiv Detail & Related papers (2023-09-04T16:13:59Z)
- EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- CoV-TI-Net: Transferred Initialization with Modified End Layer for COVID-19 Diagnosis [5.546855806629448]
Transfer learning is a relatively new learning method that has been employed in many sectors to achieve good performance with fewer computations.
In this research, the PyTorch pre-trained models (VGG19_bn and WideResNet-101) are applied to the MNIST dataset.
The proposed model is developed and verified in a Kaggle notebook, reaching an accuracy of 99.77% without requiring extensive computation time.
arXiv Detail & Related papers (2022-09-20T08:52:52Z)
- Imposing Consistency for Optical Flow Estimation [73.53204596544472]
Imposing consistency through proxy tasks has been shown to enhance data-driven learning.
This paper introduces novel and effective consistency strategies for optical flow estimation.
arXiv Detail & Related papers (2022-04-14T22:58:30Z)
- To be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can help models achieve better performance and generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z)
- Selective Output Smoothing Regularization: Regularize Neural Networks by Softening Output Distributions [5.725228891050467]
We propose Selective Output Smoothing Regularization, a novel regularization method for training Convolutional Neural Networks (CNNs).
Inspired by the diverse effects that different samples have on training, Selective Output Smoothing Regularization improves performance by encouraging the model to produce equal logits on the incorrect classes (see the hedged loss sketch after this list).
This plug-and-play regularization method can be conveniently incorporated into almost any CNN-based project without extra hassle.
arXiv Detail & Related papers (2021-03-29T07:21:06Z)
- Learning to Learn Parameterized Classification Networks for Scalable Input Images [76.44375136492827]
Convolutional Neural Networks (CNNs) do not exhibit predictable recognition behavior with respect to changes in input resolution.
We employ meta-learners to generate the convolutional weights of the main networks for various input scales.
We further apply knowledge distillation on the fly over model predictions made at different input resolutions.
arXiv Detail & Related papers (2020-07-13T04:27:25Z)
- Adaptive Risk Minimization: Learning to Adapt to Domain Shift [109.87561509436016]
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
In this work, we consider the problem setting of domain generalization, where the training data are structured into domains and there may be multiple test time shifts.
We introduce the framework of adaptive risk minimization (ARM), in which models are directly optimized for effective adaptation to shift by learning to adapt on the training domains.
arXiv Detail & Related papers (2020-07-06T17:59:30Z)
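To make the "equal logits on incorrect classes" idea from the Selective Output Smoothing entry above concrete, here is a hedged PyTorch sketch; the function name, loss weight, confidence threshold, and the selection rule (regularize only confidently correct samples) are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative regularizer in the spirit of Selective Output Smoothing:
# push the logits of the incorrect classes toward equal values for a
# selected subset of samples. The selection rule below is an assumption.
import torch
import torch.nn.functional as F

def selective_output_smoothing_loss(logits, targets, weight=0.1, conf_threshold=0.9):
    ce = F.cross_entropy(logits, targets)

    probs = F.softmax(logits, dim=1)
    conf, preds = probs.max(dim=1)
    selected = (preds == targets) & (conf > conf_threshold)  # assumed criterion
    if not selected.any():
        return ce

    num_classes = logits.size(1)
    sel_logits = logits[selected]
    true_class = F.one_hot(targets[selected], num_classes).bool()
    # Keep only the logits of the incorrect classes for the selected samples.
    wrong_logits = sel_logits[~true_class].view(-1, num_classes - 1)

    # KL divergence to a uniform distribution over the incorrect classes,
    # i.e. encourage roughly equal logits on those classes.
    uniform = torch.full_like(wrong_logits, 1.0 / (num_classes - 1))
    smooth = F.kl_div(F.log_softmax(wrong_logits, dim=1), uniform,
                      reduction="batchmean")
    return ce + weight * smooth
```

In a training loop this term would replace the plain cross-entropy loss; `weight` and `conf_threshold` are placeholder hyperparameters.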
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.