Distributionally Robust Classification on a Data Budget
- URL: http://arxiv.org/abs/2308.03821v1
- Date: Mon, 7 Aug 2023 15:30:02 GMT
- Title: Distributionally Robust Classification on a Data Budget
- Authors: Benjamin Feuer, Ameya Joshi, Minh Pham, Chinmay Hegde
- Abstract summary: We show that standard ResNet-50 trained with the cross-entropy loss on 2.4 million image samples can attain comparable robustness to a CLIP ResNet-50 trained on 400 million samples.
This is the first result showing (near) state-of-the-art distributional robustness on limited data budgets.
- Score: 26.69877485937123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real world uses of deep learning require predictable model behavior under
distribution shifts. Models such as CLIP show emergent natural distributional
robustness comparable to humans, but may require hundreds of millions of
training samples. Can we train robust learners in a domain where data is
limited? To rigorously address this question, we introduce JANuS (Joint
Annotations and Names Set), a collection of four new training datasets with
images, labels, and corresponding captions, and perform a series of carefully
controlled investigations of factors contributing to robustness in image
classification, then compare those results to findings derived from a
large-scale meta-analysis. Using this approach, we show that standard ResNet-50
trained with the cross-entropy loss on 2.4 million image samples can attain
comparable robustness to a CLIP ResNet-50 trained on 400 million samples. To
our knowledge, this is the first result showing (near) state-of-the-art
distributional robustness on limited data budgets. Our dataset is available at
\url{https://huggingface.co/datasets/penfever/JANuS_dataset}, and the code used
to reproduce our experiments can be found at
\url{https://github.com/penfever/vlhub/}.
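To make the budget-constrained baseline concrete, below is a minimal sketch of the kind of recipe the abstract describes: a standard torchvision ResNet-50 trained with cross-entropy, then scored on an in-distribution test set and a distribution-shifted test set. This is illustrative only; it is not the authors' vlhub pipeline, and the dataset objects are hypothetical placeholders (e.g. built from the JANuS images/labels).

```python
# Minimal sketch: standard ResNet-50 + cross-entropy on a labeled image dataset,
# evaluated on an in-distribution split and a shifted split. Illustrative only;
# not the authors' vlhub training code. Dataset objects are placeholders.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision.models import resnet50

device = "cuda" if torch.cuda.is_available() else "cpu"

def train_one_epoch(model, loader, optimizer):
    model.train()
    criterion = nn.CrossEntropyLoss()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

@torch.no_grad()
def accuracy(model, loader):
    model.eval()
    correct = total = 0
    for images, labels in loader:
        preds = model(images.to(device)).argmax(dim=1).cpu()
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

model = resnet50(num_classes=1000).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
# train_set, id_test_set, shifted_test_set are hypothetical dataset objects.
# for epoch in range(90):
#     train_one_epoch(model, DataLoader(train_set, batch_size=256, shuffle=True), optimizer)
# Distributional robustness is then read off by comparing the two accuracies:
# accuracy(model, DataLoader(id_test_set, batch_size=256))
# accuracy(model, DataLoader(shifted_test_set, batch_size=256))
```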
Related papers
- Dataset Quantization [72.61936019738076]
We present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets.
DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio.
arXiv Detail & Related papers (2023-08-21T07:24:29Z)
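The summary above does not detail DQ's actual bin-construction procedure, so the following is only a generic illustration of the "compress a dataset into a small representative subset" idea: cluster feature embeddings with k-means and keep the example closest to each centroid. This is not the DQ algorithm.

```python
# Generic illustration of dataset compression via representative-subset selection.
# NOTE: this is NOT the DQ method from the paper above; it is a simple k-means
# coreset used only to make the "small subset" idea concrete.
import numpy as np
from sklearn.cluster import KMeans

def select_subset(embeddings: np.ndarray, budget: int) -> np.ndarray:
    """Return indices of `budget` samples, one nearest to each k-means centroid."""
    km = KMeans(n_clusters=budget, n_init=10).fit(embeddings)
    chosen = []
    for c in range(budget):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        chosen.append(members[np.argmin(dists)])
    return np.array(chosen)

# Example with random stand-in embeddings (real usage would embed the images first).
subset_idx = select_subset(np.random.randn(10_000, 512).astype(np.float32), budget=100)
```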
- DatasetEquity: Are All Samples Created Equal? In The Quest For Equity Within Datasets [4.833815605196965]
This paper presents a novel method for addressing data imbalance in machine learning.
It computes sample likelihoods based on image appearance using deep perceptual embeddings and clustering.
It then uses these likelihoods to weight samples differently during training with a proposed Generalized Focal Loss function.
arXiv Detail & Related papers (2023-08-19T02:11:49Z)
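A minimal sketch of the re-weighting idea summarized above: samples from densely populated appearance clusters get lower weight, rare-looking samples get higher weight, and the per-sample weight scales a focal-style cross-entropy term. The cluster-frequency proxy and the loss form here are assumptions for illustration; the paper's perceptual-embedding likelihoods and its Generalized Focal Loss may differ.

```python
# Sketch of likelihood-based sample re-weighting with a focal-style loss.
# The cluster-frequency weights and the loss form are illustrative assumptions,
# not the paper's exact formulation.
import torch
import torch.nn.functional as F

def cluster_frequency_weights(cluster_ids: torch.Tensor) -> torch.Tensor:
    """Weight each sample inversely to how populated its appearance cluster is."""
    counts = torch.bincount(cluster_ids)
    weights = 1.0 / counts[cluster_ids].float()
    return weights / weights.mean()  # normalize so the average weight is 1

def weighted_focal_loss(logits, targets, sample_weights, gamma=2.0):
    log_p = F.log_softmax(logits, dim=1)
    log_p_t = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-prob of the true class
    focal = (1.0 - log_p_t.exp()) ** gamma * (-log_p_t)         # down-weight easy samples
    return (sample_weights * focal).mean()
```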
- On the Connection between Pre-training Data Diversity and Fine-tuning Robustness [66.30369048726145]
We find that the primary factor influencing downstream effective robustness is data quantity.
We demonstrate our findings on pre-training distributions drawn from various natural and synthetic data sources.
arXiv Detail & Related papers (2023-07-24T05:36:19Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
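As background for the scaling study above, masked image modeling hides a large fraction of image patches and trains the model to reconstruct them. Below is a minimal sketch of the random patch-masking step only; the encoders, decoders, and losses of the specific MIM methods evaluated in that paper are not reproduced.

```python
# Minimal illustration of the masking step in masked image modeling (MIM):
# split an image batch into patches and hide a random subset of them.
import torch

def random_patch_mask(images: torch.Tensor, patch_size: int = 16, mask_ratio: float = 0.75):
    b, c, h, w = images.shape
    num_patches = (h // patch_size) * (w // patch_size)
    num_masked = int(mask_ratio * num_patches)
    # Randomly choose which patches to hide for each image in the batch.
    ids_shuffle = torch.rand(b, num_patches).argsort(dim=1)
    mask = torch.zeros(b, num_patches, dtype=torch.bool)
    mask[torch.arange(b).unsqueeze(1), ids_shuffle[:, :num_masked]] = True
    # Patchify to (B, N, C*P*P) and zero out the hidden patches.
    patches = images.unfold(2, patch_size, patch_size).unfold(3, patch_size, patch_size)
    patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(b, num_patches, -1)
    visible = patches.masked_fill(mask.unsqueeze(-1), 0.0)
    return visible, mask  # a MIM model would be trained to reconstruct the masked patches

visible_patches, mask = random_patch_mask(torch.randn(4, 3, 224, 224))
```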
- Incorporating Crowdsourced Annotator Distributions into Ensemble Modeling to Improve Classification Trustworthiness for Ancient Greek Papyri [3.870354915766567]
Two issues which complicate the problem on such datasets are class imbalance and ground-truth uncertainty in labeling.
The application of ensemble modeling to such datasets can help identify images where the ground-truth is questionable and quantify the trustworthiness of those samples.
arXiv Detail & Related papers (2022-10-28T19:39:14Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV).
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three datasets on image classification in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- GDC- Generalized Distribution Calibration for Few-Shot Learning [5.076419064097734]
Few-shot learning is an important problem in machine learning, as large labelled datasets take considerable time and effort to assemble.
Most few-shot learning algorithms require the design of sophisticated models and loss functions, which hampers interpretability.
We propose a Generalized sampling method that learns to estimate few-shot distributions for classification as weighted random variables of all large classes.
arXiv Detail & Related papers (2022-04-11T16:22:53Z)
- KNN-Diffusion: Image Generation via Large-Scale Retrieval [40.6656651653888]
Our diffusion-based model trains on images only, leveraging a joint text-image multi-modal metric for retrieval.
Learning to adapt enables several new capabilities.
Fine-tuning a trained model on new samples can be achieved by simply adding them to the retrieval table.
arXiv Detail & Related papers (2022-04-06T14:13:35Z)
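The retrieval table mentioned above is essentially an index of embeddings in a joint text-image space; adapting to new samples means appending their embeddings rather than retraining. A rough sketch of that idea follows (illustrative; the paper's actual index, metric, and diffusion model are not shown, and the embedding encoder is assumed).

```python
# Rough sketch of a retrieval table over joint text-image embeddings.
# "Fine-tuning" on new samples = appending their embeddings; no weight updates.
# Illustrative only; not the paper's retrieval or diffusion code.
import torch
import torch.nn.functional as F

class RetrievalTable:
    def __init__(self, dim: int):
        self.embeddings = torch.empty(0, dim)

    def add(self, new_embeddings: torch.Tensor):
        """Incorporate new samples by appending their embeddings to the table."""
        self.embeddings = torch.cat([self.embeddings, new_embeddings], dim=0)

    def knn(self, query: torch.Tensor, k: int = 5) -> torch.Tensor:
        """Return the k nearest entries (cosine similarity) to condition generation on."""
        sims = F.cosine_similarity(query.unsqueeze(0), self.embeddings, dim=1)
        return self.embeddings[sims.topk(k).indices]

# Hypothetical usage; random tensors stand in for a joint multi-modal encoder's outputs.
table = RetrievalTable(dim=512)
table.add(torch.randn(1000, 512))             # embeddings of the training images
neighbors = table.knn(torch.randn(512), k=5)  # neighbors the diffusion model conditions on
```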
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to equip deep neural networks with the ability to learn sample relationships within each mini-batch.
BatchFormer is applied into the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
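A minimal sketch of the batch-dimension idea summarized above: between the backbone and the classifier, a small transformer encoder treats the mini-batch of feature vectors as a sequence, so each sample's representation can attend to the others during training. Layer sizes here are illustrative, and the paper's shared-classifier training detail is omitted.

```python
# Minimal sketch of applying a transformer along the batch dimension so that
# samples in a mini-batch attend to each other. Configuration values are
# illustrative; the paper's shared-classifier training scheme is not shown.
import torch
import torch.nn as nn

class BatchTransformer(nn.Module):
    def __init__(self, feat_dim: int = 512, num_heads: int = 4, num_layers: int = 1):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=num_heads, batch_first=False)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, feat_dim). Add a dummy axis of size 1 so the mini-batch
        # itself becomes the sequence dimension: (seq=batch, batch=1, feat_dim).
        return self.encoder(features.unsqueeze(1)).squeeze(1)

# Hypothetical usage between a backbone and a classifier, applied during training.
feats = torch.randn(64, 512)                  # backbone features for a mini-batch of 64
logits = nn.Linear(512, 100)(BatchTransformer()(feats))
```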
- Self-Supervised Pre-Training for Transformer-Based Person Re-Identification [54.55281692768765]
Transformer-based supervised pre-training achieves great performance in person re-identification (ReID).
Due to the domain gap between ImageNet and ReID datasets, it usually needs a larger pre-training dataset to boost the performance.
This work aims to mitigate the gap between the pre-training and ReID datasets from the perspective of data and model structure.
arXiv Detail & Related papers (2021-11-23T18:59:08Z)
- Feature Generation for Long-tail Classification [36.186909933006675]
We show how to generate meaningful features by estimating the tail category's distribution.
We also present a qualitative analysis of generated features using t-SNE visualizations and analyze the nearest neighbors used to calibrate the tail class distributions.
arXiv Detail & Related papers (2021-11-10T21:34:29Z)
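To make the feature-generation idea above concrete: one simple way to "estimate the tail category's distribution" is to fit a Gaussian whose mean comes from the few available tail features and whose covariance is borrowed from similar, well-populated head classes, then sample synthetic features from it. This is an illustrative calibration-style sketch under those assumptions, not the paper's exact generator.

```python
# Illustrative sketch of generating synthetic features for a tail class by
# sampling from an estimated Gaussian: mean from the few tail features,
# covariance borrowed from similar head classes. Not the paper's exact method.
import torch

def generate_tail_features(tail_feats: torch.Tensor,
                           head_covs: list,
                           num_samples: int = 100) -> torch.Tensor:
    """tail_feats: (n_tail, d) with small n_tail; head_covs: covariances of similar head classes."""
    mean = tail_feats.mean(dim=0)
    cov = torch.stack(head_covs).mean(dim=0)      # borrow/average head-class covariance
    cov = cov + 1e-4 * torch.eye(cov.shape[0])    # regularize for numerical stability
    dist = torch.distributions.MultivariateNormal(mean, covariance_matrix=cov)
    return dist.sample((num_samples,))            # synthetic tail features

# Hypothetical usage with random stand-ins for real backbone features.
d = 64
tail = torch.randn(5, d)
head_covs = [torch.eye(d), 1.5 * torch.eye(d)]
synthetic = generate_tail_features(tail, head_covs)  # (100, d), used to rebalance the classifier
```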