Related papers: OTTER: Improving Zero-Shot Classification via Optimal Transport

URL: http://arxiv.org/abs/2404.08461v1
Date: Fri, 12 Apr 2024 13:18:47 GMT
Title: OTTER: Improving Zero-Shot Classification via Optimal Transport
Authors: Changho Shin, Jitian Zhao, Sonia Cromp, Harit Vishwakarma, Frederic Sala
Abstract summary: We introduce a simple and lightweight approach to adjust pretrained model predictions via optimal transport. We validate our method in a wide array of zero-shot image and text classification tasks, improving accuracy by 4.8% and 15.9% on average.
Score: 13.789436156370893
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Popular zero-shot models suffer due to artifacts inherited from pretraining. A particularly detrimental artifact, caused by unbalanced web-scale pretraining data, is mismatched label distribution. Existing approaches that seek to repair the label distribution are not suitable in zero-shot settings, as they have incompatible requirements such as access to labeled downstream task data or knowledge of the true label balance in the pretraining distribution. We sidestep these challenges and introduce a simple and lightweight approach to adjust pretrained model predictions via optimal transport. Our technique requires only an estimate of the label distribution of a downstream task. Theoretically, we characterize the improvement produced by our procedure under certain mild conditions and provide bounds on the error caused by misspecification. Empirically, we validate our method in a wide array of zero-shot image and text classification tasks, improving accuracy by 4.8% and 15.9% on average, and beating baselines like Prior Matching -- often by significant margins -- in 17 out of 21 datasets.
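To make the idea in the abstract concrete, the following is a minimal sketch of adjusting zero-shot predictions so that the predicted class frequencies move toward an estimated label distribution. It is not the authors' implementation. Two choices here are assumptions not stated in the abstract: the transport cost is taken to be the negative log of the model's predicted probability, and the transport problem is solved approximately with entropy-regularized Sinkhorn iterations rather than an exact solver. The function name `sinkhorn_adjust` and its parameters are hypothetical.

```python
def sinkhorn_adjust(probs, target_dist, eps=1.0, n_iters=500):
    """Relabel examples so class frequencies move toward target_dist.

    probs: list of rows, each holding K positive class probabilities.
    target_dist: K positive numbers summing to 1 (an estimate supplied by the user).
    Returns one predicted class index per example.
    """
    n, k = len(probs), len(target_dist)
    # Gibbs kernel exp(-C / eps) for the ASSUMED cost C = -log(probs).
    kernel = [[max(p, 1e-12) ** (1.0 / eps) for p in row] for row in probs]
    u = [1.0] * n
    v = [1.0] * k
    for _ in range(n_iters):
        # Row scaling: every example carries mass 1/n.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(k))
             for i in range(n)]
        # Column scaling: class j receives mass target_dist[j].
        v = [target_dist[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(k)]
    # Row-wise argmax of the transport plan u[i] * kernel[i][j] * v[j];
    # u[i] is constant within a row, so it does not affect the argmax.
    return [max(range(k), key=lambda j: kernel[i][j] * v[j]) for i in range(n)]


# Toy example: the model favors class 0 on every input,
# but the downstream task is assumed to be balanced.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.55, 0.45]]
naive = [max(range(2), key=lambda j: row[j]) for row in probs]
adjusted = sinkhorn_adjust(probs, [0.5, 0.5])
print(naive)     # [0, 0, 0, 0]
print(adjusted)  # [0, 0, 1, 1]
```

Because the entropic plan is soft, the hard labels obtained by the final argmax only approximately follow the target distribution; an exact linear-programming solver would enforce the marginals on the plan itself.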

Related papers

Practical estimation of the optimal classification error with soft labels and calibration [52.1410307583181]
We extend a previous work that utilizes soft labels for estimating the Bayes error, the optimal error rate. We tackle a more challenging problem setting: estimation with corrupted soft labels. Our method is instance-free, i.e., we do not assume access to any input instances.
arXiv Detail & Related papers (2025-05-27T06:04:57Z)
Early Stopping Against Label Noise Without Validation Data [54.27621957395026]
We propose a novel early stopping method called Label Wave, which does not require validation data for selecting the desired model. We show both the effectiveness of the Label Wave method across various settings and its capability to enhance the performance of existing methods for learning with noisy labels.
arXiv Detail & Related papers (2025-02-11T13:40:15Z)
Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting [55.361337202198925]
Vision-language models, such as CLIP, have shown impressive generalization capacities when using appropriate text descriptions. We propose a label-free prompt distribution learning and bias correction framework, dubbed Frolic, which boosts zero-shot performance without the need for labeled data.
arXiv Detail & Related papers (2024-10-25T04:00:45Z)
Efficient Online Set-valued Classification with Bandit Feedback [10.882001129426726]
We propose Bandit Class-specific Conformal Prediction (BCCP), offering coverage guarantees on a class-specific granularity. BCCP overcomes the challenges of sparsely labeled data in each iteration and generalizes the reliability and applicability of conformal prediction to online decision-making environments.
arXiv Detail & Related papers (2024-05-07T15:14:51Z)
Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper. Motivated by this, we propose to pursue the label distribution consistency between predicted and ground-truth label distributions. Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
Improving Zero-Shot Models with Label Distribution Priors [33.51714665243138]
We propose a new approach, CLIPPR, which adapts zero-shot models for regression and classification on unlabelled datasets. We demonstrate an improvement of 28% in mean absolute error on the UTK age regression task. We also present promising results for classification benchmarks, improving the classification accuracy on the ImageNet dataset by 2.83%, without using any labels.
arXiv Detail & Related papers (2022-12-01T18:59:03Z)
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning [46.95063831057502]
We propose FreeMatch to define and adjust the confidence threshold in a self-adaptive manner according to the model's learning status. FreeMatch achieves 5.78%, 13.59%, and 1.28% error rate reduction over the latest state-of-the-art method FlexMatch on CIFAR-10 with 1 label per class.
arXiv Detail & Related papers (2022-05-15T10:07:52Z)
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
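The thresholding idea in the ATC entry above can be sketched as follows. This is an illustrative sketch, not the paper's code. It assumes that the threshold is fit on labeled source data so that the share of source examples above the threshold equals the source accuracy, and that target accuracy is then estimated as the share of unlabeled target examples whose confidence exceeds the threshold. The function names and the toy numbers are hypothetical.

```python
def fit_atc_threshold(source_conf, source_correct):
    """Choose t so that the share of source confidences above t
    equals the source accuracy (ASSUMED fitting rule)."""
    n = len(source_conf)
    n_correct = sum(1 for c in source_correct if c)
    ranked = sorted(source_conf, reverse=True)
    if n_correct == 0:
        return ranked[0]          # no confidence is strictly above the maximum
    if n_correct == n:
        return float("-inf")      # every confidence is above the threshold
    # Midpoint between the n_correct-th and (n_correct + 1)-th highest scores.
    return 0.5 * (ranked[n_correct - 1] + ranked[n_correct])


def predict_accuracy(target_conf, threshold):
    """Estimated accuracy: share of target examples above the threshold."""
    return sum(1 for c in target_conf if c > threshold) / len(target_conf)


# Toy data: 3 of 5 labeled source predictions are correct.
source_conf = [0.95, 0.9, 0.7, 0.6, 0.4]
source_correct = [True, True, True, False, False]
t = fit_atc_threshold(source_conf, source_correct)   # about 0.65

# Unlabeled target confidences: 2 of 4 exceed the threshold.
print(predict_accuracy([0.99, 0.8, 0.5, 0.3], t))    # 0.5
```

The confidence score is treated as a given number per example; whether it is a maximum softmax probability or another score is left open here.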
Debiased Learning from Naturally Imbalanced Pseudo-Labels for Zero-Shot and Semi-Supervised Learning [27.770473405635585]
This work studies the bias issue of pseudo-labeling, a natural phenomenon that occurs widely but is often overlooked by prior research. We observe heavily long-tailed pseudo-labels when a semi-supervised learning model, FixMatch, predicts labels on the unlabeled set, even though the unlabeled data is curated to be balanced. Without intervention, the trained model inherits the bias from the pseudo-labels and ends up being sub-optimal.
arXiv Detail & Related papers (2022-01-05T07:40:24Z)
Distribution-free uncertainty quantification for classification under label shift [105.27463615756733]
We focus on uncertainty quantification (UQ) for classification problems via two avenues. We first argue that label shift hurts UQ, by showing degradation in coverage and calibration. We examine these techniques theoretically in a distribution-free framework and demonstrate their excellent practical performance.
arXiv Detail & Related papers (2021-03-04T20:51:03Z)
Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation [87.60688582088194]
We propose a novel Self-Supervised Noisy Label Learning method. Our method can easily achieve state-of-the-art results and surpass other methods by a very large margin.
arXiv Detail & Related papers (2021-02-23T10:51:45Z)
Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data [1.3535770763481902]
We propose a novel two-step framework for robust training with label noise. In the first step, we identify outliers (including the mislabeled samples) based on the update in the hypothesis space. In the second step, we propose different approaches to modifying the training data based on the identified outliers and a data augmentation technique.
arXiv Detail & Related papers (2020-09-30T12:33:25Z)
Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck. We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network. We show our methods leveraging only 20-30 labeled samples per class for each task for training and for validation can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.