Tolerant Algorithms for Learning with Arbitrary Covariate Shift
- URL: http://arxiv.org/abs/2406.02742v1
- Date: Tue, 4 Jun 2024 19:50:05 GMT
- Title: Tolerant Algorithms for Learning with Arbitrary Covariate Shift
- Authors: Surbhi Goel, Abhishek Shetty, Konstantinos Stavropoulos, Arsen Vasilyan
- Abstract summary: We study the problem of learning under arbitrary distribution shift, where the learner is trained on a labeled set from one distribution but evaluated on a different, potentially adversarially generated test distribution.
We focus on two frameworks: PQ learning [Goldwasser, A. Kalai, Y. Kalai, Montasser NeurIPS 2020], allowing abstention on adversarially generated parts of the test distribution, and TDS learning [Klivans, Stavropoulos, Vasilyan COLT 2024], permitting abstention on the entire test distribution if distribution shift is detected.
- Score: 18.37181965815327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning under arbitrary distribution shift, where the learner is trained on a labeled set from one distribution but evaluated on a different, potentially adversarially generated test distribution. We focus on two frameworks: PQ learning [Goldwasser, A. Kalai, Y. Kalai, Montasser NeurIPS 2020], allowing abstention on adversarially generated parts of the test distribution, and TDS learning [Klivans, Stavropoulos, Vasilyan COLT 2024], permitting abstention on the entire test distribution if distribution shift is detected. All prior known algorithms either rely on learning primitives that are computationally hard even for simple function classes, or end up abstaining entirely even in the presence of a tiny amount of distribution shift. We address both these challenges for natural function classes, including intersections of halfspaces and decision trees, and standard training distributions, including Gaussians. For PQ learning, we give efficient learning algorithms, while for TDS learning, our algorithms can tolerate moderate amounts of distribution shift. At the core of our approach is an improved analysis of spectral outlier-removal techniques from learning with nasty noise. Our analysis can (1) handle arbitrarily large fraction of outliers, which is crucial for handling arbitrary distribution shifts, and (2) obtain stronger bounds on polynomial moments of the distribution after outlier removal, yielding new insights into polynomial regression under distribution shifts. Lastly, our techniques lead to novel results for tolerant testable learning [Rubinfeld and Vasilyan STOC 2023], and learning with nasty noise.
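To make the spectral outlier-removal component concrete, the sketch below illustrates the classical filtering step from learning with nasty noise, assuming a standard Gaussian training marginal: while the empirical second-moment matrix has an unusually large eigenvalue, discard the points with the most energy along the offending direction. This is a minimal illustration, not the paper's algorithm; all thresholds and names are placeholder assumptions.
```python
# Minimal sketch of spectral outlier removal (illustrative, not the paper's algorithm).
# Assumes the clean training marginal is a standard Gaussian, so the empirical
# second-moment matrix of clean data should be close to the identity.
import numpy as np

def spectral_filter(X, slack=1.5, max_rounds=50):
    """Iteratively drop points with abnormally large energy along the top
    eigenvector of the empirical second-moment matrix."""
    X = np.asarray(X, dtype=float)
    keep = np.ones(len(X), dtype=bool)
    for _ in range(max_rounds):
        M = X[keep].T @ X[keep] / keep.sum()        # empirical second moment
        eigvals, eigvecs = np.linalg.eigh(M)
        top_val, top_vec = eigvals[-1], eigvecs[:, -1]
        if top_val <= slack:                        # spectrum consistent with N(0, I): stop
            break
        scores = (X @ top_vec) ** 2                 # energy along the offending direction
        cutoff = np.quantile(scores[keep], 0.99)    # drop the most extreme kept points
        keep &= scores < cutoff
    return keep

rng = np.random.default_rng(0)
clean = rng.standard_normal((900, 10))
outliers = 5.0 * rng.standard_normal((100, 10))     # heavy "shifted" component
mask = spectral_filter(np.vstack([clean, outliers]))
print(mask[:900].mean(), mask[900:].mean())         # most clean points kept, most outliers gone
```
The paper's contribution on this front is analytic rather than algorithmic: the filtering can be analyzed even when an arbitrarily large fraction of points is removed, and the polynomial moments of the surviving sample remain controlled.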
Related papers
- Efficient Discrepancy Testing for Learning with Distribution Shift [17.472049019016524]
We provide the first set of provably efficient algorithms for testing localized discrepancy distance.
Results imply a broad set of new, efficient learning algorithms in the recently introduced model of Testable Learning with Distribution Shift.
arXiv Detail & Related papers (2024-06-13T17:51:10Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
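As a toy illustration of estimating per-class feature distributions, the sketch below fits a Gaussian to each class's features and samples extra virtual features for a rare class; the Gaussian model and all names here are assumptions made for illustration and do not match the distribution family used by ProCo.
```python
# Toy sketch: estimate a per-class feature distribution and oversample a rare class.
# The Gaussian model is an illustrative assumption, not ProCo's actual choice.
import numpy as np

def fit_class_gaussians(feats, labels, reg=1e-4):
    """Estimate a (regularized) Gaussian per class in feature space."""
    stats = {}
    for c in np.unique(labels):
        Fc = feats[labels == c]
        cov = np.cov(Fc, rowvar=False) + reg * np.eye(feats.shape[1])
        stats[c] = (Fc.mean(axis=0), cov)
    return stats

def sample_virtual_features(stats, cls, n, rng):
    """Draw extra features for a rare class from its estimated distribution."""
    mean, cov = stats[cls]
    return rng.multivariate_normal(mean, cov, size=n)

rng = np.random.default_rng(0)
feats = rng.standard_normal((500, 16))
labels = rng.choice(5, size=500, p=[0.6, 0.2, 0.1, 0.05, 0.05])   # long-tailed labels
stats = fit_class_gaussians(feats, labels)
virtual = sample_virtual_features(stats, 4, 100, rng)             # oversample the rarest class
print(virtual.shape)
```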
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP.
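Since GDA with a shared covariance yields a closed-form linear classifier, a training-free baseline on frozen features takes only a few lines; the sketch below uses synthetic features as stand-ins for CLIP embeddings (an assumption) and is not the paper's exact estimator.
```python
# Minimal sketch of Gaussian Discriminant Analysis with a shared covariance,
# applied to frozen features (synthetic stand-ins for CLIP embeddings).
import numpy as np

def gda_fit(feats, labels, eps=1e-3):
    """Closed-form GDA with a shared covariance: one linear score per class."""
    classes = np.unique(labels)
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    centered = feats - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(feats) + eps * np.eye(feats.shape[1])
    prec = np.linalg.inv(cov)
    W = means @ prec                                        # class weight vectors
    b = -0.5 * np.einsum('kd,kd->k', W, means)              # -0.5 * mu_k^T Sigma^-1 mu_k
    b += np.log(np.bincount(labels) / len(labels))          # class priors
    return W, b, classes

def gda_predict(feats, W, b, classes):
    return classes[np.argmax(feats @ W.T + b, axis=1)]

rng = np.random.default_rng(0)
class_means = 2.0 * rng.standard_normal((3, 32))             # stand-ins for class prototypes
y = rng.integers(0, 3, size=300)
X = class_means[y] + rng.standard_normal((300, 32))          # stand-ins for image features
W, b, classes = gda_fit(X, y)
print((gda_predict(X, W, b, classes) == y).mean())           # accuracy of the baseline
```
The point of the design is that nothing is fine-tuned: the class means, shared covariance, and priors fully determine the linear weights.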
arXiv Detail & Related papers (2024-02-06T15:45:27Z)
- An Efficient Tester-Learner for Halfspaces [13.13131953359806]
We give the first efficient algorithm for learning halfspaces in the testable learning model recently defined by Rubinfeld and Vasilyan (2023).
In this model, a learner certifies that the accuracy of its output hypothesis is near optimal whenever the training set passes a test.
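The test-then-learn pattern can be sketched as follows, assuming a standard Gaussian target marginal: accept only if low-degree empirical moments look Gaussian, then output a halfspace; otherwise reject and certify nothing. Both the moment check and the Chow-parameter learner below are illustrative simplifications, not the paper's actual tester or learner.
```python
# Skeleton of the test-then-learn pattern for halfspaces under a Gaussian target
# marginal (illustrative only; the paper's tester and learner are more refined).
import numpy as np

def moments_close_to_gaussian(X, tol=0.2):
    """Accept only if empirical first/second moments look like N(0, I)."""
    d = X.shape[1]
    mean_ok = np.abs(X.mean(axis=0)).max() < tol
    cov_ok = np.abs(np.cov(X, rowvar=False) - np.eye(d)).max() < tol
    return mean_ok and cov_ok

def learn_halfspace(X, y):
    """Crude learner: normalized average of label-weighted examples (Chow parameters)."""
    w = (y[:, None] * X).mean(axis=0)
    return w / np.linalg.norm(w)

def tester_learner(X, y):
    if not moments_close_to_gaussian(X):
        return None                          # reject: no guarantee is certified
    return learn_halfspace(X, y)             # accept: output a hypothesis

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 5))
w_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0]); w_true /= np.linalg.norm(w_true)
y = np.sign(X @ w_true)
w_hat = tester_learner(X, y)
print(None if w_hat is None else float(w_hat @ w_true))      # close to 1 when the test passes
```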
arXiv Detail & Related papers (2023-02-28T18:51:55Z)
- A Moment-Matching Approach to Testable Learning and a New Characterization of Rademacher Complexity [15.746613321517282]
We give a powerful new approach for developing algorithms for testable learning using tools from moment matching and metric distances in probability.
Surprisingly, we show that the information-theoretic complexity of testable learning is tightly characterized by the Rademacher complexity of the concept class.
arXiv Detail & Related papers (2022-11-23T21:29:51Z)
- Learnable Distribution Calibration for Few-Shot Class-Incremental Learning [122.2241120474278]
Few-shot class-incremental learning (FSCIL) faces challenges of memorizing old class distributions and estimating new class distributions given few training samples.
We propose a learnable distribution calibration (LDC) approach, with the aim to systematically solve these two challenges using a unified framework.
arXiv Detail & Related papers (2022-10-01T09:40:26Z)
- Self-Supervised Learning by Estimating Twin Class Distributions [26.7828253129684]
We present TWIST, a novel self-supervised representation learning method by classifying large-scale unlabeled datasets in an end-to-end way.
We employ a siamese network terminated by a softmax operation to produce twin class distributions of two augmented images.
Specifically, we minimize the entropy of the distribution for each sample to make the class prediction for each sample and maximize the entropy of the mean distribution to make the predictions of different samples diverse.
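The two entropy terms translate directly into a loss: minimize the per-sample entropy (confident predictions) while maximizing the entropy of the batch-mean distribution (diverse predictions), alongside a consistency term between the twin distributions. The numpy sketch below is illustrative; the names and the equal weighting of terms are assumptions rather than the paper's exact objective.
```python
# Sketch of a TWIST-style objective: low entropy per sample, high entropy of the
# mean distribution, plus consistency between the twin class distributions.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(p, axis=-1, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=axis)

def cross_entropy(p, q, eps=1e-12):
    return -(p * np.log(q + eps)).sum(axis=-1).mean()

def twist_loss(logits_a, logits_b):
    """Consistency between twin distributions, plus sharpening and diversity terms."""
    p, q = softmax(logits_a), softmax(logits_b)
    consistency = 0.5 * (cross_entropy(p, q) + cross_entropy(q, p))
    sharpen = 0.5 * (entropy(p).mean() + entropy(q).mean())                # minimized
    diversify = 0.5 * (entropy(p.mean(axis=0)) + entropy(q.mean(axis=0)))  # maximized
    return consistency + sharpen - diversify

rng = np.random.default_rng(0)
logits_a = rng.standard_normal((128, 10))                    # view 1 of a batch
logits_b = logits_a + 0.1 * rng.standard_normal((128, 10))   # view 2 (perturbed)
print(twist_loss(logits_a, logits_b))
```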
arXiv Detail & Related papers (2021-10-14T14:39:39Z)
- A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise [55.45544638410858]
We study the problem of PAC learning homogeneous halfspaces in the presence of Tsybakov noise.
Our algorithm learns the true halfspace within any accuracy $\epsilon$.
arXiv Detail & Related papers (2020-10-04T22:19:06Z)
- Distributional Reinforcement Learning via Moment Matching [54.16108052278444]
We formulate a method that learns a finite set of statistics from each return distribution via neural networks.
Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target.
Experiments on the suite of Atari games show that our method outperforms the standard distributional RL baselines.
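Such moment matching is commonly implemented as an MMD loss between predicted return particles and the particles of the distributional Bellman target; the sketch below shows that MMD term with a Gaussian kernel (the kernel and bandwidth are illustrative assumptions, not necessarily the paper's choices).
```python
# Minimal sketch of an MMD loss between predicted return particles and
# Bellman-target particles (Gaussian kernel; bandwidth is an arbitrary choice).
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2 * bandwidth ** 2))

def mmd2(pred, target, bandwidth=1.0):
    """Squared MMD between two 1-D particle sets."""
    k_pp = gaussian_kernel(pred, pred, bandwidth).mean()
    k_tt = gaussian_kernel(target, target, bandwidth).mean()
    k_pt = gaussian_kernel(pred, target, bandwidth).mean()
    return k_pp + k_tt - 2 * k_pt

rng = np.random.default_rng(0)
pred_particles = rng.normal(0.0, 1.0, size=32)              # network outputs for (s, a)
reward, gamma = 1.0, 0.99
next_particles = rng.normal(0.2, 1.0, size=32)              # particles at the next state-action
target_particles = reward + gamma * next_particles          # distributional Bellman target
print(mmd2(pred_particles, target_particles))
```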
arXiv Detail & Related papers (2020-07-24T05:18:17Z)
- Estimates on Learning Rates for Multi-Penalty Distribution Regression [5.999239529678357]
We study a multi-penalty regularization algorithm for distribution regression under the framework of learning theory.
We embed the distributions into a reproducing kernel Hilbert space $\mathcal{H}_K$ associated with a Mercer kernel $K$ via the mean embedding technique.
The work also derives learning rates for distribution regression in the nonstandard setting $f_\rho \notin \mathcal{H}_K$, which is not explored in the existing literature.
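The mean embedding step represents each distribution (given through a sample) by the average of its kernel features, so inner products between embeddings reduce to averages of pairwise kernel values; a small sketch with a Gaussian kernel, used here only as an example of a Mercer kernel:
```python
# Sketch of kernel mean embeddings: each distribution (given by a sample) is
# represented by the mean of its kernel features, and inner products between
# embeddings reduce to averages of pairwise kernel values.
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def mean_embedding_inner(sample_a, sample_b, bandwidth=1.0):
    """<mu_a, mu_b>_H = average of k(x, y) over x in sample_a, y in sample_b."""
    return gaussian_kernel(sample_a, sample_b, bandwidth).mean()

rng = np.random.default_rng(0)
dist_a = rng.normal(0.0, 1.0, size=(200, 3))      # sample from the first distribution
dist_b = rng.normal(0.5, 1.0, size=(200, 3))      # sample from the second distribution
print(mean_embedding_inner(dist_a, dist_b))        # one entry of the second-stage Gram matrix
```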
arXiv Detail & Related papers (2020-06-16T09:31:58Z)
- Meta-learning with Stochastic Linear Bandits [120.43000970418939]
We consider a class of bandit algorithms that implement a regularized version of the well-known OFUL algorithm, where the regularization is a squared Euclidean distance to a bias vector.
We show both theoretically and experimentally, that when the number of tasks grows and the variance of the task-distribution is small, our strategies have a significant advantage over learning the tasks in isolation.
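The regularized estimator behind this strategy is ridge regression shrunk toward the bias vector $h$ rather than toward zero, i.e. $\hat{\theta} = \arg\min_\theta \|X\theta - y\|^2 + \lambda\|\theta - h\|^2$; the sketch below shows the closed form, omitting OFUL's confidence-bonus exploration.
```python
# Closed-form ridge estimate shrunk toward a bias vector h (the regularizer is
# a squared Euclidean distance to h); OFUL's exploration bonus is omitted here.
import numpy as np

def biased_ridge(X, y, h, lam=1.0):
    """argmin_theta ||X theta - y||^2 + lam * ||theta - h||^2"""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y + lam * h)

rng = np.random.default_rng(0)
h = np.array([1.0, -0.5, 0.3])                                           # learned bias vector
theta_task = h + 0.05 * rng.standard_normal(3)                           # task close to the bias
X = rng.standard_normal((20, 3))                                         # few observations per task
y = X @ theta_task + 0.1 * rng.standard_normal(20)
print(biased_ridge(X, y, h, lam=5.0))                                    # estimate pulled toward h
```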
arXiv Detail & Related papers (2020-05-18T08:41:39Z)