PAC-Learning for Strategic Classification
- URL: http://arxiv.org/abs/2012.03310v3
- Date: Mon, 15 Feb 2021 22:05:13 GMT
- Title: PAC-Learning for Strategic Classification
- Authors: Ravi Sundaram, Anil Vullikanti, Haifeng Xu, Fan Yao
- Abstract summary: We introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup.
We fully characterize: (1) the statistical learnability of linear classifiers by pinning down its SVC; (2) its computational tractability by pinning down the complexity of the empirical risk minimization problem.
- Score: 18.60310393343275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of strategic or adversarial manipulation of testing data to fool a
classifier has attracted much recent attention. Most previous works have
focused on two extreme situations where any testing data point either is
completely adversarial or always equally prefers the positive label. In this
paper, we generalize both of these through a unified framework for strategic
classification, and introduce the notion of strategic VC-dimension (SVC) to
capture the PAC-learnability in our general strategic setup. SVC provably
generalizes the recent concept of adversarial VC-dimension (AVC) introduced by
Cullina et al. arXiv:1806.01471. We instantiate our framework for the
fundamental strategic linear classification problem. We fully characterize: (1)
the statistical learnability of linear classifiers by pinning down its SVC; (2)
its computational tractability by pinning down the complexity of the empirical
risk minimization problem. Interestingly, the SVC of linear classifiers is
always upper bounded by its standard VC-dimension. This characterization also
strictly generalizes the AVC bound for linear classifiers in arXiv:1806.01471.
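The strategic setup described above, where each test point prefers the positive label and will manipulate its features at some cost, can be sketched for the linear case. The cost model (Euclidean distance), the budget parameter, and all names below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

# Hypothetical sketch of strategic manipulation against a known linear
# classifier h(x) = sign(w.x + b): an agent that is truthfully classified
# negative moves just across the decision boundary, provided the movement
# cost (Euclidean distance here) stays within its manipulation budget.

def best_response(x, w, b, budget):
    """Return the feature vector the agent reports to the classifier."""
    score = np.dot(w, x) + b
    if score >= 0:
        return x  # already classified positive; no incentive to move
    # Minimal Euclidean move onto the boundary has length |score| / ||w||
    dist = -score / np.linalg.norm(w)
    if dist <= budget:
        eps = 1e-9  # nudge slightly past the boundary
        return x + (dist + eps) * w / np.linalg.norm(w)
    return x  # manipulation too costly; report truthfully

w, b = np.array([1.0, 0.0]), -1.0
x = np.array([0.5, 0.0])          # truthfully negative (score = -0.5)
x_reported = best_response(x, w, b, budget=1.0)
print(np.dot(w, x_reported) + b >= 0)  # agent crosses the boundary: True
```

A learner evaluated on such best-responding points faces a different effective hypothesis class than in the standard PAC model, which is what the SVC notion is designed to measure.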
Related papers
- Strategic Classification with Non-Linear Classifiers [13.175123810033124]
We show how strategic user behavior manifests under non-linear classifiers. A key finding is that universal approximators are no longer universal once the environment is strategic.
arXiv Detail & Related papers (2025-05-29T13:40:03Z)
- A packing lemma for VCN${}_k$-dimension and learning high-dimensional data [1.6114012813668932]
We prove that non-agnostic high-arity PAC learnability implies a high-arity version of the Haussler packing property. This is done by obtaining direct proofs that classic PAC learnability implies the classic Haussler packing property.
arXiv Detail & Related papers (2025-05-21T16:03:12Z)
- Generalization Analysis for Supervised Contrastive Representation Learning under Non-IID Settings [8.732260277121547]
We provide a generalization analysis for the contrastive representation learning framework under non-i.i.d. settings. We derive bounds indicating that the required number of samples in each class scales with the logarithm of the covering number of the class of learnable representations associated with that class. We then apply our main results to derive excess risk bounds for common function classes such as linear maps and neural networks.
arXiv Detail & Related papers (2025-05-08T04:26:41Z)
- Agnostic Smoothed Online Learning [5.167069404528051]
We propose an algorithm to guarantee sublinear regret for smoothed online learning without prior knowledge of $\mu$.
R-Cover has adaptive regret $\tilde{O}(\sqrt{dT/\sigma})$ for function classes with dimension $d$, which is optimal up to logarithmic factors.
arXiv Detail & Related papers (2024-10-07T15:25:21Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning.
We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL)
Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes in generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel re-ranking technique, which has a lower time upper bound and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Coping with Change: Learning Invariant and Minimum Sufficient Representations for Fine-Grained Visual Categorization [26.254072665916155]
Fine-grained visual categorization (FGVC) is a challenging task due to similar visual appearances between various species.
Previous studies assume that the training and test data have the same underlying distributions, and that features extracted by modern backbone architectures remain discriminative and generalize well to unseen test data.
We combine the merits of invariant risk minimization (IRM) and information bottleneck (IB) principle to learn invariant and minimum sufficient (IMS) representations for FGVC.
arXiv Detail & Related papers (2023-06-08T02:45:15Z)
- Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data.
Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories.
One effective approach to GCD is applying self-supervised learning to learn discriminative representations for unlabeled data.
We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
- VC Theoretical Explanation of Double Descent [1.52292571922932]
This paper presents VC-theoretical analysis of double descent and shows that it can be fully explained by classical VC generalization bounds.
We illustrate an application of analytic VC-bounds for modeling double descent for classification problems, using empirical results for several learning methods.
arXiv Detail & Related papers (2022-05-31T05:50:02Z)
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion [3.157455635899373]
We significantly improve the generalization bounds for VC classes by using two main ideas.
First, we consider the hypergeometric inversion tail to obtain a very tight non-uniform distribution-independent risk upper bound for VC classes.
Second, we optimize the ghost sample trick to obtain a further non-negligible gain.
arXiv Detail & Related papers (2021-10-29T19:50:34Z)
- Highly Efficient Representation and Active Learning Framework for Imbalanced Data and its Application to COVID-19 X-Ray Classification [0.7829352305480284]
We propose a highly data-efficient classification and active learning framework for classifying chest X-rays.
It is based on (1) unsupervised representation learning of a Convolutional Neural Network and (2) the Gaussian Process method.
We demonstrate that only $\sim 10\%$ of the labeled data is needed to reach the accuracy obtained by training on all available labels.
arXiv Detail & Related papers (2021-02-25T02:48:59Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.