PAC-Learning for Strategic Classification
- URL: http://arxiv.org/abs/2012.03310v3
- Date: Mon, 15 Feb 2021 22:05:13 GMT
- Title: PAC-Learning for Strategic Classification
- Authors: Ravi Sundaram, Anil Vullikanti, Haifeng Xu, Fan Yao
- Abstract summary: We introduce the notion of strategic VC-dimension (SVC) to capture the PAC-learnability in our general strategic setup.
We fully characterize: (1) the statistical learnability of linear classifiers by pinning down its SVC; (2) its computational tractability by pinning down the complexity of the empirical risk minimization problem.
- Score: 18.60310393343275
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of strategic or adversarial manipulation of testing data to fool a
classifier has attracted much recent attention. Most previous works have
focused on two extreme situations where any testing data point either is
completely adversarial or always equally prefers the positive label. In this
paper, we generalize both of these through a unified framework for strategic
classification, and introduce the notion of strategic VC-dimension (SVC) to
capture the PAC-learnability in our general strategic setup. SVC provably
generalizes the recent concept of adversarial VC-dimension (AVC) introduced by
Cullina et al. arXiv:1806.01471. We instantiate our framework for the
fundamental strategic linear classification problem. We fully characterize: (1)
the statistical learnability of linear classifiers by pinning down its SVC; (2)
its computational tractability by pinning down the complexity of the empirical
risk minimization problem. Interestingly, the SVC of linear classifiers is
always upper bounded by its standard VC-dimension. This characterization also
strictly generalizes the AVC bound for linear classifiers in arXiv:1806.01471.
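The strategic setup described above, where each test point prefers the positive label and will manipulate its features at some cost, can be sketched for the linear case. The cost model (Euclidean distance), the budget parameter, and all names below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

# Hypothetical sketch of strategic manipulation against a known linear
# classifier h(x) = sign(w.x + b): an agent that is truthfully classified
# negative moves just across the decision boundary, provided the movement
# cost (Euclidean distance here) stays within its manipulation budget.

def best_response(x, w, b, budget):
    """Return the feature vector the agent reports to the classifier."""
    score = np.dot(w, x) + b
    if score >= 0:
        return x  # already classified positive; no incentive to move
    # Minimal Euclidean move onto the boundary has length |score| / ||w||
    dist = -score / np.linalg.norm(w)
    if dist <= budget:
        eps = 1e-9  # nudge slightly past the boundary
        return x + (dist + eps) * w / np.linalg.norm(w)
    return x  # manipulation too costly; report truthfully

w, b = np.array([1.0, 0.0]), -1.0
x = np.array([0.5, 0.0])          # truthfully negative (score = -0.5)
x_reported = best_response(x, w, b, budget=1.0)
print(np.dot(w, x_reported) + b >= 0)  # agent crosses the boundary: True
```

A learner evaluated on such best-responding points faces a different effective hypothesis class than in the standard PAC model, which is what the SVC notion is designed to measure.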
Related papers
- Strategic Classification with Non-Linear Classifiers [13.175123810033124]
We show how strategic user behavior manifests under non-linear classifiers. A key finding is that universal approximators are no longer universal once the environment is strategic.
arXiv Detail & Related papers (2025-05-29T13:40:03Z)
- A packing lemma for VCN${}_k$-dimension and learning high-dimensional data [1.6114012813668932]
We prove that non-agnostic high-arity PAC learnability implies a high-arity version of the Haussler packing property. This is done by obtaining direct proofs that classic PAC learnability implies the classic Haussler packing property.
arXiv Detail & Related papers (2025-05-21T16:03:12Z)
- Generalization Analysis for Supervised Contrastive Representation Learning under Non-IID Settings [8.732260277121547]
We provide a generalization analysis for the contrastive representation learning framework under non-i.i.d. settings. We derive bounds indicating that the required number of samples in each class scales with the logarithm of the covering number of the class of learnable representations associated with that class. We then apply our main results to derive excess risk bounds for common function classes such as linear maps and neural networks.
arXiv Detail & Related papers (2025-05-08T04:26:41Z)
- Agnostic Smoothed Online Learning [5.167069404528051]
We propose an algorithm to guarantee sublinear regret for smoothed online learning without prior knowledge of $\mu$.
R-Cover has adaptive regret $\tilde{O}(\sqrt{dT/\sigma})$ for function classes with dimension $d$, which is optimal up to logarithmic factors.
arXiv Detail & Related papers (2024-10-07T15:25:21Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery [56.172872410834664]
Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning.
We propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL)
Our method outperforms state-of-the-art models by a large margin on both seen and unseen classes in generic image recognition.
arXiv Detail & Related papers (2024-01-24T09:39:45Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel re-ranking technique, which has a lower time upper bound and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
- Coping with Change: Learning Invariant and Minimum Sufficient Representations for Fine-Grained Visual Categorization [26.254072665916155]
Fine-grained visual categorization (FGVC) is a challenging task due to similar visual appearances between various species.
Previous studies assume that the training and test data have the same underlying distributions, and that features extracted by modern backbone architectures remain discriminative and generalize well to unseen test data.
We combine the merits of invariant risk minimization (IRM) and information bottleneck (IB) principle to learn invariant and minimum sufficient (IMS) representations for FGVC.
arXiv Detail & Related papers (2023-06-08T02:45:15Z)
- Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data.
Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories.
One effective approach to GCD is applying self-supervised learning to learn discriminative representations for unlabeled data.
We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
- VC Theoretical Explanation of Double Descent [1.52292571922932]
This paper presents VC-theoretical analysis of double descent and shows that it can be fully explained by classical VC generalization bounds.
We illustrate an application of analytic VC-bounds for modeling double descent for classification problems, using empirical results for several learning methods.
arXiv Detail & Related papers (2022-05-31T05:50:02Z)
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- Improving Generalization Bounds for VC Classes Using the Hypergeometric Tail Inversion [3.157455635899373]
We significantly improve the generalization bounds for VC classes by using two main ideas.
First, we consider the hypergeometric inversion tail to obtain a very tight non-uniform distribution-independent risk upper bound for VC classes.
Second, we optimize the ghost sample trick to obtain a further non-negligible gain.
arXiv Detail & Related papers (2021-10-29T19:50:34Z)
- Highly Efficient Representation and Active Learning Framework for Imbalanced Data and its Application to COVID-19 X-Ray Classification [0.7829352305480284]
We propose a highly data-efficient classification and active learning framework for classifying chest X-rays.
It is based on (1) unsupervised representation learning of a Convolutional Neural Network and (2) the Gaussian Process method.
We demonstrate that only $\sim 10\%$ of the labeled data is needed to reach the accuracy obtained by training on all available labels.
arXiv Detail & Related papers (2021-02-25T02:48:59Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.