The Active and Noise-Tolerant Strategic Perceptron
- URL: http://arxiv.org/abs/2512.01783v2
- Date: Tue, 02 Dec 2025 03:18:15 GMT
- Title: The Active and Noise-Tolerant Strategic Perceptron
- Authors: Maria-Florina Balcan, Hedyeh Beyhaghi,
- Abstract summary: We show that a modified version of the Active Perceptron algorithm [DKM05,YZ17] achieves excess error $$ using only $tildeO(d ln frac1)$ label queries.<n>The algorithm is computationally efficient and, under these distributional assumptions, requires substantially fewer label queries than prior work on strategic Perceptron.
- Score: 18.37713639105532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We initiate the study of active learning algorithms for classifying strategic agents. Active learning is a well-established framework in machine learning in which the learner selectively queries labels, often achieving substantially higher accuracy and efficiency than classical supervised methods-especially in settings where labeling is costly or time-consuming, such as hiring, admissions, and loan decisions. Strategic classification, however, addresses scenarios where agents modify their features to obtain more favorable outcomes, resulting in observed data that is not truthful. Such manipulation introduces challenges beyond those in learning from clean data. Our goal is to design active and noise-tolerant algorithms that remain effective in strategic environments-algorithms that classify strategic agents accurately while issuing as few label requests as possible. The central difficulty is to simultaneously account for strategic manipulation and preserve the efficiency gains of active learning. Our main result is an algorithm for actively learning linear separators in the strategic setting that preserves the exponential improvement in label complexity over passive learning previously obtained only in the non-strategic case. Specifically, for data drawn uniformly from the unit sphere, we show that a modified version of the Active Perceptron algorithm [DKM05,YZ17] achieves excess error $ε$ using only $\tilde{O}(d \ln \frac{1}ε)$ label queries and incurs at most $\tilde{O}(d \ln \frac{1}ε)$ additional mistakes relative to the optimal classifier, even in the nonrealizable case, when a $\tildeΩ(ε)$ fraction of inputs have inconsistent labels with the optimal classifier. The algorithm is computationally efficient and, under these distributional assumptions, requires substantially fewer label queries than prior work on strategic Perceptron [ABBN21].
Related papers
- GLiClass: Generalist Lightweight Model for Sequence Classification Tasks [49.2639069781367]
We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks.<n>Our approach achieves strong accuracy and efficiency comparable to embedding-based methods, while maintaining the flexibility needed for zero-shot and few-shot learning scenarios.
arXiv Detail & Related papers (2025-08-11T06:22:25Z) - Probably Approximately Precision and Recall Learning [60.00180898830079]
A key challenge in machine learning is the prevalence of one-sided feedback.<n>We introduce a Probably Approximately Correct (PAC) framework in which hypotheses are set functions that map each input to a set of labels.<n>We develop new algorithms that learn from positive data alone, achieving optimal sample complexity in the realizable case.
arXiv Detail & Related papers (2024-11-20T04:21:07Z) - Minimax Group Fairness in Strategic Classification [8.250258160056514]
In strategic classification, agents manipulate their features, at a cost, to receive a positive classification outcome from the learner's classifier.
We consider learning objectives that have group fairness guarantees in addition to accuracy guarantees.
We formalize a fairness-aware Stackelberg game between a population of agents consisting of several groups, with each group having its own cost function.
arXiv Detail & Related papers (2024-10-03T14:22:55Z) - Unleashing the Potential of Regularization Strategies in Learning with
Noisy Labels [65.92994348757743]
We demonstrate that a simple baseline using cross-entropy loss, combined with widely used regularization strategies can outperform state-of-the-art methods.
Our findings suggest that employing a combination of regularization strategies can be more effective than intricate algorithms in tackling the challenges of learning with noisy labels.
arXiv Detail & Related papers (2023-07-11T05:58:20Z) - Algorithm Selection for Deep Active Learning with Imbalanced Datasets [11.902019233549474]
Active learning aims to reduce the number of labeled examples needed to train deep networks.
It is difficult to know in advance which active learning strategy will perform well or best in a given application.
We propose the first adaptive algorithm selection strategy for deep active learning.
arXiv Detail & Related papers (2023-02-14T19:59:49Z) - An Efficient Active Learning Pipeline for Legal Text Classification [2.462514989381979]
We propose a pipeline for effectively using active learning with pre-trained language models in the legal domain.
We use knowledge distillation to guide the model's embeddings to a semantically meaningful space.
Our experiments on Contract-NLI, adapted to the classification task, and LEDGAR benchmarks show that our approach outperforms standard AL strategies.
arXiv Detail & Related papers (2022-11-15T13:07:02Z) - Efficient Active Learning with Abstention [12.315392649501101]
We develop the first computationally efficient active learning algorithm with abstention.
A key feature of the algorithm is that it avoids the undesirable "noise-seeking" behavior often seen in active learning.
arXiv Detail & Related papers (2022-03-31T18:34:57Z) - Low Budget Active Learning via Wasserstein Distance: An Integer
Programming Approach [81.19737119343438]
Active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label.
We propose a new integer optimization problem for selecting a core set that minimizes the discrete Wasserstein distance from the unlabeled pool.
Our strategy requires high-quality latent features which we obtain by unsupervised learning on the unlabeled pool.
arXiv Detail & Related papers (2021-06-05T21:25:03Z) - Semi-supervised Batch Active Learning via Bilevel Optimization [89.37476066973336]
We formulate our approach as a data summarization problem via bilevel optimization.
We show that our method is highly effective in keyword detection tasks in the regime when only few labeled samples are available.
arXiv Detail & Related papers (2020-10-19T16:53:24Z) - Meta-learning with Stochastic Linear Bandits [120.43000970418939]
We consider a class of bandit algorithms that implement a regularized version of the well-known OFUL algorithm, where the regularization is a square euclidean distance to a bias vector.
We show both theoretically and experimentally, that when the number of tasks grows and the variance of the task-distribution is small, our strategies have a significant advantage over learning the tasks in isolation.
arXiv Detail & Related papers (2020-05-18T08:41:39Z) - Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel framework of classifier with flexibility on the model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z) - Fase-AL -- Adaptation of Fast Adaptive Stacking of Ensembles for
Supporting Active Learning [0.0]
This work presents the FASE-AL algorithm which induces classification models with non-labeled instances using Active Learning.
The algorithm achieves promising results in terms of the percentage of correctly classified instances.
arXiv Detail & Related papers (2020-01-30T17:25:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.