Attack-Resistant Uniform Fairness for Linear and Smooth Contextual Bandits
- URL: http://arxiv.org/abs/2602.04125v1
- Date: Wed, 04 Feb 2026 01:37:47 GMT
- Title: Attack-Resistant Uniform Fairness for Linear and Smooth Contextual Bandits
- Authors: Qingwen Zhang, Wenjia Wang
- Abstract summary: We develop novel algorithms that achieve minimax-optimal regret for both linear and smooth reward functions. We show that an adversary with a minimal $\tilde{O}(1)$ budget can not only degrade overall performance as in traditional attacks, but also selectively induce insidious fairness-specific failures.
- Score: 21.090584232258056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern systems, such as digital platforms and service systems, increasingly rely on contextual bandits for online decision-making; however, their deployment can inadvertently create unfair exposure among arms, undermining long-term platform sustainability and supplier trust. This paper studies the contextual bandit problem under a uniform $(1-\delta)$-fairness constraint, and addresses its unique vulnerabilities to strategic manipulation. The fairness constraint ensures that preferential treatment is strictly justified by an arm's actual reward across all contexts and time horizons, using uniformity to prevent statistical loopholes. We develop novel algorithms that achieve (nearly) minimax-optimal regret for both linear and smooth reward functions, while maintaining strong $(1-\tilde{O}(1/T))$-fairness guarantees, and further characterize the theoretically inherent yet asymptotically marginal "price of fairness". However, we reveal that such merit-based fairness becomes uniquely susceptible to signal manipulation. We show that an adversary with a minimal $\tilde{O}(1)$ budget can not only degrade overall performance as in traditional attacks, but also selectively induce insidious fairness-specific failures while leaving conspicuous regret measures largely unaffected. To counter this, we design robust variants incorporating corruption-adaptive exploration and error-compensated thresholding. Our approach yields the first minimax-optimal regret bounds under $C$-budgeted attack while preserving $(1-\tilde{O}(1/T))$-fairness. Numerical experiments and a real-world case demonstrate that our algorithms sustain both fairness and efficiency.
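The abstract's uniform-fairness idea (an arm is treated preferentially only when its superiority is statistically justified) can be illustrated with a minimal sketch. This is not the paper's algorithm: the elimination-style selection rule, the names (`fair_select`, `theta_hat`, `V_inv`, `beta`), and the uniform randomization over undominated arms are all illustrative assumptions.

```python
import numpy as np

def fair_select(theta_hat, V_inv, contexts, beta, rng):
    """One round of a fairness-aware linear contextual bandit (sketch).

    Keeps every arm whose upper confidence bound reaches the best lower
    confidence bound -- arms whose inferiority is not yet statistically
    justified -- and randomizes uniformly over that set, so no arm is
    disfavored without evidence.
    """
    x = np.asarray(contexts)                     # (K, d): one feature row per arm
    means = x @ theta_hat                        # point estimates of arm rewards
    widths = beta * np.sqrt(np.einsum("kd,de,ke->k", x, V_inv, x))
    ucb, lcb = means + widths, means - widths
    eligible = np.flatnonzero(ucb >= lcb.max())  # not provably worse than the best
    return rng.choice(eligible)                  # uniform over undominated arms

rng = np.random.default_rng(0)
theta_hat = np.array([1.0, 0.0])                 # current ridge estimate (assumed)
V_inv = np.eye(2)                                # inverse design matrix (assumed)
contexts = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
arm = fair_select(theta_hat, V_inv, contexts, beta=0.5, rng=rng)
```

Here the third arm's confidence interval lies entirely below the best lower bound, so it is excluded, while the statistically indistinguishable first two arms share exposure uniformly.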
Related papers
- Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [70.38810219913593]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function. GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency. We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $\mathcal{O}(1)$ time and space complexities per round.
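The constant-per-round flavor of such one-pass updates can be sketched in the plain linear special case, where the Sherman-Morrison identity maintains the inverse design matrix incrementally. This is an illustration only, not the GLB paper's algorithm (which handles non-linear link functions); the class name and parameters are hypothetical.

```python
import numpy as np

class OnePassLinearEstimator:
    """One-pass ridge estimate via the Sherman-Morrison rank-1 update.

    Per-round cost is O(d^2), independent of the number of rounds t:
    no past observations are stored or revisited.
    """
    def __init__(self, d, lam=1.0):
        self.V_inv = np.eye(d) / lam   # (X^T X + lam*I)^{-1}, maintained directly
        self.b = np.zeros(d)           # running sum X^T y

    def update(self, x, r):
        Vx = self.V_inv @ x
        # (V + x x^T)^{-1} = V^{-1} - V^{-1} x x^T V^{-1} / (1 + x^T V^{-1} x)
        self.V_inv -= np.outer(Vx, Vx) / (1.0 + x @ Vx)
        self.b += r * x

    @property
    def theta_hat(self):
        return self.V_inv @ self.b

# Noise-free rewards generated by theta = [2, -1]; with a tiny ridge term
# the one-pass estimate recovers theta after a few observations.
est = OnePassLinearEstimator(d=2, lam=1e-6)
for x, r in [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]:
    est.update(np.array(x), r)
```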
arXiv Detail & Related papers (2025-07-16T02:24:21Z) - A Flexible Fairness Framework with Surrogate Loss Reweighting for Addressing Sociodemographic Disparities [2.057770398219001]
This paper presents a new algorithmic framework called $\boldsymbol{\alpha}$-$\boldsymbol{\beta}$ Fair Machine Learning ($\boldsymbol{\alpha}$-$\boldsymbol{\beta}$ FML). Our framework employs a new surrogate loss minimization, paired with loss reweighting, allowing precise accuracy trade-offs through tunable attributes.
arXiv Detail & Related papers (2025-03-21T04:10:14Z) - Consistent End-to-End Estimation for Counterfactual Fairness [56.9060492313073]
We propose a novel counterfactual fairness predictor for making predictions under counterfactual fairness. We provide theoretical guarantees that our method is effective in ensuring the notion of counterfactual fairness.
arXiv Detail & Related papers (2023-10-26T17:58:39Z) - On the Vulnerability of Fairness Constrained Learning to Malicious Noise [28.176039923404883]
We consider the vulnerability of fairness-constrained learning to small amounts of malicious noise in the training data.
For example, for Demographic Parity we show we can incur only a $\Theta(\alpha)$ loss in accuracy, where $\alpha$ is the malicious noise rate.
For Equal Opportunity, we show we can incur an $O(\sqrt{\alpha})$ loss, and give a matching $\Omega(\sqrt{\alpha})$ lower bound.
arXiv Detail & Related papers (2023-07-21T20:26:54Z) - Robust Lipschitz Bandits to Adversarial Corruptions [61.85150061213987]
Lipschitz bandit is a variant of bandits that deals with a continuous arm set defined on a metric space.
In this paper, we introduce a new problem of Lipschitz bandits in the presence of adversarial corruptions.
Our work presents the first line of robust Lipschitz bandit algorithms that can achieve sub-linear regret under both types of adversary.
arXiv Detail & Related papers (2023-05-29T18:16:59Z) - Autoregressive Bandits [58.46584210388307]
We propose a novel online learning setting, Autoregressive Bandits, in which the observed reward is governed by an autoregressive process of order $k$.
We show that, under mild assumptions on the reward process, the optimal policy can be conveniently computed.
We then devise a new optimistic regret minimization algorithm, namely, AutoRegressive Upper Confidence Bound (AR-UCB), that suffers sublinear regret of order $\widetilde{\mathcal{O}}\big((k+1)^{3/2}\sqrt{nT}/(1-\Gamma)^2\big)$.
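The autoregressive reward model underlying this setting is easy to make concrete: each reward depends linearly on the last $k$ rewards plus noise. The sketch below is an assumption-laden illustration of that process only (the constant term, the coefficient vector `phi`, and the stability condition `sum(phi) < 1` are choices of this sketch, not the paper's exact notation).

```python
import numpy as np

def simulate_ar_rewards(phi, horizon, noise_std=0.1, seed=0):
    """Simulate an AR(k) reward process: r_t = 1 + sum_i phi[i]*r_{t-1-i} + eps_t.

    With sum(phi) < 1 the process is stable and its stationary mean is
    1 / (1 - sum(phi)); e.g. phi = [0.5] gives mean 2.
    """
    rng = np.random.default_rng(seed)
    k = len(phi)
    r = np.zeros(horizon + k)                     # k zero values as burn-in history
    for t in range(k, horizon + k):
        past = r[t - k:t][::-1]                   # [r_{t-1}, ..., r_{t-k}]
        r[t] = 1.0 + phi @ past + noise_std * rng.normal()
    return r[k:]

rewards = simulate_ar_rewards(np.array([0.5]), horizon=2000)
```

Running this with `phi = [0.5]` produces a sequence hovering around the stationary mean of 2, which is the kind of temporally correlated reward signal that distinguishes this setting from standard i.i.d. bandits.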
arXiv Detail & Related papers (2022-12-12T21:37:36Z) - SLIDE: a surrogate fairness constraint to ensure fairness consistency [1.3649494534428745]
We propose a new surrogate fairness constraint called SLIDE, which is feasible and achieves a fast convergence rate.
Numerical experiments confirm that SLIDE works well for various benchmark datasets.
arXiv Detail & Related papers (2022-02-07T13:50:21Z) - Towards Transferable Unrestricted Adversarial Examples with Minimum Changes [13.75751221823941]
Transfer-based adversarial examples are one of the most important classes of black-box attacks.
There is a trade-off between transferability and imperceptibility of the adversarial perturbation.
We propose a geometry-aware framework to generate transferable adversarial examples with minimum changes.
arXiv Detail & Related papers (2022-01-04T12:03:20Z) - Certifiably Robust Interpretation via Renyi Differential Privacy [77.04377192920741]
We study the problem of interpretation robustness from a new perspective of Renyi differential privacy (RDP).
First, it can offer provable and certifiable top-$k$ robustness.
Second, our proposed method offers $\sim 10\%$ better experimental robustness than existing approaches.
Third, our method can provide a smooth tradeoff between robustness and computational efficiency.
arXiv Detail & Related papers (2021-07-04T06:58:01Z) - A Regret Bound for Non-stationary Multi-Armed Bandits with Fairness Constraints [7.716156977428555]
We present a new algorithm called Fair Upper Confidence Bound with Exploration (Fair-UCBe) for solving a slowly varying $k$-armed bandit problem.
We show that the performance of our algorithm in the non-stationary case approaches that of its stationary counterpart.
arXiv Detail & Related papers (2020-12-24T18:12:01Z) - On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z) - Robustness Guarantees for Mode Estimation with an Application to Bandits [131.21717367564963]
We introduce a theory for multi-armed bandits where the values are the modes of the reward distributions instead of the mean.
We show in simulations that our algorithms are robust to perturbation of the arms by adversarial noise sequences.
arXiv Detail & Related papers (2020-03-05T21:29:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.