Observation-Free Attacks on Online Learning to Rank
- URL: http://arxiv.org/abs/2509.22855v2
- Date: Fri, 03 Oct 2025 05:37:46 GMT
- Title: Observation-Free Attacks on Online Learning to Rank
- Authors: Sameep Chattopadhyay, Nikhil Karamchandani, Sharayu Moharir,
- Abstract summary: We present a novel framework for attacking some of the widely used online learning algorithms.<n>Our framework is designed to promote a set of target items so that they appear in the list of top-K recommendations for T - o(T) rounds, while simultaneously inducing linear regret in the learning algorithm.
- Score: 5.004981323118799
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online learning to rank (OLTR) plays a critical role in information retrieval and machine learning systems, with a wide range of applications in search engines and content recommenders. However, despite their extensive adoption, the susceptibility of OLTR algorithms to coordinated adversarial attacks remains poorly understood. In this work, we present a novel framework for attacking some of the widely used OLTR algorithms. Our framework is designed to promote a set of target items so that they appear in the list of top-K recommendations for T - o(T) rounds, while simultaneously inducing linear regret in the learning algorithm. We propose two novel attack strategies: CascadeOFA for CascadeUCB1 and PBMOFA for PBM-UCB . We provide theoretical guarantees showing that both strategies require only O(log T) manipulations to succeed. Additionally, we supplement our theoretical analysis with empirical results on real-world data.
Related papers
- Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning [28.999963907637188]
This paper systematically reviews widely adoptedReinforcement learning techniques.<n>We present clear guidelines for selecting RL techniques tailored to specific setups.<n>We also reveal that a minimalist combination of two techniques can unlock the learning capability of critic-free policies.
arXiv Detail & Related papers (2025-08-11T17:39:45Z) - Discriminative Adversarial Unlearning [40.30974185546541]
We introduce a novel machine unlearning framework founded upon the established principles of the min-max optimization paradigm.
We capitalize on the capabilities of strong Membership Inference Attacks (MIA) to facilitate the unlearning of specific samples from a trained model.
Our proposed algorithm closely approximates the ideal benchmark of retraining from scratch for both random sample forgetting and class-wise forgetting schemes.
arXiv Detail & Related papers (2024-02-10T03:04:57Z) - Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.<n>This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.<n>We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Adversarial Attacks on Online Learning to Rank with Stochastic Click
Models [34.725468803108754]
We propose the first study of adversarial attacks on online learning to rank.
The goal of the adversary is to misguide the online learning to rank algorithm to place the target item on top of the ranking list linear times to time horizon $T$ with a sublinear attack cost.
arXiv Detail & Related papers (2023-05-30T17:05:49Z) - Adversarial Attacks on Online Learning to Rank with Click Feedback [18.614785011987756]
Online learning to rank is a sequential decision-making problem where a learning agent selects an ordered list of items and receives feedback through user clicks.
This paper studies attack strategies against multiple variants of OLTR.
We propose a general attack strategy against any algorithm under the general click model.
arXiv Detail & Related papers (2023-05-26T16:28:26Z) - Adversarial Training with Complementary Labels: On the Benefit of
Gradually Informative Attacks [119.38992029332883]
Adversarial training with imperfect supervision is significant but receives limited attention.
We propose a new learning strategy using gradually informative attacks.
Experiments are conducted to demonstrate the effectiveness of our method on a range of benchmarked datasets.
arXiv Detail & Related papers (2022-11-01T04:26:45Z) - Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL)
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z) - Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning [6.414910263179327]
We study reward poisoning attacks on online deep reinforcement learning (DRL)
We demonstrate the intrinsic vulnerability of state-of-the-art DRL algorithms by designing a general, black-box reward poisoning framework called adversarial MDP attacks.
Our results show that our attacks efficiently poison agents learning in several popular classical control and MuJoCo environments.
arXiv Detail & Related papers (2022-05-30T04:07:19Z) - Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases.
We first rigorously analyze a deterministic variant of the online threat model.
We then propose algoname, a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.