Navigating Sparsities in High-Dimensional Linear Contextual Bandits
- URL: http://arxiv.org/abs/2510.08435v1
- Date: Thu, 09 Oct 2025 16:47:14 GMT
- Title: Navigating Sparsities in High-Dimensional Linear Contextual Bandits
- Authors: Rui Zhao, Zihan Chen, Zemin Zheng
- Abstract summary: High-dimensional linear contextual bandit problems remain a significant challenge due to the curse of dimensionality. A powerful pointwise estimator is introduced in this work that adaptively navigates both kinds of sparsity. Based on this pointwise estimator, a novel algorithm, termed HOPE, is proposed.
- Score: 7.796930035720785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: High-dimensional linear contextual bandit problems remain a significant challenge due to the curse of dimensionality. Existing methods typically consider either the model parameters to be sparse or the eigenvalues of context covariance matrices to be (approximately) sparse, lacking general applicability due to the rigidity of conventional reward estimators. To overcome this limitation, a powerful pointwise estimator is introduced in this work that adaptively navigates both kinds of sparsity. Based on this pointwise estimator, a novel algorithm, termed HOPE, is proposed. Theoretical analyses demonstrate that HOPE not only achieves improved regret bounds in previously discussed homogeneous settings (i.e., considering only one type of sparsity) but also, for the first time, efficiently handles two new challenging heterogeneous settings (i.e., considering a mixture of two types of sparsity), highlighting its flexibility and generality. Experiments corroborate the superiority of HOPE over existing methods across various scenarios.
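To make the two sparsity regimes concrete, here is a minimal sketch (dimensions, regularization constants, and estimator choices are illustrative assumptions; this is not the paper's HOPE estimator) of pointwise reward estimation under parameter sparsity versus approximately sparse covariance eigenvalues:

```python
# Hypothetical illustration of the two sparsity regimes, not the paper's method.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
n, d = 200, 500                      # few samples, high dimension (toy sizes)

# Regime 1: sparse reward parameter theta (few nonzero coordinates).
theta = np.zeros(d)
theta[:5] = 1.0
X = rng.standard_normal((n, d))
y = X @ theta + 0.1 * rng.standard_normal(n)

# Lasso exploits parameter sparsity; its pointwise prediction stays
# accurate even though d >> n.
lasso = Lasso(alpha=0.05).fit(X, y)
x_new = rng.standard_normal(d)
print("pointwise estimate under parameter sparsity:", x_new @ lasso.coef_)

# Regime 2: contexts whose covariance has rapidly decaying eigenvalues,
# i.e. they concentrate near a low-dimensional subspace; projecting onto
# the top principal directions before ridge regression controls the
# effective dimension instead of the ambient one.
k = 10
Vt = np.linalg.svd(X, full_matrices=False)[2][:k]   # top-k principal directions
ridge = Ridge(alpha=1.0).fit(X @ Vt.T, y)
print("pointwise estimate under eigenvalue sparsity:", (Vt @ x_new) @ ridge.coef_)
```

Which of the two estimators is accurate depends on the regime; the abstract's point is that a single adaptive pointwise estimator should navigate both, including mixtures, without knowing in advance which one holds.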
Related papers
- Regularized Online RLHF with Generalized Bilinear Preferences [68.44113000390544]
We consider the problem of contextual online RLHF with general preferences. We adopt the Generalized Bilinear Preference Model to capture preferences via low-rank, skew-symmetric matrices. We prove that the dual gap of the greedy policy is bounded by the square of the estimation error.
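As a hedged illustration of a low-rank, skew-symmetric preference score of the kind the summary mentions (the parameterization M = AB^T - BA^T and the sigmoid link are assumptions, not necessarily the paper's exact model):

```python
# Toy low-rank, skew-symmetric bilinear preference model (an assumption).
import numpy as np

rng = np.random.default_rng(1)
d, r = 16, 3                          # feature dim and rank (hypothetical)
A = rng.standard_normal((d, r))
B = rng.standard_normal((d, r))
M = A @ B.T - B @ A.T                 # skew-symmetric by construction, rank <= 2r

def pref_prob(x, y):
    """P(x preferred over y) as a sigmoid of the bilinear score x^T M y."""
    return 1.0 / (1.0 + np.exp(-(x @ M @ y)))

x, y = rng.standard_normal(d), rng.standard_normal(d)
# Skew-symmetry yields the consistency property P(x > y) + P(y > x) = 1.
print(pref_prob(x, y) + pref_prob(y, x))   # ~1.0
```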
arXiv Detail & Related papers (2026-02-26T15:27:53Z) - OrthAlign: Orthogonal Subspace Decomposition for Non-Interfering Multi-Objective Alignment [61.02595549125661]
Large language model (LLM) alignment faces a critical dilemma when addressing multiple human preferences. We present OrthAlign, an innovative approach to resolve gradient-level conflicts in preference alignment. We show that OrthAlign achieves maximum single-preference improvements ranging from 34.61% to 50.89% after multiple-objective alignment.
arXiv Detail & Related papers (2025-09-29T11:16:30Z) - Nonconvex Regularization for Feature Selection in Reinforcement Learning [7.408148824204063]
This work proposes an efficient batch algorithm for feature selection in reinforcement learning (RL) with theoretical convergence guarantees. Numerical experiments demonstrate that the proposed approach substantially outperforms state-of-the-art methods in feature selection scenarios.
arXiv Detail & Related papers (2025-09-19T06:21:20Z) - Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [60.414548453838506]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function. GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency. We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $\mathcal{O}(1)$ time and space complexities per round.
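A rough sketch of what a one-pass, $\mathcal{O}(1)$-per-round update can look like for a logistic link (plain online gradient descent with greedy action selection; the paper's jointly efficient algorithm and its exploration scheme are not reproduced here):

```python
# Toy one-pass update for a logistic-link bandit; not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(2)
d, T, eta = 5, 1000, 0.1
mu = lambda z: 1.0 / (1.0 + np.exp(-z))      # logistic link function
theta_star = rng.standard_normal(d)          # unknown true parameter
theta_hat = np.zeros(d)

for t in range(T):
    arms = rng.standard_normal((10, d))      # this round's context vectors
    x = arms[np.argmax(mu(arms @ theta_hat))]   # greedy pick (exploration omitted)
    r = rng.binomial(1, mu(x @ theta_star))     # Bernoulli reward through the link
    # One stochastic-gradient step on the logistic loss: the observation is
    # used once and discarded, so per-round time and space do not grow with T.
    theta_hat += eta * (r - mu(x @ theta_hat)) * x
```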
arXiv Detail & Related papers (2025-07-16T02:24:21Z) - Optimization over Sparse Support-Preserving Sets: Two-Step Projection with Global Optimality Guarantees [34.164821598251315]
In sparse optimization, enforcing hard constraints using the $\ell_0$ pseudo-norm offers advantages like controlled sparsity. Many real-world applications demand not only sparsity constraints but also some extra constraints. We present a new variant of the iterative hard-thresholding algorithm equipped with a two-step projection operator customized for these mixed constraints.
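A minimal sketch of the general shape of such a two-step projection (the box constraint, the order of the steps, and all constants are illustrative assumptions):

```python
# Toy two-step projection: hard thresholding, then an extra (box) constraint.
import numpy as np

def two_step_projection(v, s, lo=-1.0, hi=1.0):
    """Map v into {s-sparse vectors} intersected with the box [lo, hi]^d."""
    out = np.zeros_like(v)
    support = np.argsort(np.abs(v))[-s:]        # step 1: keep the s largest entries
    out[support] = np.clip(v[support], lo, hi)  # step 2: project them onto the box
    return out

v = np.array([3.0, -0.2, 1.5, -2.8, 0.1])
print(two_step_projection(v, s=2))   # [ 1.  0.  0. -1.  0.]
```

Note that naively composing the two projections need not equal the exact projection onto the intersection in general; the summary above indicates the paper designs a customized operator for these mixed constraints.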
arXiv Detail & Related papers (2025-06-10T08:27:01Z) - Double Duality: Variational Primal-Dual Policy Optimization for Constrained Reinforcement Learning [132.7040981721302]
We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Generalization Properties of Adversarial Training for $\ell_0$-Bounded Adversarial Attacks [47.22918498465056]
In this paper, we aim to theoretically characterize the performance of adversarial training for an important class of neural networks.
Deriving a generalization bound in this setting has two main challenges.
arXiv Detail & Related papers (2024-02-05T22:57:33Z) - Online Non-convex Optimization with Long-term Non-convex Constraints [7.68925638410622]
A novel Follow-the-Perturbed-Leader type algorithm is proposed and analyzed for solving general long-term constrained optimization problems in an online manner. The proposed algorithm is applied to tackle a long-term constrained source identification problem, validating the theoretical results and exhibiting superior performance compared to existing methods.
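For intuition, a generic Follow-the-Perturbed-Leader loop over a finite decision set (the exponential perturbation and the fixed constraint penalty are illustrative assumptions; the paper handles general long-term non-convex constraints, which this toy version does not):

```python
# Generic FTPL sketch with a toy penalty for constraint violations.
import numpy as np

rng = np.random.default_rng(3)
K, T, eta = 5, 200, 1.0
cum = np.zeros(K)                        # cumulative penalized losses per decision
for t in range(T):
    perturb = rng.exponential(eta, size=K)
    a = int(np.argmin(cum - perturb))    # follow the perturbed leader
    loss = rng.uniform(size=K)           # this round's losses (toy adversary)
    violation = rng.uniform(size=K)      # long-term constraint signal (toy)
    cum += loss + 0.5 * violation        # fixed penalty folds constraints into loss
```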
arXiv Detail & Related papers (2023-11-04T15:08:36Z) - Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning [79.83792914684985]
We prove a new identifiability result that provides conditions under which maximally sparse base-predictors yield disentangled representations.
Motivated by this theoretical result, we propose a practical approach to learn disentangled representations based on a sparsity-promoting bi-level optimization problem.
arXiv Detail & Related papers (2022-11-26T21:02:09Z) - Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers [31.102181563065844]
Quantile (and, more generally, KL) regret bounds relax the goal of competing against the best individual expert to only competing against a majority of experts on adversarial data.
More recently, the semi-adversarial paradigm (Bilodeau, Negrea, and Roy 2020) provides an alternative relaxation of adversarial online learning by considering data that may be neither fully adversarial nor stochastic (i.i.d.).
We achieve the minimax optimal regret in both paradigms using FTRL with separate, novel, root-logarithmic regularizers, both of which can be interpreted as yielding variants of NormalHedge.
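For contrast, the classical FTRL iterate with a negative-entropy regularizer (i.e., exponential weights) is sketched below; the paper's root-logarithmic regularizers replace the entropy term and are not reproduced here:

```python
# Baseline FTRL over the simplex with negative entropy (exponential weights).
import numpy as np

rng = np.random.default_rng(4)
N, T, eta = 8, 100, 0.5
cum_loss = np.zeros(N)
for t in range(T):
    w = np.exp(-eta * cum_loss)          # FTRL with negative-entropy regularizer
    w /= w.sum()                         # play the normalized weights
    cum_loss += rng.uniform(size=N)      # experts' losses this round (toy data)
```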
arXiv Detail & Related papers (2021-10-27T22:38:52Z) - Nonasymptotic theory for two-layer neural networks: Beyond the bias-variance trade-off [10.182922771556742]
We present a nonasymptotic generalization theory for two-layer neural networks with the ReLU activation function.
We show that overparametrized random feature models suffer from the curse of dimensionality and thus are suboptimal.
arXiv Detail & Related papers (2021-06-09T03:52:18Z) - Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing distance between embeddings/representations of original samples and augmented counterparts is a popular technique for improving robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$ regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
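The regularizer itself is simple to state; here is a framework-agnostic numpy sketch (batch size, embedding width, and the weighting are placeholders):

```python
# Squared l2 consistency regularizer between original and augmented embeddings.
import numpy as np

def consistency_loss(z_orig, z_aug):
    """Mean squared l2 distance between paired embeddings."""
    return np.mean(np.sum((z_orig - z_aug) ** 2, axis=1))

rng = np.random.default_rng(5)
z = rng.standard_normal((32, 128))                  # embeddings of a batch
z_aug = z + 0.05 * rng.standard_normal((32, 128))   # embeddings of augmented views
print(consistency_loss(z, z_aug))   # added to the task loss with a tunable weight
```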
arXiv Detail & Related papers (2020-11-25T22:40:09Z) - Achieving robustness in classification using optimal transport with hinge regularization [7.780418853571034]
We propose a new framework for binary classification, based on optimal transport.
We learn 1-Lipschitz networks using a new loss that is a hinge-regularized version of the Kantorovich-Rubinstein dual formulation for the Wasserstein distance estimation.
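A hedged sketch of such a hinge-regularized Kantorovich-Rubinstein objective (signs, margin, and weighting below are illustrative assumptions, not the paper's exact definition):

```python
# Toy hKR-style loss: KR dual term (mean score gap) plus a hinge margin term.
import numpy as np

def hkr_loss(scores, labels, margin=1.0, lam=10.0):
    """labels in {-1, +1}; scores = f(x) from a 1-Lipschitz network."""
    kr_term = scores[labels == -1].mean() - scores[labels == +1].mean()
    hinge = np.maximum(0.0, margin - labels * scores).mean()
    return kr_term + lam * hinge

scores = np.array([2.1, -1.5, 0.3, -0.7])
labels = np.array([+1, -1, +1, -1])
print(hkr_loss(scores, labels))
```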
arXiv Detail & Related papers (2020-06-11T15:36:23Z)