Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization
- URL: http://arxiv.org/abs/2511.01126v1
- Date: Mon, 03 Nov 2025 00:29:36 GMT
- Title: Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization
- Authors: Parvin Nazari, Bojian Hou, Davoud Ataee Tarzanagh, Li Shen, George Michailidis
- Abstract summary: Online bilevel optimization (OBO) is a powerful framework for machine learning problems where both outer and inner objectives evolve over time. Current OBO approaches rely on deterministic window-smoothed regret minimization, which may not accurately reflect system performance when functions change rapidly. We introduce a novel search direction and show that both first- and zeroth-order (ZO) OBO algorithms leveraging this direction achieve sublinear stochastic bilevel regret without window smoothing.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online bilevel optimization (OBO) is a powerful framework for machine learning problems where both outer and inner objectives evolve over time, requiring dynamic updates. Current OBO approaches rely on deterministic window-smoothed regret minimization, which may not accurately reflect system performance when functions change rapidly. In this work, we introduce a novel search direction and show that both first- and zeroth-order (ZO) stochastic OBO algorithms leveraging this direction achieve sublinear stochastic bilevel regret without window smoothing. Beyond these guarantees, our framework enhances efficiency by: (i) reducing oracle dependence in hypergradient estimation, (ii) updating inner and outer variables alongside the linear system solution, and (iii) employing ZO-based estimation of Hessians, Jacobians, and gradients. Experiments on online parametric loss tuning and black-box adversarial attacks validate our approach.
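The abstract's building blocks can be illustrated with standard zeroth-order machinery. The sketch below is a minimal, illustrative example under assumed interfaces (the names `zo_gradient`, `fd_hvp`, and the oracles `f` and `grad_g` are ours, not the paper's): a two-point Gaussian-smoothing gradient estimator of the kind used for ZO estimation of gradients, and a finite-difference Hessian-vector product that lets the inner linear system be solved iteratively without forming Hessians or Jacobians. It is not the authors' exact search direction or algorithm.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=20, rng=None):
    """Two-point zeroth-order estimate of the gradient of f at x.

    Averages directional finite differences along random Gaussian
    directions (a standard ZO estimator; illustrative only, not the
    paper's exact scheme).
    """
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x, dtype=float)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

def fd_hvp(grad_g, y, v, mu=1e-3):
    """Central-difference Hessian-vector product of g at y applied to v,
    using only gradient evaluations: H v ~ (grad_g(y + mu*v) - grad_g(y - mu*v)) / (2*mu).
    Such products let one solve the inner linear system with an iterative
    method (e.g., conjugate gradient) without materializing the Hessian.
    """
    return (grad_g(y + mu * v) - grad_g(y - mu * v)) / (2.0 * mu)

if __name__ == "__main__":
    # Sanity check on a toy quadratic f(x) = ||x||^2, whose gradient is 2x.
    f = lambda z: float(z @ z)
    x = np.ones(5)
    print(zo_gradient(f, x, num_dirs=200))  # approximately [2, 2, 2, 2, 2]
```

On the toy quadratic, `zo_gradient` should return roughly 2x; the bias shrinks with the smoothing radius `mu` and the variance with `num_dirs`, which is the usual accuracy/oracle-cost trade-off in ZO estimation.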
Related papers
- Non-Stationary Functional Bilevel Optimization
Functional bilevel optimization (FBO) provides a powerful framework for hierarchical learning in function spaces. We propose SmoothFBO, the first algorithm for non-stationary FBO with both theoretical guarantees and practical scalability.
arXiv Detail & Related papers (2026-01-21T14:35:23Z) - Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update [70.38810219913593]
We study the generalized linear bandit (GLB) problem, a contextual multi-armed bandit framework that extends the classical linear model by incorporating a non-linear link function. GLBs are widely applicable to real-world scenarios, but their non-linear nature introduces significant challenges in achieving both computational and statistical efficiency. We propose a jointly efficient algorithm that attains a nearly optimal regret bound with $\mathcal{O}(1)$ time and space complexities per round.
arXiv Detail & Related papers (2025-07-16T02:24:21Z) - Online Decision-Focused Learning [74.3205104323777]
Decision-focused learning (DFL) is an increasingly popular paradigm for training models whose predictive outputs are used in decision-making tasks. In this paper, we regularize the objective function to make it differentiable and investigate how to overcome the resulting non-optimality. We also showcase the effectiveness of our algorithms on a knapsack experiment, where they outperform two standard benchmarks.
arXiv Detail & Related papers (2025-05-19T10:40:30Z) - A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization problem, where the inner loss function becomes a smooth probability distribution, and the outer loss becomes an expected loss over the inner distribution (a minimal sketch of this smoothing idea follows the list below).
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - Online Nonconvex Bilevel Optimization with Bregman Divergences [3.237380113935023]
We introduce a stochastic online bilevel (SOB) method that updates outer-level variables using a window average of recent hypergradient estimates.
Sublinear regret rates and hyperparameter-tuning benchmarks highlight its superior performance and efficiency over existing online bilevel optimization (OBO) methods.
arXiv Detail & Related papers (2024-09-16T17:01:27Z) - Non-Convex Bilevel Optimization with Time-Varying Objective Functions [57.299128109226025]
We propose an online bilevel optimization framework where the functions can be time-varying and the agent continuously updates the decisions with online data.
Compared to existing algorithms, SOBOW is computationally efficient and does not need to know previous functions.
We show that SOBOW can achieve a sublinear bilevel local regret under mild conditions.
arXiv Detail & Related papers (2023-08-07T06:27:57Z) - Meta-Learning Adversarial Bandit Algorithms [55.72892209124227]
We study online meta-learning with bandit feedback.
We learn to tune online mirror descent (OMD) with self-concordant barrier regularizers.
arXiv Detail & Related papers (2023-07-05T13:52:10Z) - Meta-Learning Adversarial Bandits [49.094361442409785]
We study online learning with bandit feedback across multiple tasks, with the goal of improving average performance across tasks if they are similar according to some natural task-similarity measure.
As the first to target the adversarial setting, we design a meta-algorithm that provides setting-specific guarantees for two important cases: multi-armed bandits (MAB) and bandit linear optimization (BLO).
Our guarantees rely on proving that unregularized follow-the-leader combined with multiplicative weights is enough to online learn a non-smooth and non-convex sequence of functions.
arXiv Detail & Related papers (2022-05-27T17:40:32Z)
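As referenced in the stochastic bi-level optimization entry above, the following minimal sketch shows that summary's smoothing idea: a deterministic inner loss g(x, y) is replaced by an expected loss over a smooth Gaussian distribution on y. All names and parameters here (`smoothed_inner_loss`, `sigma`, `num_samples`) are illustrative assumptions, not that paper's implementation.

```python
import numpy as np

def smoothed_inner_loss(g, x, y_mean, sigma=0.1, num_samples=32, rng=None):
    """Monte Carlo estimate of E_{y ~ N(y_mean, sigma^2 I)}[g(x, y)].

    Replaces a deterministic inner loss g(x, y) with an expected loss
    over a smooth distribution centered at y_mean (illustrative sketch,
    not the cited paper's algorithm).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw num_samples perturbed inner variables around y_mean.
    samples = y_mean + sigma * rng.standard_normal((num_samples,) + y_mean.shape)
    return float(np.mean([g(x, y) for y in samples]))
```

Because the smoothed objective is an expectation over a Gaussian, it varies smoothly in `y_mean` even when g itself is non-smooth, which is the usual motivation for recasting a BLO problem as a stochastic optimization over an inner distribution.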