Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances
- URL: http://arxiv.org/abs/2310.02246v2
- Date: Thu, 2 May 2024 04:59:18 GMT
- Title: Learning to Relax: Setting Solver Parameters Across a Sequence of Linear System Instances
- Authors: Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, Ameet Talwalkar,
- Abstract summary: We show that a bandit online learning algorithm can select parameters for a sequence of instances such that the overall cost approaches that of the best fixed $omega$ as the sequence length increases.
Our work provides the first learning-theoretic treatment of high-precision linear system solvers and the first end-to-end guarantees for data-driven scientific computing.
- Score: 42.16343861129583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving a linear system $Ax=b$ is a fundamental scientific computing primitive for which numerous solvers and preconditioners have been developed. These come with parameters whose optimal values depend on the system being solved and are often impossible or too expensive to identify; thus in practice sub-optimal heuristics are used. We consider the common setting in which many related linear systems need to be solved, e.g. during a single numerical simulation. In this scenario, can we sequentially choose parameters that attain a near-optimal overall number of iterations, without extra matrix computations? We answer in the affirmative for Successive Over-Relaxation (SOR), a standard solver whose parameter $\omega$ has a strong impact on its runtime. For this method, we prove that a bandit online learning algorithm--using only the number of iterations as feedback--can select parameters for a sequence of instances such that the overall cost approaches that of the best fixed $\omega$ as the sequence length increases. Furthermore, when given additional structural information, we show that a contextual bandit method asymptotically achieves the performance of the instance-optimal policy, which selects the best $\omega$ for each instance. Our work provides the first learning-theoretic treatment of high-precision linear system solvers and the first end-to-end guarantees for data-driven scientific computing, demonstrating theoretically the potential to speed up numerical methods using well-understood learning algorithms.
Related papers
- Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems [9.30306458153248]
We consider the spectral tail condition number, $kappa_ell$, defined as the ratio between the $ell$th largest and the smallest singular value of the matrix representing the system.
Some of the implications of our result, and of the use of $kappa_ell$, include direct improvement over a fine-grained analysis of the Conjugate method.
arXiv Detail & Related papers (2024-05-09T14:56:49Z) - OKRidge: Scalable Optimal k-Sparse Ridge Regression [21.17964202317435]
We propose a fast algorithm, OKRidge, for sparse ridge regression.
We also propose a method to warm-start our solver, which leverages a beam search.
arXiv Detail & Related papers (2023-04-13T17:34:44Z) - Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes.
This paper provides two algorithms that improve the regret to $tildemathcal O(K2/3)$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of emphcon unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $epsilon$-saddle point within $O(epsilon-2/3)$ in terms of a restricted function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Data-Driven Reachability Analysis from Noisy Data [5.949143454441281]
We consider the problem of computing reachable sets directly from noisy data without a given system model.
Several reachability algorithms are presented, and their accuracy is shown to depend on the underlying system generating the data.
arXiv Detail & Related papers (2021-05-15T14:11:57Z) - Algorithmic Solution for Systems of Linear Equations, in
$\mathcal{O}(mn)$ time [0.0]
We present a novel algorithm attaining excessively fast, the sought solution of linear systems of equations.
The execution time is very short compared with state-of-the-art methods.
The paper also comprises a theoretical proof for the algorithmic convergence.
arXiv Detail & Related papers (2021-04-26T13:40:31Z) - Online Model Selection for Reinforcement Learning with Function
Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $tildeO(L5/6 T2/3)$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z) - Efficient Optimistic Exploration in Linear-Quadratic Regulators via
Lagrangian Relaxation [107.06364966905821]
We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting.
Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of ofulq.
We show that an $epsilon$-optimistic controller can be computed efficiently by solving at most $Obig(log (1/epsilon)big)$ Riccati equations.
arXiv Detail & Related papers (2020-07-13T16:30:47Z) - Symbolic Regression using Mixed-Integer Nonlinear Optimization [9.638685454900047]
The Symbolic Regression (SR) problem is a hard problem in machine learning.
We propose a hybrid algorithm that combines mixed-integer nonlinear optimization with explicit enumeration.
We show that our algorithm is competitive, for some synthetic data sets, with a state-of-the-art SR software and a recent physics-inspired method called AI Feynman.
arXiv Detail & Related papers (2020-06-11T20:53:17Z) - Learning nonlinear dynamical systems from a single trajectory [102.60042167341956]
We introduce algorithms for learning nonlinear dynamical systems of the form $x_t+1=sigma(Thetastarx_t)+varepsilon_t$.
We give an algorithm that recovers the weight matrix $Thetastar$ from a single trajectory with optimal sample complexity and linear running time.
arXiv Detail & Related papers (2020-04-30T10:42:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.