Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing
- URL: http://arxiv.org/abs/2105.08285v1
- Date: Tue, 18 May 2021 05:23:53 GMT
- Title: Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing
- Authors: Anshumali Shrivastava, Zhao Song, Zhaozhuo Xu
- Abstract summary: We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions.
We establish connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning.
- Score: 49.73889315176884
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present the first provable Least-Squares Value Iteration (LSVI) algorithms
that have runtime complexity sublinear in the number of actions. We formulate
the value function estimation procedure in value iteration as an approximate
maximum inner product search problem and propose a locality sensitive hashing
(LSH) [Indyk and Motwani STOC'98, Andoni and Razenshteyn STOC'15, Andoni,
Laarhoven, Razenshteyn and Waingarten SODA'17] type data structure to solve
this problem with sublinear time complexity. Moreover, we establish
connections between the theory of approximate maximum inner product search
and the regret analysis of reinforcement learning. We prove that, with our
choice of
approximation factor, our Sublinear LSVI algorithms maintain the same regret as
the original LSVI algorithms while reducing the runtime complexity to sublinear
in the number of actions. To the best of our knowledge, this is the first work
that combines LSH with reinforcement learning resulting in provable
improvements. We hope that our novel way of combining data structures with
iterative algorithms will open the door to further study of cost reduction in
optimization.
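To make the core idea concrete, the following is a minimal sketch, not the authors' implementation, of casting greedy action selection over linear features as maximum inner product search and answering it with a random-hyperplane (SimHash) LSH index. The class name `SimHashMIPS`, the single-table design, and all parameters are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(0)

class SimHashMIPS:
    """Approximate maximum inner product search via random-hyperplane LSH.

    Uses the standard MIPS-to-cosine reduction: rescale every data vector
    to norm <= 1, append sqrt(1 - ||a||^2) so the augmented vectors are
    unit norm, and append 0 to the query.
    """

    def __init__(self, vectors, num_bits=12):
        scale = np.linalg.norm(vectors, axis=1).max()
        scaled = vectors / scale
        extra = np.sqrt(np.maximum(0.0, 1.0 - (scaled ** 2).sum(axis=1)))
        self.aug = np.hstack([scaled, extra[:, None]])   # unit-norm rows
        self.vectors = vectors
        self.planes = rng.standard_normal((num_bits, self.aug.shape[1]))
        self.table = {}
        for idx, code in enumerate(map(tuple, (self.aug @ self.planes.T) > 0)):
            self.table.setdefault(code, []).append(idx)

    def query(self, q):
        q_aug = np.append(q / np.linalg.norm(q), 0.0)
        code = tuple((self.planes @ q_aug) > 0)
        # Candidates share the query's hash bucket; fall back to a full
        # scan only if the bucket is empty (a real index uses many tables).
        bucket = self.table.get(code) or range(len(self.vectors))
        best = max(bucket, key=lambda i: self.vectors[i] @ q)
        return best, self.vectors[best] @ q

# Greedy action selection in value iteration: with linear features
# phi(s, a) and weight estimate w, the best action maximizes <w, phi(s, a)>.
phi = rng.standard_normal((10_000, 16))   # hypothetical action features
w = rng.standard_normal(16)               # hypothetical LSVI weight estimate
index = SimHashMIPS(phi)
a_hat, value = index.query(w)
```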
Related papers
- A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval [56.67706781191521]
In this work, we study a robust phase retrieval problem in which the task is to recover an unknown signal from measurements corrupted by outliers.
Our proposed approach avoids the need for computationally expensive spectral initialization, using a simple gradient step while remaining robust to outliers.
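For reference, here is a minimal sketch of a generic amplitude-based gradient update for phase retrieval, under the assumption that the measurements are magnitudes y = |Ax|; the function name and step size are illustrative, and the paper's specific oracle and outlier handling are not reproduced.
```python
import numpy as np

def amplitude_flow_step(x, A, y, step=0.1):
    """One gradient step on f(x) = (1/2m) * sum_i (|a_i^T x| - y_i)^2,
    a common amplitude-based phase retrieval objective. No spectral
    initialization is assumed; outlier handling is omitted."""
    m = len(y)
    z = A @ x
    return x - step * (A.T @ (z - y * np.sign(z))) / m
```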
arXiv Detail & Related papers (2024-09-07T06:37:23Z)
- Making RL with Preference-based Feedback Efficient via Randomization [11.019088464664696]
Reinforcement Learning algorithms that learn from human feedback need to be efficient in terms of statistical complexity, computational complexity, and query complexity.
We present an algorithm that is sample efficient (i.e., has near-optimal worst-case regret bounds) and computationally efficient (i.e., has running time polynomial in the relevant parameters).
To extend the results to more general nonlinear function approximation, we design a model-based randomized algorithm inspired by the idea of Thompson sampling.
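For readers unfamiliar with the randomization idea, below is a minimal sketch of one round of classical linear Thompson sampling, the template the summary alludes to; it is not the paper's preference-based algorithm, and all names and parameters are assumptions.
```python
import numpy as np

rng = np.random.default_rng(1)

def thompson_step(V, b, features, v=1.0):
    """One round of linear Thompson sampling: draw parameters from the
    posterior N(V^{-1} b, v^2 V^{-1}) and act greedily under the draw."""
    V_inv = np.linalg.inv(V)
    theta = rng.multivariate_normal(V_inv @ b, v ** 2 * V_inv)
    return int(np.argmax(features @ theta))

# After observing reward r for the chosen feature vector x, update the
# sufficient statistics: V += np.outer(x, x); b += r * x.
```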
arXiv Detail & Related papers (2023-10-23T04:19:35Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute the covariance matrices that arise in the model.
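As a toy illustration of the general trick, and assuming a plain Gaussian linear model rather than the paper's hierarchical one, the posterior mean can be obtained by solving a linear system without ever materializing the covariance matrix:
```python
import numpy as np

def posterior_mean(A, y, alpha, sigma2=1.0):
    """Posterior mean of a Gaussian linear model obtained by solving one
    linear system, rather than forming the posterior covariance
    (precision inverse) explicitly."""
    H = A.T @ A / sigma2 + np.diag(alpha)   # posterior precision
    return np.linalg.solve(H, A.T @ y / sigma2)
```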
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- Ordering for Non-Replacement SGD [7.11967773739707]
We seek to find an ordering that can improve the convergence rates for the non-replacement form of the algorithm.
We develop optimal orderings for constant and decreasing step sizes for strongly convex and convex functions.
In addition, we are able to combine the ordering with mini-batching and further apply it to more complex neural networks.
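A minimal sketch of the setting, assuming per-example gradient callables and a squared loss; the specific orderings derived in the paper are not reproduced here.
```python
import numpy as np

rng = np.random.default_rng(2)

def sgd_epoch(w, grads, ordering, step=0.01):
    """One epoch of non-replacement SGD: every example is visited exactly
    once, in a fixed order; the ordering is the object being optimized."""
    for i in ordering:
        w = w - step * grads[i](w)
    return w

# Example: per-example squared-loss gradients for y ~ X w; a random
# permutation is the usual baseline the paper improves upon.
X, y = rng.standard_normal((100, 5)), rng.standard_normal(100)
grads = [lambda w, i=i: 2 * X[i] * (X[i] @ w - y[i]) for i in range(100)]
w = sgd_epoch(np.zeros(5), grads, rng.permutation(100))
```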
arXiv Detail & Related papers (2023-06-28T00:46:58Z)
- Recovering Simultaneously Structured Data via Non-Convex Iteratively Reweighted Least Squares [0.8702432681310401]
We propose a new algorithm for recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations.
We show that the IRLS method performs favorably in identifying simultaneously structured (e.g., low-rank and sparse) signals from few measurements.
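For context, here is a minimal sketch of classic single-structure IRLS for sparse recovery; the paper's non-convex variant handles several heterogeneous structures simultaneously, which this sketch does not attempt.
```python
import numpy as np

def irls_sparse(A, y, iters=30, eps=1e-3):
    """Classic l1-type IRLS for sparse recovery from y = A x: alternate
    between updating smoothed weights and solving a weighted least-norm
    problem in closed form."""
    x = A.T @ np.linalg.solve(A @ A.T, y)        # least-norm initialization
    for _ in range(iters):
        d = np.sqrt(x ** 2 + eps ** 2)           # smoothed |x_i| weights
        x = d * (A.T @ np.linalg.solve((A * d) @ A.T, y))
    return x
```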
arXiv Detail & Related papers (2023-06-08T06:35:47Z)
- A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms [64.3064050603721]
We generalize the Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms.
We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields similar iterations to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta solvers for ordinary differential equations.
arXiv Detail & Related papers (2022-11-22T16:30:33Z)
- Linear regression with partially mismatched data: local search with theoretical guarantees [9.398989897176953]
We study an important variant of linear regression in which the predictor-response pairs are partially mismatched.
We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches.
We prove that our local search algorithm converges to a nearly-optimal solution at a linear rate.
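A simplified sketch of the alternating idea, assuming fully shuffled responses and using sorted-order matching as the permutation step; the paper's local-search moves and guarantees are more refined.
```python
import numpy as np

def alternating_fit(X, y, iters=20):
    """Alternate between (1) least squares for the coefficients under the
    current correspondence and (2) re-matching responses to predictions
    in sorted order, which is the squared-error-optimal permutation."""
    perm = np.arange(len(y))
    for _ in range(iters):
        beta, *_ = np.linalg.lstsq(X, y[perm], rcond=None)
        ranks = np.argsort(np.argsort(X @ beta))  # rank of each prediction
        perm = np.argsort(y)[ranks]  # k-th smallest y <-> k-th smallest prediction
    return beta, perm
```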
arXiv Detail & Related papers (2021-06-03T23:32:12Z)
- Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits [99.70167985955352]
We study the problem of zero-order optimization of a strongly convex function.
We consider a randomized approximation of the projected gradient descent algorithm.
Our results imply that the zero-order algorithm is nearly optimal in terms of sample complexity and its dependence on the problem parameters.
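For context, a minimal sketch of the classical ingredients: a two-point randomized gradient estimate plus a Euclidean-ball projection; the paper's estimator additionally exploits higher-order smoothness via weighted kernels, which is omitted here.
```python
import numpy as np

rng = np.random.default_rng(3)

def zero_order_grad(f, x, h=1e-2):
    """Two-point zeroth-order gradient estimate from function values at
    x +/- h*u for a random unit direction u."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)
    return len(x) * (f(x + h * u) - f(x - h * u)) / (2 * h) * u

def projected_step(x, g, step, radius):
    """Projected gradient step onto a Euclidean ball of the given radius."""
    x = x - step * g
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)
```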
arXiv Detail & Related papers (2020-06-14T10:42:23Z)
- A Robust Matching Pursuit Algorithm Using Information Theoretic Learning [37.968665739578185]
A new orthogonal matching pursuit (OMP) algorithm is developed based on information theoretic learning (ITL).
The experimental results on both simulated and real-world data consistently demonstrate the superiority of the proposed OMP algorithm in data recovery, image reconstruction, and classification.
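For context, here is a minimal sketch of standard OMP; the paper's ITL-based variant replaces the correlation/error criterion with a correntropy-based one, which is not reproduced here.
```python
import numpy as np

def omp(A, y, k):
    """Standard OMP: greedily add the column most correlated with the
    residual, then re-fit by least squares on the selected support."""
    support, residual = [], y.copy()
    x = np.zeros(A.shape[1])
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x
```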
arXiv Detail & Related papers (2020-05-10T01:36:00Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
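A minimal sketch of the sketch-and-solve baseline with a Gaussian embedding; the paper analyzes iterative, preconditioned variants with sharper guarantees, and `sketch_rows` is an illustrative parameter.
```python
import numpy as np

rng = np.random.default_rng(4)

def sketched_lstsq(A, b, sketch_rows):
    """Sketch-and-solve least squares: compress the n x d system with a
    Gaussian embedding and solve the much smaller sketched problem."""
    S = rng.standard_normal((sketch_rows, A.shape[0])) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```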
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.