Related papers: Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning

Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning

URL: http://arxiv.org/abs/2201.09736v3
Date: Mon, 27 May 2024 19:58:52 GMT
Title: Tensor and Matrix Low-Rank Value-Function Approximation in Reinforcement Learning
Authors: Sergio Rozada, Santiago Paternain, Antonio G. Marques,
Abstract summary: Value-function approximation is a central problem in Reinforcement Learning (RL) This paper puts forth a parsimonious non-parametric approach, where we use low-rank algorithms to estimate the VF matrix in an online and model-free fashion. As VFs tend to be multi-dimensional, we propose replacing the classical VF matrix representation with a tensor representation and, then, use the PARAFAC decomposition to design an online model-free tensor low-rank algorithm.
Score: 11.317136648551536
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Value-function (VF) approximation is a central problem in Reinforcement Learning (RL). Classical non-parametric VF estimation suffers from the curse of dimensionality. As a result, parsimonious parametric models have been adopted to approximate VFs in high-dimensional spaces, with most efforts being focused on linear and neural-network-based approaches. Differently, this paper puts forth a a parsimonious non-parametric approach, where we use stochastic low-rank algorithms to estimate the VF matrix in an online and model-free fashion. Furthermore, as VFs tend to be multi-dimensional, we propose replacing the classical VF matrix representation with a tensor (multi-way array) representation and, then, use the PARAFAC decomposition to design an online model-free tensor low-rank algorithm. Different versions of the algorithms are proposed, their complexity is analyzed, and their performance is assessed numerically using standardized RL environments.

Related papers

Multi-Dimensional Visual Data Recovery: Scale-Aware Tensor Modeling and Accelerated Randomized Computation [51.65236537605077]
We propose a new type of network compression optimization technique, fully randomized tensor network compression (FCTN)<n>FCTN has significant advantages in correlation characterization and transpositional in algebra, and has notable achievements in multi-dimensional data processing and analysis.<n>We derive efficient algorithms with guarantees to solve the formulated models.
arXiv Detail & Related papers (2026-02-13T14:56:37Z)
ODELoRA: Training Low-Rank Adaptation by Solving Ordinary Differential Equations [54.886931928255564]
Low-rank adaptation (LoRA) has emerged as a widely adopted parameter-efficient fine-tuning method in deep transfer learning.<n>We propose a novel continuous-time optimization dynamic for LoRA factor matrices in the form of an ordinary differential equation (ODE)<n>We show that ODELoRA achieves stable feature learning, a property that is crucial for training deep neural networks at different scales of problem dimensionality.
arXiv Detail & Related papers (2026-02-07T10:19:36Z)
Efficient Approximation of Volterra Series for High-Dimensional Systems [0.0]
We introduce the Head Averaging (THA) algorithm, which significantly reduces complexity by constructing localized MVMALS models trained on small subsets of the input space.<n>THA offers a scalable and theoretically grounded approach for identifying previously intractable high-dimensional systems.
arXiv Detail & Related papers (2025-11-09T20:31:39Z)
Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes. Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz of the gradient. Comprehensive experimental evaluations for or fine-tuning diverse FMs, including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art Directions.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
Regularized Projection Matrix Approximation with Applications to Community Detection [1.3761665705201904]
This paper introduces a regularized projection matrix approximation framework designed to recover cluster information from the affinity matrix. We investigate three distinct penalty functions, each specifically tailored to address bounded, positive, and sparse scenarios. Numerical experiments conducted on both synthetic and real-world datasets reveal that our regularized projection matrix approximation approach significantly outperforms state-of-the-art methods in clustering performance.
arXiv Detail & Related papers (2024-05-26T15:18:22Z)
Nonparametric Automatic Differentiation Variational Inference with Spline Approximation [7.5620760132717795]
We develop a nonparametric approximation approach that enables flexible posterior approximation for distributions with complicated structures. Compared with widely-used nonparametrical inference methods, the proposed method is easy to implement and adaptive to various data structures. Experiments demonstrate the efficiency of the proposed method in approximating complex posterior distributions and improving the performance of generative models with incomplete data.
arXiv Detail & Related papers (2024-03-10T20:22:06Z)
The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network Network (DNN) models are used for programming purposes. In this paper we examine the use of convex neural recovery models. We show that all the stationary non-dimensional objective objective can be characterized as the standard a global subsampled convex solvers program. We also show that all the stationary non-dimensional objective objective can be characterized as the standard a global subsampled convex solvers program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
Regularization and Variance-Weighted Regression Achieves Minimax Optimality in Linear MDPs: Theory and Practice [79.48432795639403]
Mirror descent value iteration (MDVI) is an abstraction of Kullback-Leibler (KL) and entropy-regularized reinforcement learning (RL) We study MDVI with linear function approximation through its sample complexity required to identify an $varepsilon$-optimal policy. We present Variance-Weighted Least-Squares MDVI, the first theoretical algorithm that achieves nearly minimax optimal sample complexity for infinite-horizon linear MDPs.
arXiv Detail & Related papers (2023-05-22T16:13:05Z)
Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression. Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates. The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
Generalised Latent Assimilation in Heterogeneous Reduced Spaces with Machine Learning Surrogate Models [10.410970649045943]
We develop a system which combines reduced-order surrogate models with a novel data assimilation technique. Generalised Latent Assimilation can benefit both the efficiency provided by the reduced-order modelling and the accuracy of data assimilation.
arXiv Detail & Related papers (2022-04-07T15:13:12Z)
Low-rank State-action Value-function Approximation [11.026561518386025]
Several high-dimensional state problems can be well-approximated by an intrinsic low-rank structure. This paper proposes different algorithms to estimate a low-rank factorization of the $Q(s, a)$ matrix.
arXiv Detail & Related papers (2021-04-18T10:31:39Z)
Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals. We propose a general graph estimator based on a novel structured fusion regularization. We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z)
Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model. We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)
Multi-View Spectral Clustering Tailored Tensor Low-Rank Representation [105.33409035876691]
This paper explores the problem of multi-view spectral clustering (MVSC) based on tensor low-rank modeling. We design a novel structured tensor low-rank norm tailored to MVSC. We show that the proposed method outperforms state-of-the-art methods to a significant extent.
arXiv Detail & Related papers (2020-04-30T11:52:12Z)
Efficient Alternating Least Squares Algorithms for Low Multilinear Rank Approximation of Tensors [6.308492837096872]
We propose a new class of truncated HOSVD algorithms based on alternating least squares (ALS) for efficiently computing the low multilinear rank approximation of tensors. ALS-based approaches are able to eliminate the redundant computations of the singular vectors of intermediate matrices and are therefore free of data explosion. Numerical experiments with large-scale tensors from both synthetic and real-world applications demonstrate that ALS-based methods can substantially reduce the total cost of the original ones and are highly scalable for parallel computing.
arXiv Detail & Related papers (2020-04-06T11:58:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.