Provable General Function Class Representation Learning in Multitask
Bandits and MDPs
- URL: http://arxiv.org/abs/2205.15701v1
- Date: Tue, 31 May 2022 11:36:42 GMT
- Title: Provable General Function Class Representation Learning in Multitask
Bandits and MDPs
- Authors: Rui Lu, Andrew Zhao, Simon S. Du, Gao Huang
- Abstract summary: Multitask representation learning is a popular approach in reinforcement learning for boosting sample efficiency.
In this work, we extend the analysis to general function class representations.
We theoretically validate the benefit of multitask representation learning within a general function class for bandits and linear MDPs.
- Score: 58.624124220900306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While multitask representation learning has become a popular approach in
reinforcement learning (RL) for boosting sample efficiency, the theoretical
understanding of why and how it works is still limited. Most previous
analyses could only assume that the representation function is either already
known to the agent or drawn from a linear function class, since analyzing
general function class representations runs into non-trivial technical
obstacles such as establishing generalization guarantees and formulating
confidence bounds in an abstract function space. However, the linear-case
analysis relies heavily on the particular structure of the linear function
class, while real-world practice usually adopts general non-linear
representation functions such as neural networks, which significantly limits
its applicability. In this work, we extend the analysis to general function
class representations. Specifically, we consider an agent playing $M$
contextual bandits (or MDPs) concurrently and extracting a shared
representation function $\phi$ from a specific function class $\Phi$ using our
proposed Generalized Functional Upper Confidence Bound algorithm (GFUCB). We
theoretically validate, for the first time, the benefit of multitask
representation learning within a general function class for bandits and
linear MDPs. Lastly, we conduct experiments to demonstrate the effectiveness
of our algorithm with neural network representations.
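The abstract's setup (an agent playing $M$ contextual bandits concurrently while fitting one shared representation $\phi$ from a class $\Phi$ and acting optimistically) can be pictured with a short sketch. In the PyTorch code below, the network architecture, the per-task linear heads, and the LinUCB-style bonus computed on the learned features are assumptions made for concreteness; they are not the paper's GFUCB confidence-set construction, which is formulated directly in function space.

```python
import torch
import torch.nn as nn

# Hedged sketch: M contextual bandit tasks share one representation phi drawn
# from a neural function class Phi. The LinUCB-style bonus on the learned
# features is an illustrative stand-in, not the paper's GFUCB confidence set.

M, d, p = 5, 10, 16                                                     # tasks, context dim, representation dim
phi = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, p))      # shared representation phi
heads = nn.ModuleList([nn.Linear(p, 1, bias=False) for _ in range(M)])  # per-task linear predictors
opt = torch.optim.Adam(list(phi.parameters()) + list(heads.parameters()), lr=1e-3)

A = [torch.eye(p) for _ in range(M)]   # per-task Gram matrices of learned features
data = [[] for _ in range(M)]          # (context, reward) history per task

def act(task, contexts, beta=1.0):
    """Pick the arm maximising predicted reward plus an optimism bonus.

    contexts: float tensor of shape (num_arms, d), one row per candidate arm.
    """
    with torch.no_grad():
        feats = phi(contexts)                                      # (num_arms, p)
        mean = heads[task](feats).squeeze(-1)                      # predicted rewards
        Ainv = torch.linalg.inv(A[task])
        bonus = beta * torch.sqrt((feats @ Ainv * feats).sum(-1))  # x^T A^{-1} x per arm
    return int(torch.argmax(mean + bonus))

def update(task, context, reward, steps=20):
    """Record the observation, refresh the Gram matrix, refit on pooled data."""
    data[task].append((context, float(reward)))
    with torch.no_grad():
        f = phi(context)
        A[task] += torch.outer(f, f)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.tensor(0.0)
        for m, hist in enumerate(data):                # pool the loss over all M tasks
            if not hist:
                continue
            X = torch.stack([c for c, _ in hist])
            y = torch.tensor([r for _, r in hist])
            loss = loss + ((heads[m](phi(X)).squeeze(-1) - y) ** 2).mean()
        loss.backward()
        opt.step()
```

The multitask ingredient is that update pools the squared-error loss of all $M$ tasks through the single shared phi, which is the mechanism behind the sample-efficiency gains the paper analyzes.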
Related papers
- Tractable and Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation [8.378137704007038]
We present a regret analysis for distributional reinforcement learning with general value function approximation.
Our theoretical results show that approximating the infinite-dimensional return distribution with a finite number of moment functionals is the only way to learn this statistical information without bias.
arXiv Detail & Related papers (2024-07-31T00:43:51Z)
- Tuning Pre-trained Model via Moment Probing [62.445281364055795]
We propose a novel Moment Probing (MP) method to explore the potential of linear probing (LP).
MP fits a linear classification head on the mean of the final features.
Our MP significantly outperforms LP and is competitive with counterparts at a lower training cost.
arXiv Detail & Related papers (2023-07-21T04:15:02Z)
- How Fine-Tuning Allows for Effective Meta-Learning [50.17896588738377]
We present a theoretical framework for analyzing representations derived from a MAML-like algorithm.
We provide risk bounds on the best predictor found by fine-tuning via gradient descent, demonstrating that the algorithm can provably leverage the shared structure.
A separation result further underscores the benefit of fine-tuning-based methods, such as MAML, over methods with "frozen representation" objectives in few-shot learning.
arXiv Detail & Related papers (2021-05-05T17:56:00Z)
- The Nonconvex Geometry of Linear Inverse Problems [26.049281826055797]
The gauge function measures the complexity of a statistical model.
We introduce a new notion of statistical complexity, the gauge$_p$ function, which overcomes the limitations of the gauge function.
We propose a new learning machine built on the gauge$_p$ function and equip it with a number of statistical guarantees.
arXiv Detail & Related papers (2021-01-07T21:55:08Z)
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
- UNIPoint: Universally Approximating Point Processes Intensities [125.08205865536577]
We provide a proof that a class of learnable functions can universally approximate any valid intensity function.
We implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions upon each event (see the sketch after this list).
arXiv Detail & Related papers (2020-07-28T09:31:56Z)
- Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
arXiv Detail & Related papers (2019-06-05T07:15:17Z)
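For the UNIPoint entry above, the code below is a hedged sketch of the basic construction: a recurrent network reads the event history and emits the parameters of a set of basis functions whose transformed sum serves as the conditional intensity. The GRU encoder, the exponential basis family, and the softplus link are illustrative assumptions; the original model admits other basis choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UniPointSketch(nn.Module):
    """Hedged sketch: a GRU emits parameters of K basis functions whose
    softplus-transformed sum serves as the conditional intensity."""

    def __init__(self, hidden_dim=32, num_basis=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden_dim, batch_first=True)
        self.to_params = nn.Linear(hidden_dim, 2 * num_basis)  # a weight and a rate per basis

    def intensity(self, inter_event_times, dt):
        # inter_event_times: (batch, seq, 1); dt: time elapsed since the last event
        h, _ = self.rnn(inter_event_times)
        a, b = self.to_params(h[:, -1]).chunk(2, dim=-1)   # parameters from the latest state
        basis = a * torch.exp(-F.softplus(b) * dt)         # exponential basis (an assumption)
        return F.softplus(basis.sum(dim=-1))               # positive intensity per sequence

model = UniPointSketch()
times = torch.rand(4, 10, 1)          # toy batch of inter-event times
lam = model.intensity(times, dt=0.5)  # intensity 0.5 time units after the last event
```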
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.