Meta Learning MDPs with Linear Transition Models
- URL: http://arxiv.org/abs/2201.08732v1
- Date: Fri, 21 Jan 2022 14:57:03 GMT
- Title: Meta Learning MDPs with Linear Transition Models
- Authors: Robert Müller and Aldo Pacchiano
- Abstract summary: We study meta-learning in Markov Decision Processes (MDPs) with linear transition models in the undiscounted episodic setting.
We propose BUC-MatrixRL, a version of the UC-MatrixRL algorithm, and show it can meaningfully leverage a set of sampled training tasks.
We prove that, compared to learning the tasks in isolation, BUC-MatrixRL provides significant improvements in the transfer regret for high-bias, low-variance task distributions.
- Score: 22.508479528847634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study meta-learning in Markov Decision Processes (MDPs) with linear
transition models in the undiscounted episodic setting. Under a task-sharedness
metric based on model proximity, we study task families characterized by a
distribution over models specified by a bias term and a variance component. We
then propose BUC-MatrixRL, a version of the UC-MatrixRL algorithm, and show it
can meaningfully leverage a set of sampled training tasks to quickly solve a
test task sampled from the same task distribution by learning an estimator of
the bias parameter of the task distribution. The analysis leverages and extends
results from the learning-to-learn linear regression and linear bandit settings to
the more general case of MDPs with linear transition models. We prove that,
compared to learning the tasks in isolation, BUC-MatrixRL provides significant
improvements in the transfer regret for high-bias, low-variance task
distributions.
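The abstract's core idea, estimating the bias parameter of the task distribution from sampled training tasks and warm-starting a test task from it, can be illustrated with a small numerical sketch. All dimensions, the Gaussian noise model, and the simple averaging estimator below are illustrative assumptions, not the paper's actual construction:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4          # feature dimension of the linear transition model (assumed)
n_train = 50   # number of sampled training tasks

# Hypothetical task distribution: each task's transition-model matrix M_t is
# a shared bias term B plus zero-mean task noise (a "high-bias, low-variance"
# regime, where the bias dominates the per-task variation).
B_true = rng.normal(size=(d, d))
sigma = 0.05
train_models = [B_true + sigma * rng.normal(size=(d, d)) for _ in range(n_train)]

# In the spirit of BUC-MatrixRL, estimate the bias parameter from the training
# tasks (here simply the empirical mean of the per-task model estimates).
B_hat = np.mean(train_models, axis=0)

# A new test task from the same distribution: warm-starting from B_hat leaves
# a much smaller initial model error than starting from an uninformed (zero)
# prior, which is the intuition behind the improved transfer regret.
M_test = B_true + sigma * rng.normal(size=(d, d))
err_scratch = np.linalg.norm(M_test)          # distance from the zero prior
err_biased = np.linalg.norm(M_test - B_hat)   # distance from the learned bias
assert err_biased < err_scratch
```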
Related papers
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression? [92.90857135952231]
Transformers pretrained on diverse tasks exhibit remarkable in-context learning (ICL) capabilities.
We study ICL in one of its simplest setups: pretraining a linearly parameterized single-layer linear attention model for linear regression.
arXiv Detail & Related papers (2023-10-12T15:01:43Z)
- AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging)
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z)
- Reinforcement Learning in the Wild with Maximum Likelihood-based Model Transfer [5.92353064090273]
We study the problem of transferring the available Markov Decision Process (MDP) models to learn and plan efficiently in an unknown but similar MDP.
We propose a generic two-stage algorithm, MLEMTRL, to address the MTRL problem in discrete and continuous settings.
We empirically demonstrate that MLEMTRL allows faster learning in new MDPs than learning from scratch and achieves near-optimal performance.
arXiv Detail & Related papers (2023-02-18T09:47:34Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- The Common Intuition to Transfer Learning Can Win or Lose: Case Studies for Linear Regression [26.5147705530439]
We define a transfer learning approach to the target task as a linear regression optimization with a regularization on the distance between the to-be-learned target parameters and the already-learned source parameters.
We show that for sufficiently related tasks, the optimally tuned transfer learning approach can outperform the optimally tuned ridge regression method.
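The transfer formulation this paper describes, linear regression regularized toward already-learned source parameters, has a simple closed form: minimizing ||Xw - y||² + λ||w - w_src||² gives w = (XᵀX + λI)⁻¹(Xᵀy + λ·w_src). A minimal sketch, with all data sizes and the task-relatedness level chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 40, 5

# Source-task parameters (assumed already learned) and a related target task.
w_src = rng.normal(size=d)
w_tgt = w_src + 0.1 * rng.normal(size=d)   # "sufficiently related" target
X = rng.normal(size=(n, d))
y = X @ w_tgt + 0.1 * rng.normal(size=n)

def biased_ridge(X, y, w_src, lam):
    """Solve min_w ||Xw - y||^2 + lam * ||w - w_src||^2 in closed form:
    w = (X^T X + lam * I)^(-1) (X^T y + lam * w_src)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * w_src)

w_transfer = biased_ridge(X, y, w_src, lam=10.0)
w_ridge = biased_ridge(X, y, np.zeros(d), lam=10.0)  # plain ridge baseline

# When tasks are related, shrinking toward w_src (rather than toward zero)
# recovers the target parameters more accurately at the same regularization.
assert np.linalg.norm(w_transfer - w_tgt) < np.linalg.norm(w_ridge - w_tgt)
```

Note that plain ridge regression is the special case w_src = 0, which is why the comparison between the two tunings is natural.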
arXiv Detail & Related papers (2021-03-09T18:46:01Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
- PAC-Bayes meta-learning with implicit task-specific posteriors [37.32107678838193]
We introduce a new and rigorously-formulated PAC-Bayes meta-learning algorithm that solves few-shot learning.
We show that the models trained with our proposed meta-learning algorithm are well calibrated and accurate.
arXiv Detail & Related papers (2020-03-05T06:56:19Z)
- Plannable Approximations to MDP Homomorphisms: Equivariance under Actions [72.30921397899684]
We introduce a contrastive loss function that enforces action equivariance on the learned representations.
We prove that when our loss is zero, we have a homomorphism of a deterministic Markov Decision Process.
We show experimentally that for deterministic MDPs, the optimal policy in the abstract MDP can be successfully lifted to the original MDP.
arXiv Detail & Related papers (2020-02-27T08:29:10Z)
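The action-equivariance idea in the last entry can be sketched as a contrastive loss: a latent transition model should map the encoding of a state to the encoding of its successor, while staying far from encodings of unrelated states. The per-action linear transition model, the squared-distance terms, and the hinge margin below are illustrative assumptions, not the paper's actual loss:

```python
import numpy as np

def contrastive_loss(z, a, z_next, z_neg, W, margin=1.0):
    """Action-equivariance contrastive loss (illustrative sketch).

    The positive term pulls the predicted next latent W[a] @ z toward the
    true next latent z_next; the hinge term pushes it at least `margin`
    away from a negative sample z_neg drawn from another state.
    """
    pred = W[a] @ z                                   # latent transition
    pos = float(np.sum((pred - z_next) ** 2))
    neg = max(0.0, margin - float(np.sum((pred - z_neg) ** 2)))
    return pos + neg

# With a perfect (here: identity) latent transition, an exact positive pair,
# and a distant negative, the loss is exactly zero, which mirrors the paper's
# claim that zero loss yields a homomorphism for deterministic MDPs.
W = {0: np.eye(2)}
z = np.array([1.0, 0.0])
loss = contrastive_loss(z, 0, z_next=z, z_neg=np.array([5.0, 5.0]), W=W)
assert loss == 0.0
```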
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.