Generalisation in Multitask Fitted Q-Iteration and Offline Q-learning
- URL: http://arxiv.org/abs/2512.20220v1
- Date: Tue, 23 Dec 2025 10:20:11 GMT
- Title: Generalisation in Multitask Fitted Q-Iteration and Offline Q-learning
- Authors: Kausthubh Manda, Raghuram Bharadwaj Diddigi,
- Abstract summary: We study offline multitask reinforcement learning in settings where multiple tasks share a low-rank representation of their action-value functions.<n>We analyze a multitask variant of fitted Q-iteration that jointly learns a shared representation and task-specific value functions.<n>Our results clarify the role of shared representations in multitask offline Q-learning and provide theoretical insight into when and how multitask structure can improve generalization.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study offline multitask reinforcement learning in settings where multiple tasks share a low-rank representation of their action-value functions. In this regime, a learner is provided with fixed datasets collected from several related tasks, without access to further online interaction, and seeks to exploit shared structure to improve statistical efficiency and generalization. We analyze a multitask variant of fitted Q-iteration that jointly learns a shared representation and task-specific value functions via Bellman error minimization on offline data. Under standard realizability and coverage assumptions commonly used in offline reinforcement learning, we establish finite-sample generalization guarantees for the learned value functions. Our analysis explicitly characterizes how pooling data across tasks improves estimation accuracy, yielding a $1/\sqrt{nT}$ dependence on the total number of samples across tasks, while retaining the usual dependence on the horizon and concentrability coefficients arising from distribution shift. In addition, we consider a downstream offline setting in which a new task shares the same underlying representation as the upstream tasks. We study how reusing the representation learned during the multitask phase affects value estimation for this new task, and show that it can reduce the effective complexity of downstream learning relative to learning from scratch. Together, our results clarify the role of shared representations in multitask offline Q-learning and provide theoretical insight into when and how multitask structure can improve generalization in model-free, value-based reinforcement learning.
Related papers
- Dynamic Routing Between Experts: A Data-Efficient Approach to Continual Learning in Vision-Language Models [10.431923437214719]
Vision-Language Models (VLMs) suffer from catastrophic forgetting when sequentially fine-tuned on new tasks.<n>We introduce a routing-based approach that enables the integration of new tasks while preserving the foundational knowledge acquired during pretraining.
arXiv Detail & Related papers (2025-11-03T18:39:32Z) - The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback [12.388205905012423]
Reinforcement learning from human feedback has contributed to performance improvements in large language models.<n>We formulate RLHF as the contextual dueling bandit problem and assume a common linear representation.<n>We prove that to achieve $varepsilon-$optimal, the sample complexity of the source tasks can be significantly reduced.
arXiv Detail & Related papers (2024-05-18T08:29:15Z) - Distribution Matching for Multi-Task Learning of Classification Tasks: a
Large-Scale Study on Faces & Beyond [62.406687088097605]
Multi-Task Learning (MTL) is a framework, where multiple related tasks are learned jointly and benefit from a shared representation space.
We show that MTL can be successful with classification tasks with little, or non-overlapping annotations.
We propose a novel approach, where knowledge exchange is enabled between the tasks via distribution matching.
arXiv Detail & Related papers (2024-01-02T14:18:11Z) - Generalizable Task Representation Learning for Offline
Meta-Reinforcement Learning with Data Limitations [22.23114883485924]
We propose a novel algorithm called GENTLE for learning generalizable task representations in the face of data limitations.
GENTLE employs Task Auto-Encoder(TAE), which is an encoder-decoder architecture to extract the characteristics of the tasks.
To alleviate the effect of limited behavior diversity, we construct pseudo-transitions to align the data distribution used to train TAE with the data distribution encountered during testing.
arXiv Detail & Related papers (2023-12-26T07:02:12Z) - Leveraging sparse and shared feature activations for disentangled
representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real world distribution shift benchmarks, and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z) - Composite Learning for Robust and Effective Dense Predictions [81.2055761433725]
Multi-task learning promises better model generalization on a target task by jointly optimizing it with an auxiliary task.
We find that jointly training a dense prediction (target) task with a self-supervised (auxiliary) task can consistently improve the performance of the target task, while eliminating the need for labeling auxiliary tasks.
arXiv Detail & Related papers (2022-10-13T17:59:16Z) - Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z) - Reparameterizing Convolutions for Incremental Multi-Task Learning
without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning)
Second, eliminating adverse interactions amongst tasks, which has been shown to significantly degrade the single-task performance in a multi-task setup (task interference)
arXiv Detail & Related papers (2020-07-24T14:44:46Z) - Understanding and Improving Information Transfer in Multi-Task Learning [14.43111978531182]
We study an architecture with a shared module for all tasks and a separate output module for each task.
We show that misalignment between task data can cause negative transfer (or hurt performance) and provide sufficient conditions for positive transfer.
Inspired by the theoretical insights, we show that aligning tasks' embedding layers leads to performance gains for multi-task training and transfer learning.
arXiv Detail & Related papers (2020-05-02T23:43:52Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL)
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.