Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
- URL: http://arxiv.org/abs/2505.23150v1
- Date: Thu, 29 May 2025 06:41:45 GMT
- Title: Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners
- Authors: Michal Nauman, Marek Cygan, Carmelo Sferrazza, Aviral Kumar, Pieter Abbeel
- Abstract summary: We show that the use of high-capacity value models trained via cross-entropy and conditioned on learnable task embeddings addresses the problem of task interference in online reinforcement learning. We test our approach on 7 multi-task benchmarks with over 280 unique tasks, spanning high degree-of-freedom humanoid control and discrete vision-based RL.
- Score: 60.75160178669076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in language modeling and vision stem from training large models on diverse, multi-task data. This paradigm has had limited impact in value-based reinforcement learning (RL), where improvements are often driven by small models trained in a single-task context. This is because, in multi-task RL, sparse rewards and gradient conflicts make temporal-difference optimization brittle. Practical workflows for generalist policies therefore avoid online training, instead cloning expert trajectories or distilling collections of single-task policies into one agent. In this work, we show that the use of high-capacity value models trained via cross-entropy and conditioned on learnable task embeddings addresses the problem of task interference in online RL, allowing for robust and scalable multi-task training. We test our approach on 7 multi-task benchmarks with over 280 unique tasks, spanning high degree-of-freedom humanoid control and discrete vision-based RL. We find that, despite its simplicity, the proposed approach leads to state-of-the-art single and multi-task performance, as well as sample-efficient transfer to new tasks.
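The recipe in the abstract (a high-capacity value network conditioned on a learnable task embedding and trained with a categorical cross-entropy loss) can be illustrated with a short sketch. The class name, layer sizes, bin range, and two-hot target construction below are illustrative assumptions, not the paper's exact architecture or hyperparameters.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskConditionedCritic(nn.Module):
    """Categorical value head conditioned on a learnable task embedding.
    Minimal sketch; names and sizes are assumptions, not the paper's design."""

    def __init__(self, obs_dim, act_dim, num_tasks, emb_dim=64,
                 hidden=1024, num_bins=101, v_min=-10.0, v_max=10.0):
        super().__init__()
        self.task_emb = nn.Embedding(num_tasks, emb_dim)  # learnable per-task code
        self.register_buffer("bin_centers", torch.linspace(v_min, v_max, num_bins))
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + emb_dim, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, num_bins),  # logits over discretized return bins
        )

    def forward(self, obs, act, task_id):
        z = self.task_emb(task_id)                        # (B, emb_dim)
        return self.net(torch.cat([obs, act, z], dim=-1)) # (B, num_bins)

    def value(self, obs, act, task_id):
        probs = F.softmax(self.forward(obs, act, task_id), dim=-1)
        return (probs * self.bin_centers).sum(-1)         # scalar value estimate


def two_hot(target, bin_centers):
    """Project scalar TD targets onto the two nearest bins (two-hot encoding)."""
    target = target.clamp(bin_centers[0], bin_centers[-1])
    idx_hi = torch.searchsorted(bin_centers, target).clamp(1, len(bin_centers) - 1)
    idx_lo = idx_hi - 1
    lo, hi = bin_centers[idx_lo], bin_centers[idx_hi]
    w_hi = (target - lo) / (hi - lo)
    dist = torch.zeros(target.shape[0], len(bin_centers), device=target.device)
    dist.scatter_(1, idx_lo.unsqueeze(1), (1 - w_hi).unsqueeze(1))
    dist.scatter_(1, idx_hi.unsqueeze(1), w_hi.unsqueeze(1))
    return dist

# Cross-entropy TD loss for one batch (td_target comes from a target network):
# logits = critic(obs, act, task_id)
# loss = F.cross_entropy(logits, two_hot(td_target, critic.bin_centers))
```

In this setup the critic predicts a distribution over discretized returns, so the TD update becomes a classification problem, which is the cross-entropy training the abstract refers to.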
Related papers
- Optimizing Dense Visual Predictions Through Multi-Task Coherence and Prioritization [7.776434991976473]
Multi-Task Learning (MTL) involves the concurrent training of multiple tasks. We propose an advanced MTL model specifically designed for dense vision tasks.
arXiv Detail & Related papers (2024-12-04T10:05:47Z)
- On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion [23.63688816017186]
Existing weak-to-strong methods often employ a static knowledge transfer ratio and a single small model for transferring complex knowledge.
We propose a dynamic logit fusion approach that works with a series of task-specific small models, each specialized in a different task.
Our method closes the performance gap by 96.4% in single-task scenarios and by 86.3% in multi-task scenarios.
arXiv Detail & Related papers (2024-06-17T03:07:41Z)
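As a rough illustration of logit-level fusion between a strong model and several task-specific small models, here is a minimal sketch. The weighted-mixture fusion rule and the confidence-based weighting below are assumptions made for illustration; the paper's actual dynamic weighting scheme is not reproduced here.

```python
import torch
import torch.nn.functional as F

def fuse_logits(strong_logits, expert_logits_list, fusion_weights):
    """Weighted logit-level fusion of a strong model with task-specific small models.

    strong_logits:      (B, V) logits from the large model
    expert_logits_list: list of K tensors of shape (B, V), one per small expert
    fusion_weights:     (B, K + 1), rows sum to 1; column 0 weights the strong model
    """
    experts = torch.stack(expert_logits_list, dim=1)                   # (B, K, V)
    stacked = torch.cat([strong_logits.unsqueeze(1), experts], dim=1)  # (B, K+1, V)
    return (fusion_weights.unsqueeze(-1) * stacked).sum(dim=1)         # (B, V)


def dynamic_weights(strong_logits, expert_logits_list, temperature=1.0):
    """One illustrative 'dynamic' weighting rule (an assumption, not the paper's):
    give more weight to whichever model is most confident (lowest entropy) on
    the current example."""
    all_logits = [strong_logits] + list(expert_logits_list)
    entropies = torch.stack(
        [-(F.softmax(l, dim=-1) * F.log_softmax(l, dim=-1)).sum(-1) for l in all_logits],
        dim=-1)                                                        # (B, K+1)
    return F.softmax(-entropies / temperature, dim=-1)
```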
- Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z)
- DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV [65.07776277630228]
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide and conquer framework (DCF)
Particularly, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate the tasks to different UAVs.
We also exploit another attention based policy network in our lower-level DRL model to construct the route for each UAV, with the objective to maximize the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z)
- An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems [4.675744559395732]
Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer.
State of the art ML models rely on high customization for each task and leverage size and data scale rather than scaling the number of tasks.
We propose an evolutionary method that can generate a large scale multitask model and can support the dynamic and continuous addition of new tasks.
arXiv Detail & Related papers (2022-05-25T13:10:47Z)
- In Defense of the Unitary Scalarization for Deep Multi-Task Learning [121.76421174107463]
We present a theoretical analysis suggesting that many specialized multi-task optimizers can be interpreted as forms of regularization.
We show that, when coupled with standard regularization and stabilization techniques, unitary scalarization matches or improves upon the performance of complex multi-task optimizers.
arXiv Detail & Related papers (2022-01-11T18:44:17Z)
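For reference, unitary scalarization simply sums the per-task losses and performs a single backward pass, with no per-task gradient manipulation. A minimal sketch follows; the `task_batches` and `loss_fns` containers are hypothetical names, and any regularization is assumed to live inside the model or optimizer.

```python
import torch

def unitary_scalarization_step(model, optimizer, task_batches, loss_fns):
    """One training step of unitary scalarization: sum the per-task losses
    and back-propagate once (minimal sketch)."""
    optimizer.zero_grad()
    total_loss = sum(loss_fns[t](model, batch) for t, batch in task_batches.items())
    total_loss.backward()
    optimizer.step()
    return total_loss.item()
```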
- Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference [75.95287293847697]
Two common challenges in developing multi-task models are often overlooked in the literature.
First, enabling the model to be inherently incremental, continuously incorporating information from new tasks without forgetting the previously learned ones (incremental learning).
Second, eliminating adverse interactions amongst tasks, which have been shown to significantly degrade the single-task performance in a multi-task setup (task interference).
arXiv Detail & Related papers (2020-07-24T14:44:46Z)
- Gradient Surgery for Multi-Task Learning [119.675492088251]
Multi-task learning has emerged as a promising approach for sharing structure across multiple tasks.
The reasons why multi-task learning is so challenging compared to single-task learning are not fully understood.
We propose a form of gradient surgery that projects a task's gradient onto the normal plane of the gradient of any other task that has a conflicting gradient.
arXiv Detail & Related papers (2020-01-19T06:33:47Z)
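The projection step described in this entry (PCGrad-style gradient surgery) can be sketched directly. The helper names below are illustrative, and task gradients are assumed to have been flattened into 1-D tensors.

```python
import torch

def project_conflicting(grad_i, grad_j):
    """If grad_i conflicts with grad_j (negative dot product), remove grad_i's
    component along grad_j, i.e. project it onto grad_j's normal plane."""
    dot = torch.dot(grad_i, grad_j)
    if dot < 0:
        grad_i = grad_i - dot / (grad_j.norm() ** 2 + 1e-12) * grad_j
    return grad_i


def pcgrad_combine(task_grads):
    """Project each task's gradient against every other task's gradient, in
    random order, then sum the projected gradients (minimal sketch)."""
    out = []
    for i, g in enumerate(task_grads):
        g = g.clone()
        others = [j for j in range(len(task_grads)) if j != i]
        for j in torch.randperm(len(others)).tolist():
            g = project_conflicting(g, task_grads[others[j]])
        out.append(g)
    return torch.stack(out).sum(dim=0)
```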