UniMASK: Unified Inference in Sequential Decision Problems
- URL: http://arxiv.org/abs/2211.10869v1
- Date: Sun, 20 Nov 2022 04:54:49 GMT
- Title: UniMASK: Unified Inference in Sequential Decision Problems
- Authors: Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun,
David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca
Dragan, Sam Devlin
- Abstract summary: We introduce the UniMASK framework, which provides a unified way to specify models which can be trained on many different sequential decision-making tasks.
A single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models.
- Score: 17.09745648221254
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Randomly masking and predicting word tokens has been a successful approach in
pre-training language models for a variety of downstream tasks. In this work,
we observe that the same idea also applies naturally to sequential
decision-making, where many well-studied tasks like behavior cloning, offline
reinforcement learning, inverse dynamics, and waypoint conditioning correspond
to different sequence maskings over a sequence of states, actions, and returns.
We introduce the UniMASK framework, which provides a unified way to specify
models which can be trained on many different sequential decision-making tasks.
We show that a single UniMASK model is often capable of carrying out many tasks
with performance similar to or better than single-task models. Additionally,
after fine-tuning, our UniMASK models consistently outperform comparable
single-task models. Our code is publicly available at
https://github.com/micahcarroll/uniMASK.
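
The core idea in the abstract, that tasks like behavior cloning, return-conditioned (offline RL style) inference, and inverse dynamics correspond to different maskings over a sequence of states, actions, and returns, can be sketched concretely. The snippet below is a minimal illustration only, not the UniMASK codebase's actual API: the array shapes, task definitions, and function names are assumptions made for this example.

```python
# Illustrative sketch (not the official UniMASK API): different sequential
# decision-making tasks expressed as boolean masks over a trajectory of
# (state, action, return) tokens. True = observed (given to the model),
# False = hidden (to be predicted). Shapes and task definitions here are
# assumptions for illustration only.
import numpy as np

T = 4  # trajectory length (timesteps)
MODALITIES = ["state", "action", "return"]  # rows of the mask


def empty_mask():
    """All tokens hidden: shape (modalities, timesteps) = (3, T)."""
    return np.zeros((len(MODALITIES), T), dtype=bool)


def behavior_cloning_mask():
    """Observe all states and past actions; predict the final action."""
    m = empty_mask()
    m[0, :] = True        # states observed
    m[1, :-1] = True      # past actions observed; last action hidden
    return m


def inverse_dynamics_mask(t=1):
    """Observe consecutive states s_t and s_{t+1}; predict the action a_t."""
    m = empty_mask()
    m[0, t] = m[0, t + 1] = True
    return m


def return_conditioned_mask():
    """Offline-RL style: observe states and a target return; predict actions."""
    m = empty_mask()
    m[0, :] = True        # states observed
    m[2, 0] = True        # target return-to-go at the first timestep
    return m


if __name__ == "__main__":
    for name, mask in [("behavior cloning", behavior_cloning_mask()),
                       ("inverse dynamics", inverse_dynamics_mask()),
                       ("return-conditioned", return_conditioned_mask())]:
        print(name)
        print(mask.astype(int), "\n")
```

Under this framing, a single model trained on randomly sampled masks can, at inference time, be handed any of these patterns, which is what lets one UniMASK model cover many tasks at once.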
Related papers
- Learning to Decode Collaboratively with Multiple Language Models [37.31339648499042]
We propose a method to teach multiple large language models (LLMs) to collaborate by interleaving their generations at the token level.
Token-level collaboration during decoding allows for a fusion of each model's expertise in a manner tailored to the specific task at hand.
arXiv Detail & Related papers (2024-03-06T17:23:28Z) - SplAgger: Split Aggregation for Meta-Reinforcement Learning [32.25672143072966]
Black-box methods train off-the-shelf sequence models end-to-end.
Task inference methods explicitly infer a posterior distribution over the unknown task.
Recent work has shown that task inference sequence models are not necessary for strong performance.
We present evidence that task inference sequence models are indeed still beneficial.
arXiv Detail & Related papers (2024-03-05T14:57:04Z) - Concrete Subspace Learning based Interference Elimination for Multi-task
Model Fusion [86.6191592951269]
Merging models fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multitask model that performs well across diverse tasks.
We propose the CONtinuous relaxation of disCRETE (Concrete) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to address the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z) - AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either in a task-wise or layer-wise manner, without relying on the original training data.
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z) - Testing the Limits of Unified Sequence to Sequence LLM Pretraining on
Diverse Table Data Tasks [2.690048852269647]
Our work is the first attempt at studying the advantages of a unified approach to table-specific pretraining when scaled from 770M to 11B sequence-to-sequence models.
arXiv Detail & Related papers (2023-10-01T21:06:15Z) - Meta-training with Demonstration Retrieval for Efficient Few-shot
Learning [11.723856248352007]
Large language models show impressive results on few-shot NLP tasks.
However, these models are memory- and computation-intensive.
We propose meta-training with demonstration retrieval.
arXiv Detail & Related papers (2023-06-30T20:16:22Z) - RetICL: Sequential Retrieval of In-Context Examples with Reinforcement Learning [53.52699766206808]
We propose Retrieval for In-Context Learning (RetICL), a learnable method for modeling and optimally selecting examples sequentially for in-context learning.
We evaluate RetICL on math word problem solving and scientific question answering tasks and show that it consistently outperforms or matches learnable baselines.
arXiv Detail & Related papers (2023-05-23T20:15:56Z) - OFASys: A Multi-Modal Multi-Task Learning System for Building Generalist
Models [72.8156832931841]
Generalist models are capable of performing diverse multi-modal tasks in a task-agnostic way within a single model.
We release a generalist model learning system, OFASys, built on top of a declarative task interface named multi-modal instruction.
arXiv Detail & Related papers (2022-12-08T17:07:09Z) - Masked Autoencoding for Scalable and Generalizable Decision Making [93.84855114717062]
MaskDP is a simple and scalable self-supervised pretraining method for reinforcement learning and behavioral cloning.
We find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching.
arXiv Detail & Related papers (2022-11-23T07:04:41Z) - Towards Flexible Inference in Sequential Decision Problems via
Bidirectional Transformers [17.09745648221254]
We introduce the FlexiBiT framework, which provides a unified way to specify models which can be trained on many different sequential decision-making tasks.
A single FlexiBiT model is simultaneously capable of carrying out many tasks with performance similar to or better than specialized models.
arXiv Detail & Related papers (2022-04-28T07:50:08Z) - Train No Evil: Selective Masking for Task-Guided Pre-Training [97.03615486457065]
We propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning.
We show that our method can achieve comparable or even better performance with less than 50% of the cost.
arXiv Detail & Related papers (2020-04-21T03:14:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.