Hierarchical Decision Transformer
- URL: http://arxiv.org/abs/2209.10447v1
- Date: Wed, 21 Sep 2022 15:48:40 GMT
- Title: Hierarchical Decision Transformer
- Authors: André Correia and Luís A. Alexandre
- Abstract summary: This paper presents a hierarchical algorithm for learning a sequence model from demonstrations.
The high-level mechanism guides the low-level controller through the task by selecting sub-goals for the latter to reach.
We validate our method in multiple tasks of the OpenAI Gym, D4RL and RoboMimic benchmarks.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Sequence models in reinforcement learning require task knowledge to estimate
the task policy. This paper presents a hierarchical algorithm for learning a
sequence model from demonstrations. The high-level mechanism guides the
low-level controller through the task by selecting sub-goals for the latter to
reach. This sub-goal sequence replaces the returns-to-go of previous methods,
improving overall performance, especially in tasks with longer episodes and
sparser rewards. We validate our method in multiple tasks of the OpenAI Gym, D4RL and
RoboMimic benchmarks. Our method outperforms the baselines in eight out of ten
tasks of varied horizons and reward frequencies without prior task knowledge,
showing the advantages of the hierarchical model approach for learning from
demonstrations using a sequence model.
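To make the division of labor concrete, the following is a minimal sketch of the control loop the abstract describes: a high-level model proposes sub-goals, and a goal-conditioned low-level controller acts toward them, with the sub-goal taking the place of the Decision Transformer's returns-to-go token. Everything here (module names, dimensions, the MLP stand-ins for the two sequence models, and the toy environment step) is an illustrative assumption, not the authors' implementation.

```python
# Sketch of the hierarchical control loop from the abstract. All names and
# shapes are hypothetical; small MLPs stand in for the paper's sequence models.
import torch
import torch.nn as nn

STATE_DIM, GOAL_DIM, ACT_DIM = 11, 11, 3  # hypothetical dimensions

class SubGoalPredictor(nn.Module):
    """High-level mechanism: proposes the next sub-goal from the current state.
    The paper conditions a sequence model on the trajectory history instead."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, GOAL_DIM))

    def forward(self, state):
        return self.net(state)

class GoalConditionedController(nn.Module):
    """Low-level controller: predicts an action from (sub-goal, state).
    The sub-goal plays the role returns-to-go play in a Decision Transformer."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(GOAL_DIM + STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM))

    def forward(self, goal, state):
        return self.net(torch.cat([goal, state], dim=-1))

high, low = SubGoalPredictor(), GoalConditionedController()
state = torch.zeros(1, STATE_DIM)  # stand-in for an environment observation
for t in range(5):  # rollout: the high level guides, the low level acts
    goal = high(state)
    action = low(goal, state)
    state = state + 0.1 * torch.randn(1, STATE_DIM)  # stand-in env transition
```

Both models would be trained from demonstration trajectories (sub-goals taken from future states, actions supervised directly), which is why no reward signal or returns-to-go estimate is needed at inference time.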
Related papers
- KOPPA: Improving Prompt-based Continual Learning with Key-Query Orthogonal Projection and Prototype-based One-Versus-All [24.50129285997307]
We introduce a novel key-query learning strategy to enhance prompt matching efficiency and address the challenge of shifting features.
Our method surpasses current state-of-the-art approaches by a large margin of up to 20%.
arXiv Detail & Related papers (2023-11-26T20:35:19Z) - AdaMerging: Adaptive Model Merging for Multi-Task Learning [68.75885518081357]
This paper introduces an innovative technique called Adaptive Model Merging (AdaMerging).
It aims to autonomously learn the coefficients for model merging, either task-wise or layer-wise, without relying on the original training data (a toy sketch of such coefficient learning appears after this list).
Compared to the current state-of-the-art task arithmetic merging scheme, AdaMerging showcases a remarkable 11% improvement in performance.
arXiv Detail & Related papers (2023-10-04T04:26:33Z) - One-Shot Pruning for Fast-adapting Pre-trained Models on Devices [28.696989086706186]
Large-scale pre-trained models have been remarkably successful in resolving downstream tasks.
However, deploying these models on low-capability devices still requires an effective approach, such as model pruning.
We present a scalable one-shot pruning method that leverages pruned knowledge of similar tasks to extract a sub-network from the pre-trained model for a new task.
arXiv Detail & Related papers (2023-07-10T06:44:47Z) - Voting from Nearest Tasks: Meta-Vote Pruning of Pre-trained Models for Downstream Tasks [55.431048995662714]
We create a small model for a new task from the pruned models of similar tasks.
We show that a few fine-tuning steps on this model suffice to produce a promising pruned model for the new task.
We develop a simple but effective method, Meta-Vote Pruning (MVP), which significantly reduces the pruning iterations for a new task.
arXiv Detail & Related papers (2023-01-27T06:49:47Z) - Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers [6.744385328015561]
Task-oriented dialogue systems, such as Apple Siri and Amazon Alexa, require a semantic parsing module in order to process user utterances and understand the action to be performed.
This semantic parsing component was initially implemented by rule-based or statistical slot-filling approaches for processing simple queries.
In this article, we advance the research on shift-reduce semantic parsing for task-oriented dialogue.
arXiv Detail & Related papers (2022-10-21T14:19:47Z) - Task Adaptive Parameter Sharing for Multi-Task Learning [114.80350786535952]
Task Adaptive Parameter Sharing (TAPS) is a method for tuning a base model to a new task by adaptively modifying a small, task-specific subset of layers.
Compared to other methods, TAPS retains high accuracy on downstream tasks while introducing few task-specific parameters.
We evaluate our method on a suite of fine-tuning tasks and architectures (ResNet, DenseNet, ViT) and show that it achieves state-of-the-art performance while being simple to implement.
arXiv Detail & Related papers (2022-03-30T23:16:07Z) - Hierarchical Learning for Generation with Long Source Sequences [4.851392124435261]
We design and study a new Hierarchical Attention Transformer-based architecture (HAT) that outperforms standard Transformers on several sequence to sequence tasks.
Our model achieves state-of-the-art results on four summarization tasks, including arXiv, CNN/DM, SAMSum, and AMI, and we push the PubMed R1 & R2 SOTA further.
arXiv Detail & Related papers (2021-04-15T15:57:32Z) - Few-shot Sequence Learning with Transformers [79.87875859408955]
Few-shot algorithms aim at learning new tasks provided only a handful of training examples.
In this work we investigate few-shot learning in the setting where the data points are sequences of tokens.
We propose an efficient learning algorithm based on Transformers.
arXiv Detail & Related papers (2020-12-17T12:30:38Z) - Model-based Adversarial Meta-Reinforcement Learning [38.28304764312512]
We propose Model-based Adversarial Meta-Reinforcement Learning (AdMRL).
AdMRL aims to minimize the worst-case sub-optimality gap across all tasks in a family of tasks.
We evaluate our approach on several continuous control benchmarks and demonstrate its efficacy in the worst-case performance over all tasks.
arXiv Detail & Related papers (2020-06-16T02:21:49Z) - Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
arXiv Detail & Related papers (2020-03-14T22:29:50Z) - Hierarchical Reinforcement Learning as a Model of Human Task Interleaving [60.95424607008241]
We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
arXiv Detail & Related papers (2020-01-04T17:53:28Z)
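As referenced in the AdaMerging entry above, here is a toy sketch of learning task-wise merging coefficients over task vectors without labels. The tiny linear models, the shapes, and the entropy-minimization surrogate objective on unlabeled inputs are all assumptions for illustration, not the paper's implementation.

```python
# Toy sketch of task-wise model merging with learnable coefficients:
# merged = base + sum_k lambda_k * (theta_k - base)  (task arithmetic),
# with lambdas tuned on unlabeled data via entropy minimization.
import torch
import torch.nn as nn
import torch.nn.functional as F

def merge_state_dicts(base, task_models, lambdas):
    """Combine per-task weight deltas with learnable coefficients."""
    return {name: p0 + sum(l * (m[name] - p0)
                           for l, m in zip(lambdas, task_models))
            for name, p0 in base.items()}

torch.manual_seed(0)
# One base model and two "fine-tuned" variants, all hypothetical stand-ins.
base = {k: v.detach().clone() for k, v in nn.Linear(4, 3).state_dict().items()}
tasks = [{k: v.detach().clone() for k, v in nn.Linear(4, 3).state_dict().items()}
         for _ in range(2)]

lambdas = nn.Parameter(torch.full((2,), 0.3))  # task-wise coefficients
opt = torch.optim.Adam([lambdas], lr=1e-2)
x = torch.randn(32, 4)  # stand-in for unlabeled test inputs

for step in range(100):
    sd = merge_state_dicts(base, tasks, lambdas)
    logits = F.linear(x, sd["weight"], sd["bias"])
    probs = logits.softmax(dim=-1)
    # Confident (low-entropy) predictions serve as the unsupervised signal.
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1).mean()
    opt.zero_grad()
    entropy.backward()
    opt.step()

print("learned task-wise coefficients:", lambdas.data)
```

A layer-wise variant would simply use one coefficient per (task, layer) pair instead of one per task, at the cost of more parameters to tune.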