A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration
- URL: http://arxiv.org/abs/2209.04100v1
- Date: Fri, 9 Sep 2022 03:02:49 GMT
- Title: A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration
- Authors: Xianqi Zhang, Xingtao Wang, Xu Liu, Xiaopeng Fan and Debin Zhao
- Abstract summary: In contrast to imitation learning, there is no expert data, only the data collected through environmental exploration.
Since the action sequence to solve the new task may be the combination of trajectory segments of multiple training tasks, the test task and the solving strategy do not exist directly in the training data.
We propose a Memory-related Multi-task Method (M3) to address this problem.
- Score: 26.17597857264231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We pose a new question: Can agents learn how to combine actions from previous
tasks to complete new tasks, just as humans? In contrast to imitation learning,
there is no expert data, only the data collected through environmental
exploration. Compared with offline reinforcement learning, the problem of data
distribution shift is more serious. Since the action sequence to solve the new
task may be the combination of trajectory segments of multiple training tasks,
in other words, the test task and the solving strategy do not exist directly in
the training data. This makes the problem more difficult. We propose a
Memory-related Multi-task Method (M3) to address this problem. The method
consists of three stages. First, task-agnostic exploration is carried out to
collect data. Different from previous methods, we organize the exploration data
into a knowledge graph. We design a model based on the exploration data to
extract action effect features and save them in memory, while an action
predictive model is trained. Secondly, for a new task, the action effect
features stored in memory are used to generate candidate actions by a feature
decomposition-based approach. Finally, a multi-scale candidate action pool and
the action predictive model are fused to generate a strategy to complete the
task. Experimental results show that the performance of our proposed method is
significantly improved compared with the baseline.
Related papers
- Reducing catastrophic forgetting of incremental learning in the absence of rehearsal memory with task-specific token [0.6144680854063939]
Deep learning models display catastrophic forgetting when learning new data continuously.
We present a novel method that preserves previous knowledge without storing previous data.
This method is inspired by the architecture of a vision transformer and employs a unique token capable of encapsulating the compressed knowledge of each task.
arXiv Detail & Related papers (2024-11-06T16:13:50Z) - Data-CUBE: Data Curriculum for Instruction-based Sentence Representation
Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
In the task level, we aim to find the optimal task order to minimize the total cross-task interference risk.
In the instance level, we measure the difficulty of all instances per task, then divide them into the easy-to-difficult mini-batches for training.
arXiv Detail & Related papers (2024-01-07T18:12:20Z) - Divide and Conquer: Hybrid Pre-training for Person Search [40.13016375392472]
We propose a hybrid pre-training framework specifically designed for person search using sub-task data only.
Our model can achieve significant improvements across diverse protocols, such as person search method, fine-tuning data, pre-training data and model backbone.
Our code and pre-trained models are released for plug-and-play usage to the person search community.
arXiv Detail & Related papers (2023-12-13T08:33:50Z) - Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled
Datasets [73.2096288987301]
We propose a simple approach that uses a small amount of downstream expert data to selectively query relevant behaviors from an offline, unlabeled dataset.
We observe that our method learns to query only the relevant transitions to the task, filtering out sub-optimal or task-irrelevant data.
Our simple querying approach outperforms more complex goal-conditioned methods by 20% across simulated and real robotic manipulation tasks from images.
arXiv Detail & Related papers (2023-04-18T05:42:53Z) - Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z) - Instance-Level Task Parameters: A Robust Multi-task Weighting Framework [17.639472693362926]
Recent works have shown that deep neural networks benefit from multi-task learning by learning a shared representation across several related tasks.
We let the training process dictate the optimal weighting of tasks for every instance in the dataset.
We conduct extensive experiments on SURREAL and CityScapes datasets, for human shape and pose estimation, depth estimation and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-11T02:35:42Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z) - COG: Connecting New Skills to Past Experience with Offline Reinforcement
Learning [78.13740204156858]
We show that we can reuse prior data to extend new skills simply through dynamic programming.
We demonstrate the effectiveness of our approach by chaining together several behaviors seen in prior datasets for solving a new task.
We train our policies in an end-to-end fashion, mapping high-dimensional image observations to low-level robot control commands.
arXiv Detail & Related papers (2020-10-27T17:57:29Z) - Generalized Hindsight for Reinforcement Learning [154.0545226284078]
We argue that low-reward data collected while trying to solve one task provides little to no signal for solving that particular task.
We present Generalized Hindsight: an approximate inverse reinforcement learning technique for relabeling behaviors with the right tasks.
arXiv Detail & Related papers (2020-02-26T18:57:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.