Combined Model for Partially-Observable and Non-Observable Task
Switching: Solving Hierarchical Reinforcement Learning Problems Statically
and Dynamically with Transfer Learning
- URL: http://arxiv.org/abs/2004.06213v2
- Date: Wed, 22 Apr 2020 18:45:51 GMT
- Authors: Nibraas Khan and Joshua Phillips
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An integral function of fully autonomous robots and humans is the ability to
focus attention on a few relevant percepts to reach a certain goal while
disregarding irrelevant percepts. Humans and animals rely on the interactions
between the Pre-Frontal Cortex (PFC) and the Basal Ganglia (BG) to achieve this
focus, a capability known as Working Memory (WM). The Working Memory Toolkit (WMtk) was
developed based on a computational neuroscience model of this phenomenon with
Temporal Difference (TD) Learning for autonomous systems. Recent adaptations of
the toolkit either utilize Abstract Task Representations (ATRs) to solve
Non-Observable (NO) tasks or storage of past input features to solve
Partially-Observable (PO) tasks, but not both. We propose a new model,
PONOWMtk, which combines both approaches, ATRs and input storage, with a static
or dynamic number of ATRs. The results of our experiments show that PONOWMtk
performs effectively for tasks that exhibit PO, NO, or both properties.
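The core mechanism the abstract describes, TD learning applied to working-memory gating, can be sketched in miniature. The following is an illustrative toy, not the WMtk itself: the delayed-response task, the state encoding, and every name in it are assumptions made for this example.

```python
import random

# Hypothetical sketch (not the actual WMtk API): TD(0) learning over
# working-memory gating decisions. A cue appears, then disappears; after
# a delay the agent must reproduce it, so holding the cue in WM is the
# only route to reliable reward.

random.seed(0)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
V = {}  # value estimate for each (wm_contents, phase) conjunctive state

def td_update(s, s_next, reward):
    # Standard TD(0): V(s) += alpha * (r + gamma * V(s') - V(s))
    V[s] = V.get(s, 0.0) + ALPHA * (
        reward + GAMMA * V.get(s_next, 0.0) - V.get(s, 0.0)
    )

def run_trial():
    cue = random.choice(["A", "B"])
    # Gating decision: either store the cue in WM or store nothing,
    # chosen epsilon-greedily by the learned values of the resulting states.
    candidates = [cue, None]
    random.shuffle(candidates)  # break ties randomly before learning
    if random.random() < EPS:
        wm = random.choice(candidates)
    else:
        wm = max(candidates, key=lambda c: V.get((c, "delay"), 0.0))
    # After the delay the cue itself is gone; only WM remains observable.
    action = wm if wm is not None else random.choice(["A", "B"])
    reward = 1.0 if action == cue else 0.0
    td_update((wm, "delay"), (wm, "end"), reward)
    return reward

rewards = [run_trial() for _ in range(2000)]
late_success = sum(rewards[-200:]) / 200
print(f"success rate over last 200 trials: {late_success:.2f}")
```

The agent never learns actions directly; as in the WM-gating literature, the TD signal trains which candidate contents to gate into working memory, and behavior follows from what is stored.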
Related papers
- Sparse Multitask Learning for Efficient Neural Representation of Motor
Imagery and Execution [30.186917337606477]
We introduce a sparse multitask learning framework for motor imagery (MI) and motor execution (ME) tasks.
Given a dual-task CNN model for MI-ME classification, we apply a saliency-based sparsification approach to prune superfluous connections.
Our results indicate that this tailored sparsity can mitigate the overfitting problem and improve the test performance with a small amount of data.
arXiv Detail & Related papers (2023-12-10T09:06:16Z)
- Towards a Unified Transformer-based Framework for Scene Graph Generation
and Human-object Interaction Detection [116.21529970404653]
We introduce SG2HOI+, a unified one-step model based on the Transformer architecture.
Our approach employs two interactive hierarchical Transformers to seamlessly unify the tasks of SGG and HOI detection.
Our approach achieves competitive performance when compared to state-of-the-art HOI methods.
arXiv Detail & Related papers (2023-11-03T07:25:57Z)
- Task-Aware Asynchronous Multi-Task Model with Class Incremental
Contrastive Learning for Surgical Scene Understanding [17.80234074699157]
A multi-task learning model is proposed for surgical report generation and tool-tissue interaction prediction.
The model consists of a shared feature extractor, a mesh-transformer branch for captioning, and a graph attention branch for tool-tissue interaction prediction.
We incorporate a task-aware asynchronous MTL optimization technique to fine-tune the shared weights and converge both tasks optimally.
arXiv Detail & Related papers (2022-11-28T14:08:48Z)
- The impact of memory on learning sequence-to-sequence tasks [6.603326895384289]
Recent success of neural networks in natural language processing has drawn renewed attention to learning sequence-to-sequence (seq2seq) tasks.
We propose a model for a seq2seq task that has the advantage of providing explicit control over the degree of memory, or non-Markovianity, in the sequences.
arXiv Detail & Related papers (2022-05-29T14:57:33Z)
- Continual Object Detection via Prototypical Task Correlation Guided
Gating Mechanism [120.1998866178014]
We present a flexible framework for continual object detection via pRotOtypical taSk corrElaTion guided gaTing mechAnism (ROSETTA).
Concretely, a unified framework is shared by all tasks while task-aware gates are introduced to automatically select sub-models for specific tasks.
Experiments on COCO-VOC, KITTI-Kitchen, class-incremental detection on VOC and sequential learning of four tasks show that ROSETTA yields state-of-the-art performance.
arXiv Detail & Related papers (2022-05-06T07:31:28Z) - SMEMO: Social Memory for Trajectory Forecasting [34.542209630734234]
We present a neural network based on an end-to-end trainable working memory, which acts as an external storage.
We show that our method is capable of learning explainable cause-effect relationships between motions of different agents, obtaining state-of-the-art results on trajectory forecasting datasets.
arXiv Detail & Related papers (2022-03-23T14:40:20Z) - Learning Bayesian Sparse Networks with Full Experience Replay for
Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
arXiv Detail & Related papers (2022-02-21T13:25:03Z) - Domain Adaptive Robotic Gesture Recognition with Unsupervised
Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by using temporal cues in videos and inherent correlations in multi-modal signals towards recognizing gestures.
Results show that our approach recovers the performance with great improvement gains, up to 12.91% in ACC and 20.16% in F1-score, without using any annotations on the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z) - Multi-task Supervised Learning via Cross-learning [102.64082402388192]
We consider a problem known as multi-task learning, consisting of fitting a set of regression functions intended for solving different tasks.
In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other.
This facilitates cross-fertilization in which data collected across different domains help improving the learning performance at each other task.
arXiv Detail & Related papers (2020-10-24T21:35:57Z) - Efficient Inference of Flexible Interaction in Spiking-neuron Networks [41.83710212492543]
We use the nonlinear Hawkes process to model excitatory or inhibitory interactions among neurons.
We show our algorithm can estimate the temporal dynamics of interaction and reveal the interpretable functional connectivity underlying neural spike trains.
arXiv Detail & Related papers (2020-06-23T09:10:30Z) - Task-Feature Collaborative Learning with Application to Personalized
Attribute Prediction [166.87111665908333]
We propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL).
Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks.
As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks.
arXiv Detail & Related papers (2020-04-29T02:32:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.