Hierarchical Reinforcement Learning as a Model of Human Task Interleaving
- URL: http://arxiv.org/abs/2001.02122v1
- Date: Sat, 4 Jan 2020 17:53:28 GMT
- Authors: Christoph Gebhardt, Antti Oulasvirta, Otmar Hilliges
- Abstract summary: We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
- Score: 60.95424607008241
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How do people decide how long to continue in a task, when to switch, and to
which other task? Understanding the mechanisms that underpin task interleaving
is a long-standing goal in the cognitive sciences. Prior work suggests greedy
heuristics and a policy maximizing the marginal rate of return. However, it is
unclear how such a strategy would allow for adaptation to everyday environments
that offer multiple tasks with complex switch costs and delayed rewards. Here
we develop a hierarchical model of supervisory control driven by reinforcement
learning (RL). The supervisory level learns to switch using task-specific
approximate utility estimates, which are computed on the lower level. A
hierarchically optimal value function decomposition can be learned from
experience, even in conditions with multiple tasks and arbitrary and uncertain
reward and cost structures. The model reproduces known empirical effects of
task interleaving. It yields better predictions of individual-level data than a
myopic baseline in a six-task problem (N=211). The results support hierarchical
RL as a plausible model of task interleaving.
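
The contrast the abstract draws between a myopic policy and a supervisory level that uses lower-level utility estimates can be illustrated with a toy sketch. This is not the authors' model: the `Task` class, the fixed reward sequences, the discount factor `gamma`, and both choice functions are hypothetical, and the utilities are computed exactly here rather than learned from experience with RL.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Task:
    """Hypothetical toy task: a fixed sequence of per-step rewards."""
    rewards: Sequence[float]  # reward received at each step of the task
    switch_cost: float        # cost paid when switching into this task


def myopic_choice(tasks: Sequence[Task], current: int) -> int:
    """Greedy baseline: pick the task with the best immediate net reward."""
    def immediate(i: int) -> float:
        cost = 0.0 if i == current else tasks[i].switch_cost
        return tasks[i].rewards[0] - cost
    return max(range(len(tasks)), key=immediate)


def task_utility(task: Task, gamma: float = 0.95) -> float:
    """Lower level (assumed form): discounted return of finishing the task."""
    return sum(r * gamma ** t for t, r in enumerate(task.rewards))


def supervisory_choice(tasks: Sequence[Task], current: int,
                       gamma: float = 0.95) -> int:
    """Upper level: pick the task whose utility minus switch cost is highest."""
    def value(i: int) -> float:
        cost = 0.0 if i == current else tasks[i].switch_cost
        return task_utility(tasks[i], gamma) - cost
    return max(range(len(tasks)), key=value)


tasks = [
    Task(rewards=[1.0, 0.0, 0.0], switch_cost=0.5),  # front-loaded reward
    Task(rewards=[0.0, 0.0, 5.0], switch_cost=0.5),  # delayed reward
]

# Starting in task 0, the two policies disagree on what to do next.
print(myopic_choice(tasks, current=0))
print(supervisory_choice(tasks, current=0))
```

In this toy setting the myopic rule stays in the front-loaded task because switching costs more than it pays immediately, while the supervisory rule switches because the delayed reward outweighs the switch cost once the whole task is valued.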
Related papers
- POMRL: No-Regret Learning-to-Plan with Increasing Horizons [43.693739167594295]
We study the problem of planning under model uncertainty in an online meta-reinforcement learning setting.
We propose an algorithm to meta-learn the underlying structure across tasks, utilize it to plan in each task, and upper-bound the regret of the planning loss.
arXiv Detail & Related papers (2022-12-30T03:09:45Z)
- An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems [4.675744559395732]
Multitask learning assumes that models capable of learning from multiple tasks can achieve better quality and efficiency via knowledge transfer.
State-of-the-art ML models rely on high customization for each task and leverage size and data scale rather than scaling the number of tasks.
We propose an evolutionary method that can generate a large scale multitask model and can support the dynamic and continuous addition of new tasks.
arXiv Detail & Related papers (2022-05-25T13:10:47Z)
- New Tight Relaxations of Rank Minimization for Multi-Task Learning [161.23314844751556]
We propose two novel multi-task learning formulations based on two regularization terms.
We show that our methods can correctly recover the low-rank structure shared across tasks, and outperform related multi-task learning methods.
arXiv Detail & Related papers (2021-12-09T07:29:57Z)
- Conflict-Averse Gradient Descent for Multi-task Learning [56.379937772617]
A major challenge in optimizing a multi-task model is the conflicting gradients.
We introduce Conflict-Averse Gradient descent (CAGrad), which minimizes the average loss function.
CAGrad balances the objectives automatically and still provably converges to a minimum over the average loss.
arXiv Detail & Related papers (2021-10-26T22:03:51Z)
- Instance-Level Task Parameters: A Robust Multi-task Weighting Framework [17.639472693362926]
Recent works have shown that deep neural networks benefit from multi-task learning by learning a shared representation across several related tasks.
We let the training process dictate the optimal weighting of tasks for every instance in the dataset.
We conduct extensive experiments on SURREAL and CityScapes datasets, for human shape and pose estimation, depth estimation and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-11T02:35:42Z)
- Learning Task Decomposition with Ordered Memory Policy Network [73.3813423684999]
We propose Ordered Memory Policy Network (OMPN) to discover subtask hierarchy by learning from demonstration.
OMPN can be applied to partially observable environments and still achieve higher task decomposition performance.
Our visualization confirms that the subtask hierarchy can emerge in our model.
arXiv Detail & Related papers (2021-03-19T18:13:35Z)
- Hierarchical Reinforcement Learning By Discovering Intrinsic Options [18.041140234312934]
HIDIO can learn task-agnostic options in a self-supervised manner while jointly learning to utilize them to solve sparse-reward tasks.
In experiments on sparse-reward robotic manipulation and navigation tasks, HIDIO achieves higher success rates with greater sample efficiency.
arXiv Detail & Related papers (2021-01-16T20:54:31Z)
- Learned Weight Sharing for Deep Multi-Task Learning by Natural Evolution Strategy and Stochastic Gradient Descent [0.0]
We propose an algorithm to learn the assignment between a shared set of weights and task-specific layers.
Learning takes place via a combination of natural evolution strategy and gradient descent.
The end result is a set of task-specific networks that share weights but allow independent inference.
arXiv Detail & Related papers (2020-03-23T10:21:44Z)
- Generalized Hindsight for Reinforcement Learning [154.0545226284078]
We argue that low-reward data collected while trying to solve one task provides little to no signal for solving that particular task.
We present Generalized Hindsight: an approximate inverse reinforcement learning technique for relabeling behaviors with the right tasks.
arXiv Detail & Related papers (2020-02-26T18:57:05Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experimental results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.