Experience-driven discovery of planning strategies
- URL: http://arxiv.org/abs/2412.03111v1
- Date: Wed, 04 Dec 2024 08:20:03 GMT
- Title: Experience-driven discovery of planning strategies
- Authors: Ruiqi He, Falk Lieder
- Abstract summary: We show that new planning strategies are discovered through metacognitive reinforcement learning.
When fitted to human data, these models exhibit a slower discovery rate than humans, leaving room for improvement.
- Score: 0.9821874476902969
- Abstract: One explanation for how people can plan efficiently despite limited cognitive resources is that we possess a set of adaptive planning strategies and know when and how to use them. But how are these strategies acquired? While previous research has studied how individuals learn to choose among existing strategies, little is known about the process of forming new planning strategies. In this work, we propose that new planning strategies are discovered through metacognitive reinforcement learning. To test this, we designed a novel experiment to investigate the discovery of new planning strategies. We then present metacognitive reinforcement learning models and demonstrate their capability for strategy discovery as well as show that they provide a better explanation of human strategy discovery than alternative learning mechanisms. However, when fitted to human data, these models exhibit a slower discovery rate than humans, leaving room for improvement.
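The core idea of metacognitive reinforcement learning can be illustrated with a minimal sketch. Everything below is a hypothetical toy setup for illustration, not the paper's actual model or experiment: a meta-level agent can pay a small cost to inspect option values before committing to the best value it has seen, and a plain REINFORCE update adjusts a single logistic parameter governing its propensity to keep planning.

```python
import math
import random

random.seed(0)

INSPECT_COST = 0.1   # cost of one planning operation (assumed value)
N_OPTIONS = 3        # number of option values that can be inspected

def run_episode(theta):
    """Run one episode under a logistic policy sigma(theta) over the
    meta-level action 'inspect one more option' vs 'act now'.
    Returns the log-policy gradient terms and the meta-level return."""
    values = [random.gauss(0.0, 1.0) for _ in range(N_OPTIONS)]
    seen, grads, cost = [], [], 0.0
    for i in range(N_OPTIONS):
        p_inspect = 1.0 / (1.0 + math.exp(-theta))
        if random.random() < p_inspect:
            grads.append(1.0 - p_inspect)   # d/dtheta log p(inspect)
            seen.append(values[i])
            cost += INSPECT_COST
        else:
            grads.append(-p_inspect)        # d/dtheta log p(act now)
            break
    payoff = max(seen) if seen else 0.0     # commit to the best value seen
    return grads, payoff - cost

def train(episodes=3000, lr=0.1):
    """REINFORCE on the meta-level policy parameter theta."""
    theta = 0.0
    for _ in range(episodes):
        grads, ret = run_episode(theta)
        for g in grads:
            theta += lr * g * ret
    return theta

theta = train()  # theta > 0 means the agent learned that planning pays off here
```

In this toy setting the expected payoff of inspecting two or three options outweighs the inspection costs, so the learned theta drifts positive; raise the per-inspection cost above the marginal value of information and the same update would instead drive the agent toward acting immediately, mirroring how the cost-benefit structure of a task shapes which planning strategies get discovered.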
Related papers
- Program-Based Strategy Induction for Reinforcement Learning [5.657991642023959]
We use Bayesian program induction to discover strategies implemented by programs, letting the simplicity of strategies trade off against their effectiveness.
We find strategies that are difficult or unexpected with classical incremental learning, like asymmetric learning from rewarded and unrewarded trials, adaptive horizon-dependent random exploration, and discrete state switching.
arXiv Detail & Related papers (2024-02-26T15:40:46Z) - Risk-reducing design and operations toolkit: 90 strategies for managing risk and uncertainty in decision problems [65.268245109828]
This paper develops a catalog of such strategies and develops a framework for them.
It argues that they provide an efficient response to decision problems that are seemingly intractable due to high uncertainty.
It then proposes a framework to incorporate them into decision theory using multi-objective optimization.
arXiv Detail & Related papers (2023-09-06T16:14:32Z) - Strategy Extraction in Single-Agent Games [0.19336815376402716]
We propose an approach to knowledge transfer using behavioural strategies as a form of transferable knowledge influenced by the human cognitive ability to develop strategies.
We show that our method can identify plausible strategies in three environments: Pacman, Bank Heist and a dungeon-crawling video game.
arXiv Detail & Related papers (2023-05-22T01:28:59Z) - Anti-Retroactive Interference for Lifelong Learning [65.50683752919089]
We design a paradigm for lifelong learning based on meta-learning and associative mechanism of the brain.
It tackles the problem from two aspects: extracting knowledge and memorizing knowledge.
Theoretical analysis shows that the proposed learning paradigm makes the models of different tasks converge to the same optimum.
arXiv Detail & Related papers (2022-08-27T09:27:36Z) - A Closer Look at Knowledge Distillation with Features, Logits, and Gradients [81.39206923719455]
Knowledge distillation (KD) is a substantial strategy for transferring learned knowledge from one neural network model to another.
This work provides a new perspective to motivate a set of knowledge distillation strategies by approximating the classical KL-divergence criteria with different knowledge sources.
Our analysis indicates that logits are generally a more efficient knowledge source and suggests that having sufficient feature dimensions is crucial for the model design.
arXiv Detail & Related papers (2022-03-18T21:26:55Z) - Have I done enough planning or should I plan more? [0.7734726150561086]
We show that people acquire this ability through learning and reverse-engineer the underlying learning mechanisms.
We find that people quickly adapt how much planning they perform to the cost and benefit of planning.
Our results suggest that the metacognitive ability to adjust the amount of planning might be learned through a policy-gradient mechanism.
arXiv Detail & Related papers (2022-01-03T17:11:07Z) - Procedure Planning in Instructional Videos via Contextual Modeling and Model-based Policy Learning [114.1830997893756]
This work focuses on learning a model to plan goal-directed actions in real-life videos.
We propose novel algorithms to model human behaviors through Bayesian Inference and model-based Imitation Learning.
arXiv Detail & Related papers (2021-10-05T01:06:53Z) - Automatic discovery and description of human planning strategies [0.7734726150561086]
We leverage AI for strategy discovery for understanding human planning.
Our algorithm, called Human-Interpret, uses imitation learning to describe process-tracing data.
We find that the descriptions of human planning strategies obtained automatically are about as understandable as human-generated descriptions.
arXiv Detail & Related papers (2021-09-29T15:20:16Z) - Insights into Data through Model Behaviour: An Explainability-driven Strategy for Data Auditing for Responsible Computer Vision Applications [70.92379567261304]
This study explores an explainability-driven strategy for data auditing.
We demonstrate this strategy by auditing two popular medical benchmark datasets.
We discover hidden data quality issues that lead deep learning models to make predictions for the wrong reasons.
arXiv Detail & Related papers (2021-06-16T23:46:39Z) - Improving Human Decision-Making by Discovering Efficient Strategies for Hierarchical Planning [0.6882042556551609]
People need efficient planning strategies because their computational resources are limited.
Our ability to compute those strategies used to be limited to very small and very simple planning tasks.
We introduce a cognitively-inspired reinforcement learning method that can overcome this limitation.
arXiv Detail & Related papers (2021-01-31T19:46:00Z) - Latent Skill Planning for Exploration and Transfer [49.25525932162891]
In this paper, we investigate how these two approaches can be integrated into a single reinforcement learning agent.
We leverage the idea of partial amortization for fast adaptation at test time.
We demonstrate the benefits of our design decisions across a suite of challenging locomotion tasks.
arXiv Detail & Related papers (2020-11-27T18:40:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.