Learning Generalizable Behavior via Visual Rewrite Rules
- URL: http://arxiv.org/abs/2112.05218v1
- Date: Thu, 9 Dec 2021 21:23:26 GMT
- Title: Learning Generalizable Behavior via Visual Rewrite Rules
- Authors: Yiheng Xie, Mingxuan Li, Shangqun Yu, Michael Littman
- Abstract summary: In this paper, we propose a novel representation and learning approach to capture environment dynamics without using neural networks.
It originates from the observation that, in games designed for people, the effect of an action can often be perceived in the form of local changes in consecutive visual observations.
Our algorithm is designed to extract such vision-based changes and condense them into a set of action-dependent descriptive rules, which we call "visual rewrite rules" (VRRs).
- Score: 0.9558392439655015
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Though deep reinforcement learning agents have achieved unprecedented success
in recent years, their learned policies can be brittle, failing to generalize
to even slight modifications of their environments or unfamiliar situations.
The black-box nature of the neural network learning dynamics makes it
impossible to audit trained deep agents and recover from such failures. In this
paper, we propose a novel representation and learning approach to capture
environment dynamics without using neural networks. It originates from the
observation that, in games designed for people, the effect of an action can
often be perceived in the form of local changes in consecutive visual
observations. Our algorithm is designed to extract such vision-based changes
and condense them into a set of action-dependent descriptive rules, which we
call "visual rewrite rules" (VRRs). We also present preliminary results from
a VRR agent that can explore, expand its rule set, and solve a game via
planning with its learned VRR world model. In several classical games, our
non-deep agent demonstrates superior performance, extreme sample efficiency,
and robust generalization ability compared with several mainstream deep agents.
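The abstract describes the core mechanism: compare consecutive observations, isolate the local region an action changed, and store that before/after patch as an action-dependent rewrite rule that can later be matched and applied for planning. Below is a minimal sketch of that idea on symbolic grids; the grid encoding and the function names are our own illustration of the described mechanism, not the paper's actual implementation.

```python
from typing import Optional, Tuple

Grid = Tuple[Tuple[str, ...], ...]
# A rule is a (before_patch, after_patch) pair of equally sized sub-grids.
Rule = Tuple[Grid, Grid]

def extract_rule(before: Grid, after: Grid) -> Optional[Rule]:
    """Condense the local change between two consecutive observations into a
    rewrite rule: the smallest bounding box enclosing all changed cells."""
    changed = [(r, c)
               for r in range(len(before)) for c in range(len(before[0]))
               if before[r][c] != after[r][c]]
    if not changed:
        return None  # the action had no visible effect
    r0, r1 = min(r for r, _ in changed), max(r for r, _ in changed)
    c0, c1 = min(c for _, c in changed), max(c for _, c in changed)
    crop = lambda g: tuple(tuple(g[r][c0:c1 + 1]) for r in range(r0, r1 + 1))
    return crop(before), crop(after)

def apply_rule(obs: Grid, rule: Rule) -> Optional[Grid]:
    """Predict the next observation: find where the rule's 'before' patch
    matches and rewrite that region with the 'after' patch."""
    pat, rep = rule
    h, w = len(pat), len(pat[0])
    rows = [list(row) for row in obs]
    for r in range(len(obs) - h + 1):
        for c in range(len(obs[0]) - w + 1):
            if all(obs[r + i][c + j] == pat[i][j]
                   for i in range(h) for j in range(w)):
                for i in range(h):
                    for j in range(w):
                        rows[r + i][c + j] = rep[i][j]
                return tuple(tuple(row) for row in rows)
    return None  # rule does not apply to this observation

# Example: an agent 'A' moving right in a 1x3 corridor yields the rule
# ('A', '.') -> ('.', 'A'), which then predicts the next step elsewhere.
rule = extract_rule((("A", ".", "."),), ((".", "A", "."),))
prediction = apply_rule(((".", "A", "."),), rule)
```

In this toy example the learned rule generalizes by construction: it matches anywhere the local pattern recurs, which is the kind of position-independent reuse the abstract attributes to VRR agents. A planner would then chain `apply_rule` calls over its rule set to search for action sequences reaching a goal observation.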
Related papers
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable neural networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z) - Vision-Language Models Provide Promptable Representations for Reinforcement Learning [67.40524195671479]
We propose a novel approach that uses the vast amounts of general and indexable world knowledge encoded in vision-language models (VLMs) pre-trained on Internet-scale data for embodied reinforcement learning (RL)
We show that our approach can use chain-of-thought prompting to produce representations of common-sense semantic reasoning, improving policy performance in novel scenes by 1.5 times.
arXiv Detail & Related papers (2024-02-05T00:48:56Z) - Learning of Generalizable and Interpretable Knowledge in Grid-Based Reinforcement Learning Environments [5.217870815854702]
We propose using program synthesis to imitate reinforcement learning policies.
We adapt the state-of-the-art program synthesis system DreamCoder for learning concepts in grid-based environments.
arXiv Detail & Related papers (2023-09-07T11:46:57Z) - Degraded Polygons Raise Fundamental Questions of Neural Network Perception [5.423100066629618]
We revisit the task of recovering images under degradation, first introduced over 30 years ago in the Recognition-by-Components theory of human vision.
We implement the Automated Shape Recoverability Test for rapidly generating large-scale datasets of perimeter-degraded regular polygons.
We find that neural networks' behavior on this simple task conflicts with human behavior.
arXiv Detail & Related papers (2023-06-08T06:02:39Z) - Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z) - A Survey on Reinforcement Learning Methods in Character Animation [22.3342752080749]
Reinforcement Learning is an area of Machine Learning focused on how agents can be trained to make sequential decisions.
This paper surveys the modern Deep Reinforcement Learning methods and discusses their possible applications in Character Animation.
arXiv Detail & Related papers (2022-03-07T23:39:00Z) - Causal Navigation by Continuous-time Neural Networks [108.84958284162857]
We propose a theoretical and experimental framework for learning causal representations using continuous-time neural networks.
We evaluate our method in the context of visual-control learning of drones over a series of complex tasks.
arXiv Detail & Related papers (2021-06-15T17:45:32Z) - Generative Adversarial Reward Learning for Generalized Behavior Tendency Inference [71.11416263370823]
We propose a generative inverse reinforcement learning approach for modelling user behavioral preferences.
Our model can automatically learn rewards from the user's actions based on a discriminative actor-critic network and a Wasserstein GAN.
arXiv Detail & Related papers (2021-05-03T13:14:25Z) - Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
arXiv Detail & Related papers (2020-06-18T17:34:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.