Active Vision Reinforcement Learning under Limited Visual Observability
- URL: http://arxiv.org/abs/2306.00975v2
- Date: Mon, 6 Nov 2023 00:12:04 GMT
- Title: Active Vision Reinforcement Learning under Limited Visual Observability
- Authors: Jinghuan Shang and Michael S. Ryoo
- Abstract summary: We investigate Active Vision Reinforcement Learning (ActiveVision-RL), where an embodied agent simultaneously learns an action policy for the task while also controlling its visual observations in partially observable environments.
We propose SUGARL, Sensorimotor Understanding Guided Active Reinforcement Learning, a framework that models motor and sensory policies separately but jointly learns them using an intrinsic sensorimotor reward.
- Score: 46.99501921691587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we investigate Active Vision Reinforcement Learning
(ActiveVision-RL), where an embodied agent simultaneously learns an action policy
for the task while also controlling its visual observations in partially
observable environments. We denote the former as motor policy and the latter as
sensory policy. For example, humans solve real world tasks by hand manipulation
(motor policy) together with eye movements (sensory policy). ActiveVision-RL
poses challenges in coordinating the two policies given their mutual influence. We
propose SUGARL, Sensorimotor Understanding Guided Active Reinforcement
Learning, a framework that models motor and sensory policies separately, but
jointly learns them using an intrinsic sensorimotor reward. This learnable
reward, assigned by a sensorimotor reward module, incentivizes the sensory
policy to select observations that are optimal for inferring its own motor
action, inspired by the sensorimotor stage of human development. Through a series of experiments,
we show the effectiveness of our method across a range of observability
conditions and its adaptability to existing RL algorithms. The sensory policies
learned through our method are observed to exhibit effective active vision
strategies.
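As a rough illustration of the idea described in the abstract, the sketch below shows one way an intrinsic sensorimotor reward could be computed: a small module tries to infer the motor action from the observation the sensory policy selected, and the (negative) prediction loss serves as the reward for the sensory policy. This is a minimal PyTorch sketch assuming discrete motor actions and flat observation vectors; the class and parameter names are hypothetical, and the paper's actual architecture and training details are not reproduced here.

```python
# Minimal sketch of an intrinsic sensorimotor reward, assuming discrete motor
# actions and flat observation vectors. Names (SensorimotorRewardModule,
# intrinsic_reward, obs_dim, n_motor_actions) are illustrative assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SensorimotorRewardModule(nn.Module):
    """Predicts the motor action taken from the (partial) visual observation
    chosen by the sensory policy."""

    def __init__(self, obs_dim: int, n_motor_actions: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_motor_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Logits over motor actions, given the selected observation.
        return self.net(obs)

    def intrinsic_reward(self, obs: torch.Tensor, motor_action: torch.Tensor) -> torch.Tensor:
        # Higher reward when the motor action is easy to infer from the
        # observation the sensory policy selected (per-sample log-likelihood).
        with torch.no_grad():
            logits = self.forward(obs)
            return -F.cross_entropy(logits, motor_action, reduction="none")

    def update(self, obs: torch.Tensor, motor_action: torch.Tensor,
               optimizer: torch.optim.Optimizer) -> float:
        # Supervised update on (observation, motor action) pairs, e.g. sampled
        # from a replay buffer shared with the RL learners.
        loss = F.cross_entropy(self.forward(obs), motor_action)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()
```

In such a sketch, the sensory policy would be trained with an off-the-shelf RL algorithm on a combination of the environment reward and this intrinsic term, while the motor policy keeps the task reward; how the rewards are combined and which base learners are used are details specified in the paper, not reproduced here.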
Related papers
- Emergent Active Perception and Dexterity of Simulated Humanoids from Visual Reinforcement Learning [69.71072181304066]
We introduce Perceptive Dexterous Control (PDC), a framework for vision-driven whole-body control with simulated humanoids.
PDC operates solely on egocentric vision for task specification, enabling object search, target placement, and skill selection through visual cues.
We show that training from scratch with reinforcement learning can produce emergent behaviors such as active search.
arXiv Detail & Related papers (2025-05-18T07:33:31Z)
- The Power of the Senses: Generalizable Manipulation from Vision and Touch through Masked Multimodal Learning [60.91637862768949]
We propose Masked Multimodal Learning (M3L) to fuse visual and tactile information in a reinforcement learning setting.
M3L learns a policy and visual-tactile representations based on masked autoencoding.
We evaluate M3L on three simulated environments with both visual and tactile observations.
arXiv Detail & Related papers (2023-11-02T01:33:00Z)
- Learning Deep Sensorimotor Policies for Vision-based Autonomous Drone Racing [52.50284630866713]
Existing systems often require hand-engineered components for state estimation, planning, and control.
This paper tackles the vision-based autonomous-drone-racing problem by learning deep sensorimotor policies.
arXiv Detail & Related papers (2022-10-26T19:03:17Z)
- CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-based Autonomous Urban Driving [43.269130988225605]
Vision-based autonomous urban driving in dense traffic is quite challenging due to the complicated urban environment and the dynamics of the driving behaviors.
We present a novel CAscade Deep REinforcement learning framework, CADRE, to achieve model-free vision-based autonomous urban driving.
arXiv Detail & Related papers (2022-02-17T10:07:16Z)
- Learning Perceptual Locomotion on Uneven Terrains using Sparse Visual Observations [75.60524561611008]
This work aims to exploit sparse visual observations to achieve perceptual locomotion over a range of commonly seen bumps, ramps, and stairs in human-centred environments.
We first formulate the selection of minimal visual input that can represent the uneven surfaces of interest, and propose a learning framework that integrates such exteroceptive and proprioceptive data.
We validate the learned policy in tasks that require omnidirectional walking over flat ground and forward locomotion over terrains with obstacles, showing a high success rate.
arXiv Detail & Related papers (2021-09-28T20:25:10Z)
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z)
- VATLD: A Visual Analytics System to Assess, Understand and Improve Traffic Light Detection [15.36267013724161]
We propose a visual analytics system, VATLD, to assess, understand, and improve the accuracy and robustness of traffic light detectors in autonomous driving applications.
The disentangled representation learning extracts data semantics to augment human cognition with human-friendly visual summarization.
We also demonstrate the effectiveness of various performance improvement strategies with our visual analytics system, VATLD, and illustrate some practical implications for safety-critical applications in autonomous driving.
arXiv Detail & Related papers (2020-09-27T22:39:00Z)
- Active Perception and Representation for Robotic Manipulation [0.8315801422499861]
We present a framework that leverages the benefits of active perception to accomplish manipulation tasks.
Our agent uses viewpoint changes to localize objects, to learn state representations in a self-supervised manner, and to perform goal-directed actions.
Compared to vanilla deep Q-learning algorithms, our model is at least four times more sample-efficient.
arXiv Detail & Related papers (2020-03-15T01:43:51Z)
- Mutual Information-based State-Control for Intrinsically Motivated Reinforcement Learning [102.05692309417047]
In reinforcement learning, an agent learns to reach a set of goals by means of an external reward signal.
In the natural world, intelligent organisms learn from internal drives, bypassing the need for external signals.
We propose to formulate an intrinsic objective as the mutual information between the goal states and the controllable states.
arXiv Detail & Related papers (2020-02-05T19:21:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.