Active Sensing with Predictive Coding and Uncertainty Minimization
- URL: http://arxiv.org/abs/2307.00668v3
- Date: Tue, 13 Feb 2024 05:13:26 GMT
- Title: Active Sensing with Predictive Coding and Uncertainty Minimization
- Authors: Abdelrahman Sharafeldin, Nabil Imam, Hannah Choi
- Abstract summary: We present an end-to-end procedure for embodied exploration inspired by two biological computations.
We first demonstrate our approach in a maze navigation task and show that it can discover the underlying transition distributions and spatial features of the environment.
We show that our model builds unsupervised representations through exploration that allow it to efficiently categorize visual scenes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an end-to-end procedure for embodied exploration inspired by two
biological computations: predictive coding and uncertainty minimization. The
procedure can be applied to exploration settings in a task-independent and
intrinsically driven manner. We first demonstrate our approach in a maze
navigation task and show that it can discover the underlying transition
distributions and spatial features of the environment. Second, we apply our
model to a more complex active vision task, where an agent actively samples its
visual environment to gather information. We show that our model builds
unsupervised representations through exploration that allow it to efficiently
categorize visual scenes. We further show that using these representations for
downstream classification leads to superior data efficiency and learning speed
compared to other baselines while maintaining lower parameter complexity.
Finally, the modularity of our model allows us to probe its internal mechanisms
and analyze the interaction between perception and action during exploration.
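The exploration procedure described in the abstract can be illustrated with a minimal sketch: an agent maintains Dirichlet counts over the transition distribution of a discrete maze and greedily takes the action whose outcome it is currently most uncertain about, so that acting reduces model uncertainty. The tabular setting and all function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dirichlet_entropy_of_mean(counts):
    """Entropy of the posterior-mean categorical distribution."""
    p = counts / counts.sum()
    return -(p * np.log(p + 1e-12)).sum()

def explore(transitions, n_states, n_actions, n_steps, seed=0):
    """Uncertainty-seeking exploration of a tabular environment.

    transitions: true dynamics, shape (n_states, n_actions, n_states).
    Returns the learned transition model (posterior-mean estimates).
    """
    rng = np.random.default_rng(seed)
    counts = np.ones((n_states, n_actions, n_states))  # Dirichlet(1) prior
    s = 0
    for _ in range(n_steps):
        # Greedy uncertainty minimization: act where the model is least certain.
        a = int(np.argmax([dirichlet_entropy_of_mean(counts[s, a])
                           for a in range(n_actions)]))
        s_next = rng.choice(n_states, p=transitions[s, a])
        counts[s, a, s_next] += 1  # Bayesian update of the world model
        s = s_next
    return counts / counts.sum(axis=-1, keepdims=True)
```

With enough steps, the agent's estimated model converges to the true transition distributions, mirroring the maze result described above; no task reward is ever used, so the behavior is intrinsically driven.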
Related papers
- A Robotics-Inspired Scanpath Model Reveals the Importance of Uncertainty and Semantic Object Cues for Gaze Guidance in Dynamic Scenes [8.64158103104882]
We present a mechanistic model that simulates object segmentation and gaze behavior for dynamic real-world scenes.
Our model uses the current scene segmentation for object-based saccadic decision-making while using the foveated object to refine its scene segmentation.
We show that our model's modular design allows for extensions, such as incorporating saccadic momentum or pre-saccadic attention.
arXiv Detail & Related papers (2024-08-02T15:20:34Z) - DeTra: A Unified Model for Object Detection and Trajectory Forecasting [68.85128937305697]
Our approach formulates the union of the two tasks as a trajectory refinement problem.
To tackle this unified task, we design a refinement transformer that infers the presence, pose, and multi-modal future behaviors of objects.
In our experiments, we observe that our model outperforms the state of the art on the Argoverse 2 Sensor and Open datasets.
arXiv Detail & Related papers (2024-06-06T18:12:04Z) - Self-supervised Sequential Information Bottleneck for Robust Exploration
in Deep Reinforcement Learning [28.75574762244266]
In this work, we introduce the sequential information bottleneck objective for learning compressed and temporally coherent representations.
For efficient exploration in noisy environments, we further construct intrinsic rewards that capture task-relevant state novelty.
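As an illustration of an intrinsic reward built on compressed representations, the sketch below scores state novelty by the prediction error of an online forward model in a low-dimensional latent space. This is a simplified stand-in, not the paper's sequential information bottleneck objective; the fixed random encoder and the linear update rule are assumptions.

```python
import numpy as np

class LatentNoveltyReward:
    """Intrinsic reward = forward-model prediction error in latent space."""

    def __init__(self, obs_dim, latent_dim, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed random compression (a stand-in for a learned encoder).
        self.enc = rng.normal(size=(latent_dim, obs_dim)) / np.sqrt(obs_dim)
        self.W = np.zeros((latent_dim, latent_dim))  # linear forward model
        self.lr = lr

    def __call__(self, obs, next_obs):
        z, z_next = self.enc @ obs, self.enc @ next_obs
        err = z_next - self.W @ z
        reward = float(err @ err)             # novelty = prediction error
        self.W += self.lr * np.outer(err, z)  # online gradient step
        return reward
```

Repeatedly observing the same transition drives the reward toward zero, so familiar states stop attracting the agent while genuinely novel ones still do.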
arXiv Detail & Related papers (2022-09-12T15:41:10Z) - Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z) - Multitask Adaptation by Retrospective Exploration with Learned World
Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides training samples for the MBRL agent taken from task-agnostic storage.
The model is trained to maximize the agent's expected performance by selecting from storage promising trajectories that solved prior tasks.
arXiv Detail & Related papers (2021-10-25T20:02:57Z) - Self-Supervised Domain Adaptation for Visual Navigation with Global Map
Consistency [6.385006149689549]
We propose a self-supervised adaptation task that allows a visual navigation agent to generalize to unseen environments.
The proposed task is completely self-supervised, not requiring any supervision from ground-truth pose data or explicit noise model.
Our experiments show that the proposed task helps the agent to successfully transfer to new, noisy environments.
arXiv Detail & Related papers (2021-10-14T07:14:36Z) - Glimpse-Attend-and-Explore: Self-Attention for Active Visual Exploration [47.01485765231528]
Active visual exploration aims to assist an agent with a limited field of view to understand its environment based on partial observations.
We propose the Glimpse-Attend-and-Explore model which employs self-attention to guide the visual exploration instead of task-specific uncertainty maps.
Our model provides encouraging results while being less dependent on dataset bias in driving the exploration.
arXiv Detail & Related papers (2021-08-26T11:41:03Z) - Self-supervised Video Object Segmentation by Motion Grouping [79.13206959575228]
We develop a computer vision system able to segment objects by exploiting motion cues.
We introduce a simple variant of the Transformer to segment optical flow frames into primary objects and the background.
We evaluate the proposed architecture on public benchmarks (DAVIS2016, SegTrackv2, and FBMS59).
arXiv Detail & Related papers (2021-04-15T17:59:32Z) - Embodied Visual Active Learning for Semantic Segmentation [33.02424587900808]
We study the task of embodied visual active learning, where an agent is set to explore a 3D environment with the goal of acquiring visual scene understanding.
We develop a battery of agents, both learnt and pre-specified, with different levels of knowledge of the environment.
We extensively evaluate the proposed models using the Matterport3D simulator and show that a fully learnt method outperforms comparable pre-specified counterparts.
arXiv Detail & Related papers (2020-12-17T11:02:34Z) - A Trainable Optimal Transport Embedding for Feature Aggregation and its
Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
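The aggregation mechanism described above can be sketched as follows: compute an entropic optimal transport plan between the input set and a small reference via Sinkhorn iterations, then pool the set through that plan into a fixed-size embedding. The uniform weights, regularization value, and function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sinkhorn_plan(X, R, eps=0.5, n_iter=100):
    """Entropic OT plan between set X (n, d) and reference R (k, d),
    both carrying uniform mass."""
    n, k = X.shape[0], R.shape[0]
    C = ((X[:, None, :] - R[None, :, :]) ** 2).sum(-1)  # squared distances
    K = np.exp(-C / eps)
    u, v = np.ones(n) / n, np.ones(k) / k
    for _ in range(n_iter):  # alternating marginal scaling
        u = (1.0 / n) / (K @ v)
        v = (1.0 / k) / (K.T @ u)
    return u[:, None] * K * v[None, :]  # transport plan, shape (n, k)

def ot_embed(X, R):
    """Fixed-size embedding: one pooled slot per reference row."""
    P = sinkhorn_plan(X, R)
    # Each reference slot receives a weighted mean of the input elements.
    return (P / P.sum(axis=0, keepdims=True)).T @ X  # shape (k, d)
```

Because the plan depends only on pairwise costs, the embedding is invariant to permutations of the input set, which is what makes it a valid pooling for sets and connects it to attention-style weighted aggregation.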
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.