Maintaining a Reliable World Model using Action-aware Perceptual
Anchoring
- URL: http://arxiv.org/abs/2107.03038v1
- Date: Wed, 7 Jul 2021 06:35:14 GMT
- Title: Maintaining a Reliable World Model using Action-aware Perceptual
Anchoring
- Authors: Ying Siu Liang, Dongkyu Choi, Kenneth Kwok
- Abstract summary: Robots need to maintain a model of their surroundings even when objects go out of view.
This requires anchoring perceptual information onto symbols that represent the objects in the environment.
We present a model for action-aware perceptual anchoring that enables robots to track objects in a persistent manner.
- Score: 4.971403153199917
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reliable perception is essential for robots that interact with the world. But
sensors alone are often insufficient to provide this capability, and they are
prone to errors due to various conditions in the environment. Furthermore,
there is a need for robots to maintain a model of their surroundings even when
objects go out of view and are no longer visible. This requires anchoring
perceptual information onto symbols that represent the objects in the
environment. In this paper, we present a model for action-aware perceptual
anchoring that enables robots to track objects in a persistent manner. Our
rule-based approach considers inductive biases to perform high-level reasoning
over the results from low-level object detection, and it improves the robot's
perceptual capability for complex tasks. We evaluate our model against existing
baseline models for object permanence and show that it outperforms these on a
snitch localisation task using a dataset of 1,371 videos. We also integrate our
action-aware perceptual anchoring in the context of a cognitive architecture
and demonstrate its benefits in a realistic gearbox assembly task on a
Universal Robot.
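The idea of anchoring detections onto persistent symbols, and of using the robot's own actions to keep track of objects that are no longer visible, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Anchor` and `WorldModel` classes, the `pick`/`place` action names, and the single rule "a held object moves with its carrier" are all hypothetical and chosen only to show the general pattern.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Position = Tuple[float, float, float]


@dataclass
class Anchor:
    """Hypothetical anchor: a symbol plus the last believed state of one object."""
    symbol: str
    position: Position
    visible: bool = True
    attached_to: Optional[str] = None  # e.g. "gripper"; None when resting freely


@dataclass
class WorldModel:
    """Hypothetical world model keeping one anchor per object symbol."""
    anchors: Dict[str, Anchor] = field(default_factory=dict)

    def observe(self, detections: Dict[str, Position]) -> None:
        """Update anchors from detections; undetected anchors persist as not visible."""
        for symbol, pos in detections.items():
            anchor = self.anchors.get(symbol)
            if anchor is None:
                self.anchors[symbol] = Anchor(symbol, pos)
            else:
                anchor.position = pos
                anchor.visible = True
        for symbol, anchor in self.anchors.items():
            if symbol not in detections:
                anchor.visible = False  # keep the anchor, only mark it unseen

    def apply_action(self, action: str, obj: str, carrier: str = "gripper") -> None:
        """Assumed rule: 'pick' attaches the object to a carrier, 'place' releases it."""
        if action == "pick":
            self.anchors[obj].attached_to = carrier
        elif action == "place":
            self.anchors[obj].attached_to = None

    def move_carrier(self, carrier: str, pos: Position) -> None:
        """Objects attached to the carrier inherit its new position, seen or not."""
        for anchor in self.anchors.values():
            if anchor.attached_to == carrier:
                anchor.position = pos


wm = WorldModel()
wm.observe({"gear": (0.1, 0.2, 0.0)})        # gear detected on the table
wm.apply_action("pick", "gear")              # robot grasps the gear
wm.observe({})                               # gear now hidden by the gripper
wm.move_carrier("gripper", (0.5, 0.5, 0.3))  # belief follows the gripper
wm.apply_action("place", "gear")             # gear released at the new location
```

In this sketch the gear's anchor survives the frame in which it is not detected, and its believed position is updated from the executed action instead of from perception.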
Related papers
- Multimodal Anomaly Detection based on Deep Auto-Encoder for Object Slip
Perception of Mobile Manipulation Robots [22.63980025871784]
The proposed framework integrates heterogeneous data streams collected from various robot sensors, including RGB and depth cameras, a microphone, and a force-torque sensor.
The integrated data is used to train a deep autoencoder to construct latent representations of the multisensory data that indicate the normal status.
Anomalies can then be identified by error scores measured by the difference between the trained encoder's latent values and the latent values of reconstructed input data.
arXiv Detail & Related papers (2024-03-06T09:15:53Z)
- Teaching Unknown Objects by Leveraging Human Gaze and Augmented Reality
in Human-Robot Interaction [3.1473798197405953]
This dissertation aims to teach a robot unknown objects in the context of Human-Robot Interaction (HRI).
The combination of eye tracking and Augmented Reality created a powerful synergy that empowered the human teacher to communicate with the robot.
The robot's object detection capabilities exhibited comparable performance to state-of-the-art object detectors trained on extensive datasets.
arXiv Detail & Related papers (2023-12-12T11:34:43Z)
- FOCUS: Object-Centric World Models for Robotics Manipulation [4.6956495676681484]
FOCUS is a model-based agent that learns an object-centric world model.
We show that object-centric world models allow the agent to solve tasks more efficiently.
We also showcase how FOCUS could be adopted in real-world settings.
arXiv Detail & Related papers (2023-07-05T16:49:06Z)
- Distributional Instance Segmentation: Modeling Uncertainty and High
Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
arXiv Detail & Related papers (2023-05-03T05:57:29Z)
- RT-1: Robotics Transformer for Real-World Control at Scale [98.09428483862165]
We present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties.
We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks.
arXiv Detail & Related papers (2022-12-13T18:55:15Z)
- Object Manipulation via Visual Target Localization [64.05939029132394]
Training agents to manipulate objects poses many challenges.
We propose an approach that explores the environment in search of target objects, computes their 3D coordinates once they are located, and then continues to estimate their 3D locations even when the objects are not visible.
Our evaluations show a massive 3x improvement in success rate over a model that has access to the same sensory suite.
arXiv Detail & Related papers (2022-03-15T17:59:01Z)
- Improving Object Permanence using Agent Actions and Reasoning [8.847502932609737]
Existing approaches learn object permanence from low-level perception.
We argue that object permanence can be improved when the robot uses knowledge about executed actions.
arXiv Detail & Related papers (2021-10-01T07:09:49Z)
- Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human
Videos [59.58105314783289]
Domain-agnostic Video Discriminator (DVD) learns multitask reward functions by training a discriminator to classify whether two videos are performing the same task.
DVD can generalize by virtue of learning from a small amount of robot data with a broad dataset of human videos.
DVD can be combined with visual model predictive control to solve robotic manipulation tasks on a real WidowX200 robot in an unseen environment from a single human demo.
arXiv Detail & Related papers (2021-03-31T05:25:05Z)
- Object Detection and Pose Estimation from RGB and Depth Data for
Real-time, Adaptive Robotic Grasping [0.0]
We propose a system that performs real-time object detection and pose estimation, for the purpose of dynamic robot grasping.
The proposed approach allows the robot to detect the object identity and its actual pose, and then adapt a canonical grasp in order to be used with the new pose.
For training, the system defines a canonical grasp by capturing the relative pose of an object with respect to the gripper attached to the robot's wrist.
During testing, once a new pose is detected, a canonical grasp for the object is identified and then dynamically adapted by adjusting the robot arm's joint angles.
arXiv Detail & Related papers (2021-01-18T22:22:47Z)
- Model-Based Visual Planning with Self-Supervised Functional Distances [104.83979811803466]
We present a self-supervised method for model-based visual goal reaching.
Our approach learns entirely using offline, unlabeled data.
We find that this approach substantially outperforms both model-free and model-based prior methods.
arXiv Detail & Related papers (2020-12-30T23:59:09Z)
- Unadversarial Examples: Designing Objects for Robust Vision [100.4627585672469]
We develop a framework that exploits the sensitivity of modern machine learning algorithms to input perturbations in order to design "robust objects".
We demonstrate the efficacy of the framework on a wide variety of vision-based tasks ranging from standard benchmarks to (in-simulation) robotics.
arXiv Detail & Related papers (2020-12-22T18:26:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.