One-Shot Informed Robotic Visual Search in the Wild
- URL: http://arxiv.org/abs/2003.10010v2
- Date: Thu, 3 Sep 2020 07:14:23 GMT
- Title: One-Shot Informed Robotic Visual Search in the Wild
- Authors: Karim Koreitem, Florian Shkurti, Travis Manderson, Wei-Di Chang, Juan
Camilo Gamboa Higuera, Gregory Dudek
- Abstract summary: We consider the task of underwater robot navigation for the purpose of collecting scientifically relevant video data for environmental monitoring.
The majority of field robots that currently perform monitoring tasks in unstructured natural environments navigate via path-tracking a pre-specified sequence of waypoints.
We propose a method that enables informed visual navigation via a learned visual similarity operator that guides the robot's visual search towards parts of the scene that look like exemplar images.
- Score: 29.604267552742026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the task of underwater robot navigation for the purpose of
collecting scientifically relevant video data for environmental monitoring. The
majority of field robots that currently perform monitoring tasks in
unstructured natural environments navigate via path-tracking a pre-specified
sequence of waypoints. Although this navigation method is often necessary, it
is limiting because the robot does not have a model of what the scientist deems
to be relevant visual observations. Thus, the robot can neither visually search
for particular types of objects, nor focus its attention on parts of the scene
that might be more relevant than the pre-specified waypoints and viewpoints. In
this paper we propose a method that enables informed visual navigation via a
learned visual similarity operator that guides the robot's visual search
towards parts of the scene that look like an exemplar image, which is given by
the user as a high-level specification for data collection. We propose and
evaluate a weakly supervised video representation learning method that
outperforms ImageNet embeddings for similarity tasks in the underwater domain.
We also demonstrate the deployment of this similarity operator during informed
visual navigation in collaborative environmental monitoring scenarios, in
large-scale field trials, where the robot and a human scientist collaboratively
search for relevant visual content.
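As a rough sketch of how an exemplar-guided similarity operator could drive visual search (this is not the authors' implementation; the `embed` callable, the 3x3 patch grid, and plain cosine similarity are illustrative assumptions), one could score patches of the current camera frame against the exemplar embedding and steer toward the best-scoring region:

```python
# Hypothetical sketch of exemplar-guided patch scoring, not the paper's code.
# `embed` stands in for a learned image encoder (in the paper, a weakly
# supervised video representation; here it is just an assumed callable).
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))


def score_patches(frame: np.ndarray, exemplar_emb: np.ndarray, embed, grid=(3, 3)) -> np.ndarray:
    """Split the frame into a grid of patches and score each patch by its
    embedding similarity to the user-provided exemplar image embedding."""
    h, w = frame.shape[:2]
    rows, cols = grid
    scores = np.zeros(grid)
    for i in range(rows):
        for j in range(cols):
            patch = frame[i * h // rows:(i + 1) * h // rows,
                          j * w // cols:(j + 1) * w // cols]
            scores[i, j] = cosine_similarity(embed(patch), exemplar_emb)
    return scores


def steering_target(scores: np.ndarray):
    """Return the (row, col) of the patch most similar to the exemplar,
    i.e. the region the robot should orient its search toward."""
    return np.unravel_index(np.argmax(scores), scores.shape)
```

In the paper the similarity operator itself is learned with weak supervision from underwater video, so `embed` would correspond to that trained encoder rather than a generic off-the-shelf feature extractor.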
Related papers
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- CoNav: A Benchmark for Human-Centered Collaborative Navigation [66.6268966718022]
We propose a collaborative navigation (CoNav) benchmark.
Our CoNav tackles the critical challenge of constructing a 3D navigation environment with realistic and diverse human activities.
We propose an intention-aware agent for reasoning both long-term and short-term human intention.
arXiv Detail & Related papers (2024-06-04T15:44:25Z)
- CARPE-ID: Continuously Adaptable Re-identification for Personalized Robot Assistance [16.948256303861022]
In today's Human-Robot Interaction (HRI) scenarios, it is commonly assumed that the robot should cooperate with the closest individual.
We propose a person re-identification module based on continual visual adaptation techniques.
We test the framework in isolation on recorded videos in a laboratory environment and in an HRI scenario with a mobile robot.
arXiv Detail & Related papers (2023-10-30T10:24:21Z)
- Learning Hierarchical Interactive Multi-Object Search for Mobile Manipulation [10.21450780640562]
We introduce a novel interactive multi-object search task in which a robot has to open doors to navigate rooms and search inside cabinets and drawers to find target objects.
These new challenges require combining manipulation and navigation skills in unexplored environments.
We present HIMOS, a hierarchical reinforcement learning approach that learns to compose exploration, navigation, and manipulation skills.
arXiv Detail & Related papers (2023-07-12T12:25:33Z)
- Learning Video-Conditioned Policies for Unseen Manipulation Tasks [83.2240629060453]
Video-conditioned policy learning maps human demonstrations of previously unseen tasks to robot manipulation skills.
We train our policy to generate appropriate actions given current scene observations and a video of the target task.
We validate our approach on a set of challenging multi-task robot manipulation environments and outperform the state of the art.
arXiv Detail & Related papers (2023-05-10T16:25:42Z)
- Embodied Agents for Efficient Exploration and Smart Scene Description [47.82947878753809]
We tackle a setting for visual navigation in which an autonomous agent needs to explore and map an unseen indoor environment.
We propose and evaluate an approach that combines recent advances in visual robotic exploration and image captioning.
Our approach can generate smart scene descriptions that maximize semantic knowledge of the environment and avoid repetitions.
arXiv Detail & Related papers (2023-01-17T19:28:01Z)
- Challenges in Visual Anomaly Detection for Mobile Robots [65.53820325712455]
We consider the task of detecting anomalies for autonomous mobile robots based on vision.
We categorize relevant types of visual anomalies and discuss how they can be detected by unsupervised deep learning methods.
arXiv Detail & Related papers (2022-09-22T13:26:46Z)
- Rapid Exploration for Open-World Navigation with Latent Goal Models [78.45339342966196]
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments.
At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images.
We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration.
arXiv Detail & Related papers (2021-04-12T23:14:41Z)
- ViNG: Learning Open-World Navigation with Visual Goals [82.84193221280216]
We propose a learning-based navigation system for reaching visually indicated goals.
We show that our system, which we call ViNG, outperforms previously-proposed methods for goal-conditioned reinforcement learning.
We demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection.
arXiv Detail & Related papers (2020-12-17T18:22:32Z)
- Co-training for On-board Deep Object Detection [0.0]
The best-performing deep vision-based object detectors are trained in a supervised manner, relying on human-labeled bounding boxes.
Co-training is a semi-supervised learning method for self-labeling objects in unlabeled images.
We show that co-training is a paradigm worth pursuing to alleviate object labeling, working both alone and together with task-agnostic domain adaptation.
arXiv Detail & Related papers (2020-08-12T19:08:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.