Online Refinement of a Scene Recognition Model for Mobile Robots by
Observing Human's Interaction with Environments
- URL: http://arxiv.org/abs/2208.06636v1
- Date: Sat, 13 Aug 2022 12:48:18 GMT
- Title: Online Refinement of a Scene Recognition Model for Mobile Robots by
Observing Human's Interaction with Environments
- Authors: Shigemichi Matsuzaki, Hiroaki Masuzawa, Jun Miura
- Abstract summary: In scene recognition systems, misclassification may cause the robot to get stuck when traversable plants are recognized as obstacles.
We propose a framework that allows for refining a semantic segmentation model on the fly during the robot's operation.
We introduce a few-shot segmentation method based on weight imprinting for online model refinement without fine-tuning.
- Score: 2.127049691404299
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes a method of online refinement of a scene recognition
model for robot navigation considering traversable plants, flexible plant parts
which a robot can push aside while moving. In scene recognition systems that
consider traversable plants growing over the paths, misclassification may
cause the robot to get stuck when traversable plants are recognized as
obstacles. Yet, misclassification is inevitable in any estimation method. In
this work, we propose a framework that allows for refining a semantic
segmentation model on the fly during the robot's operation. We introduce a
few-shot segmentation method based on weight imprinting for online model refinement
without fine-tuning. Training data are collected via observation of a human's
interaction with the plant parts. We propose novel robust weight imprinting to
mitigate the effect of noise included in the masks generated by the
interaction. The proposed method was evaluated through experiments using
real-world data and shown to outperform ordinary weight imprinting and to
provide results competitive with fine-tuning with model distillation while
requiring less computational cost.
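As a rough illustration of the ordinary weight imprinting baseline the abstract compares against (not the paper's robust variant), a new class weight can be imprinted as the normalized mean of normalized pixel embeddings under the interaction mask. The function name, array shapes, and cosine-classifier setup below are illustrative assumptions, not the authors' code:

```python
import numpy as np

def imprint_class_weight(features, mask, eps=1e-8):
    """Imprint a classifier weight for a new class from masked pixel embeddings.

    features: (H, W, D) per-pixel embeddings from the segmentation backbone
    mask:     (H, W) boolean mask of pixels assigned to the new class
    Returns a unit-norm vector usable as a row of a cosine-similarity classifier.
    """
    support = features[mask]  # (N, D) embeddings of the masked pixels
    support = support / (np.linalg.norm(support, axis=1, keepdims=True) + eps)
    proto = support.mean(axis=0)                  # mean of normalized embeddings
    return proto / (np.linalg.norm(proto) + eps)  # re-normalize the prototype

# Toy usage with random embeddings and a random interaction mask.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 8, 16))
mask = rng.random((8, 8)) > 0.5
w = imprint_class_weight(feats, mask)
```

The appeal for online use is that imprinting only writes one new weight vector; no gradient steps are needed, which is why it is cheaper than fine-tuning.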
Related papers
- RPMArt: Towards Robust Perception and Manipulation for Articulated Objects [56.73978941406907]
It is essential that robots can exhibit robust perception and manipulation for articulated objects in real-world robotic applications.
We propose a framework towards Robust Perception and Manipulation for Articulated Objects (RPMArt).
RPMArt learns to estimate the articulation parameters and manipulate the articulation part from the noisy point cloud.
arXiv Detail & Related papers (2024-03-24T05:55:39Z)
- Multimodal Anomaly Detection based on Deep Auto-Encoder for Object Slip Perception of Mobile Manipulation Robots [22.63980025871784]
The proposed framework integrates heterogeneous data streams collected from various robot sensors, including RGB and depth cameras, a microphone, and a force-torque sensor.
The integrated data is used to train a deep autoencoder to construct latent representations of the multisensory data that indicate the normal status.
Anomalies can then be identified by error scores measuring the difference between the encoder's latent values for the input data and the latent values of its reconstruction.
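The latent-space error score described above can be sketched as follows; the linear encoder/decoder are random stand-ins for the trained deep autoencoder, and the threshold value is a hypothetical placeholder (in practice it would be tuned on validation data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for a trained encoder/decoder (random linear maps here; in the
# summarized paper these are deep networks trained on normal multisensory data).
W_enc = rng.normal(size=(16, 4)) * 0.1
W_dec = rng.normal(size=(4, 16)) * 0.1

def encode(x):
    return x @ W_enc

def decode(z):
    return z @ W_dec

def anomaly_score(x):
    """Distance between the latent code of the input and the latent code of
    its reconstruction; large values suggest the input is off-distribution."""
    z = encode(x)
    z_rec = encode(decode(z))
    return float(np.linalg.norm(z - z_rec))

x_sample = rng.normal(size=16)       # one fused multisensory feature vector
score = anomaly_score(x_sample)
is_anomaly = score > 1.0             # hypothetical threshold
```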
arXiv Detail & Related papers (2024-03-06T09:15:53Z)
- Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the necessary high precision.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
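One plausible reading of the confidence-mask idea above is to sample several masks from the latent-variable model and keep only pixels on which most samples agree; the helper name and agreement threshold below are hypothetical, not the paper's actual formulation:

```python
import numpy as np

def confidence_mask(sampled_masks, min_agreement=0.9):
    """Combine binary masks sampled from a latent-variable segmentation model,
    keeping only pixels that at least `min_agreement` of the samples include."""
    agreement = np.mean(np.stack(sampled_masks).astype(float), axis=0)
    return agreement >= min_agreement

# Toy usage: three sampled 2x2 masks; only pixels present in all three survive.
masks = [
    np.array([[1, 1], [0, 1]]),
    np.array([[1, 0], [0, 1]]),
    np.array([[1, 1], [0, 1]]),
]
cm = confidence_mask(masks, min_agreement=0.9)  # keeps (0,0) and (1,1)
```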
arXiv Detail & Related papers (2023-05-03T05:57:29Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- Few-Shot Visual Grounding for Natural Human-Robot Interaction [0.0]
We propose a software architecture that segments a target object from a crowded scene, indicated verbally by a human user.
At the core of our system, we employ a multi-modal deep neural network for visual grounding.
We evaluate the performance of the proposed model on real RGB-D data collected from public scene datasets.
arXiv Detail & Related papers (2021-03-17T15:24:02Z)
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate it as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z)
- Online Body Schema Adaptation through Cost-Sensitive Active Learning [63.84207660737483]
The work was implemented in a simulation environment, using the 7DoF arm of the iCub robot simulator.
A cost-sensitive active learning approach is used to select optimal joint configurations.
The results show that cost-sensitive active learning has accuracy similar to the standard active learning approach while reducing the executed movement by about half.
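Cost-sensitive selection of this kind can be sketched as scoring each candidate joint configuration by expected information gain minus a movement-cost penalty; the scoring functions and trade-off weight below are hypothetical stand-ins, not the paper's actual criterion:

```python
def select_configuration(candidates, current, info_gain, move_cost, lam=0.5):
    """Pick the candidate configuration that best trades off expected
    information gain against the cost of moving to it from `current`."""
    return max(candidates, key=lambda q: info_gain(q) - lam * move_cost(current, q))

# Toy example with 1-D "joint angles": gain grows with distance, cost grows
# quadratically, so the trade-off picks a middle ground rather than the extreme.
candidates = [0.0, 0.5, 1.0, 2.0]
current = 0.0
info_gain = lambda q: q                    # assume farther configs are more informative
move_cost = lambda a, b: (a - b) ** 2      # quadratic movement penalty
best = select_configuration(candidates, current, info_gain, move_cost, lam=0.5)
# best is 1.0: score(q) = q - 0.5*q^2 peaks there among the candidates
```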
arXiv Detail & Related papers (2021-01-26T16:01:02Z)
- Counterfactual Explanation and Causal Inference in Service of Robustness in Robot Control [15.104159722499366]
We propose an architecture for training generative models of counterfactual conditionals of the form, 'can we modify event A to cause B instead of C?'
In contrast to conventional control design approaches, where robustness is quantified in terms of the ability to reject noise, we explore the space of counterfactuals that might cause a certain requirement to be violated.
arXiv Detail & Related papers (2020-09-18T14:22:47Z)
- Learning a generative model for robot control using visual feedback [7.171234436165255]
We introduce a novel formulation for incorporating visual feedback in controlling robots.
Inference in the model allows us to infer the robot state corresponding to target locations of the features.
We demonstrate the effectiveness of our method by executing grasping and tight-fit insertions on robots with inaccurate controllers.
arXiv Detail & Related papers (2020-03-10T00:34:01Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.