ILeSiA: Interactive Learning of Robot Situational Awareness from Camera Input
- URL: http://arxiv.org/abs/2409.20173v3
- Date: Mon, 08 Sep 2025 12:56:14 GMT
- Title: ILeSiA: Interactive Learning of Robot Situational Awareness from Camera Input
- Authors: Petr Vanc, Giovanni Franzese, Jan Kristof Behrens, Cosimo Della Santina, Karla Stepanova, Jens Kober, Robert Babuska,
- Abstract summary: This paper focuses on teaching the robot situational awareness by using a camera input and labeling frames as safe or risky.<n>The proposed method can reliably detect both known and novel faults using only a single example for each new fault.<n>Our method enables the next generation of cobots to be rapidly deployed with easy-to-set-up, vision-based risk assessment.
- Score: 13.374743857520487
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning from demonstration is a promising approach for teaching robots new skills. However, a central challenge in the execution of acquired skills is the ability to recognize faults and prevent failures. This is essential because demonstrations typically cover only a limited set of scenarios and often only the successful ones. During task execution, unforeseen situations may arise, such as changes in the robot's environment or interaction with human operators. To recognize such situations, this paper focuses on teaching the robot situational awareness by using a camera input and labeling frames as safe or risky. We train a Gaussian Process (GP) regression model fed by a low-dimensional latent space representation of the input images. The model outputs a continuous risk score ranging from zero to one, quantifying the degree of risk at each timestep. This allows for pausing task execution in unsafe situations and directly adding new training data, labeled by the human user. Our experiments on a robotic manipulator show that the proposed method can reliably detect both known and novel faults using only a single example for each new fault. In contrast, a standard multi-layer perceptron (MLP) performs well only on faults it has encountered during training. Our method enables the next generation of cobots to be rapidly deployed with easy-to-set-up, vision-based risk assessment, proactively safeguarding humans and detecting misaligned parts or missing objects before failures occur. We provide all the code and data required to reproduce our experiments at imitrob.ciirc.cvut.cz/publications/ilesia.
Related papers
- Low Resource Reconstruction Attacks Through Benign Prompts [12.077836270816622]
We devise a new attack that requires low resources, assumes little to no access to the actual training set, and identifies, seemingly, benign prompts that lead to potentially-risky image reconstruction.<n>This highlights the risk that images might even be reconstructed by an uninformed user and unintentionally.
arXiv Detail & Related papers (2025-07-10T17:32:26Z) - RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors [57.81012948133832]
We present RAID (Robust evaluation of AI-generated image Detectors), a dataset of 72k diverse and highly transferable adversarial examples.<n>Our methodology generates adversarial images that transfer with a high success rate to unseen detectors.<n>Our findings indicate that current state-of-the-art AI-generated image detectors can be easily deceived by adversarial examples.
arXiv Detail & Related papers (2025-06-04T14:16:00Z) - Safe Uncertainty-Aware Learning of Robotic Suturing [0.7864304771129751]
Minimal-Assistedly Invasive Surgery is currently fully manually controlled by a trained surgeon.<n>Recent works have utilized Artificial Intelligence methods, which show promising adaptability.<n>This paper presents a framework for a safe, uncertainty-aware learning method.
arXiv Detail & Related papers (2025-05-22T12:31:18Z) - MLLM-as-a-Judge for Image Safety without Human Labeling [81.24707039432292]
In the age of AI-generated content (AIGC), many image generation models are capable of producing harmful content.
It is crucial to identify such unsafe images based on established safety rules.
Existing approaches typically fine-tune MLLMs with human-labeled datasets.
arXiv Detail & Related papers (2024-12-31T00:06:04Z) - Don't Let Your Robot be Harmful: Responsible Robotic Manipulation [57.70648477564976]
Unthinking execution of human instructions in robotic manipulation can lead to severe safety risks.
We present Safety-as-policy, which includes (i) a world model to automatically generate scenarios containing safety risks and conduct virtual interactions, and (ii) a mental model to infer consequences with reflections.
We show that Safety-as-policy can avoid risks and efficiently complete tasks in both synthetic dataset and real-world experiments.
arXiv Detail & Related papers (2024-11-27T12:27:50Z) - LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts [88.96201324719205]
Safety concerns in large language models (LLMs) have gained significant attention due to their exposure to potentially harmful data during pre-training.<n>We identify a new safety vulnerability in LLMs, where seemingly benign prompts, semantically related to harmful content, can bypass safety mechanisms.<n>We introduce a novel attack method, textitActorBreaker, which identifies actors related to toxic prompts within pre-training distribution.
arXiv Detail & Related papers (2024-10-14T16:41:49Z) - Model-Based Runtime Monitoring with Interactive Imitation Learning [30.70994322652745]
This work aims to endow a robot with the ability to monitor and detect errors during task execution.
We introduce a model-based runtime monitoring algorithm that learns from deployment data to detect system anomalies and anticipate failures.
Our method outperforms the baselines across system-level and unit-test metrics, with 23% and 40% higher success rates in simulation and on physical hardware.
arXiv Detail & Related papers (2023-10-26T16:45:44Z) - XSkill: Cross Embodiment Skill Discovery [41.624343257852146]
XSkill is an imitation learning framework that discovers a cross-embodiment representation called skill prototypes purely from unlabeled human and robot manipulation videos.
Our experiments in simulation and real-world environments show that the discovered skill prototypes facilitate skill transfer and composition for unseen tasks.
arXiv Detail & Related papers (2023-07-19T12:51:28Z) - SignBERT+: Hand-model-aware Self-supervised Pre-training for Sign
Language Understanding [132.78015553111234]
Hand gesture serves as a crucial role during the expression of sign language.
Current deep learning based methods for sign language understanding (SLU) are prone to over-fitting due to insufficient sign data resource.
We propose the first self-supervised pre-trainable SignBERT+ framework with model-aware hand prior incorporated.
arXiv Detail & Related papers (2023-05-08T17:16:38Z) - Distributional Instance Segmentation: Modeling Uncertainty and High
Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
arXiv Detail & Related papers (2023-05-03T05:57:29Z) - Learning Transferable Pedestrian Representation from Multimodal
Information Supervision [174.5150760804929]
VAL-PAT is a novel framework that learns transferable representations to enhance various pedestrian analysis tasks with multimodal information.
We first perform pre-training on LUPerson-TA dataset, where each image contains text and attribute annotations.
We then transfer the learned representations to various downstream tasks, including person reID, person attribute recognition and text-based person search.
arXiv Detail & Related papers (2023-04-12T01:20:58Z) - Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement
Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z) - Label Flipping Data Poisoning Attack Against Wearable Human Activity
Recognition System [0.5284812806199193]
This paper presents the design of a label flipping data poisoning attack for a Human Activity Recognition (HAR) system.
Due to high noise and uncertainty in the sensing environment, such an attack poses a severe threat to the recognition system.
This paper shades light on how to carry out the attack in practice through smartphone-based sensor data collection applications.
arXiv Detail & Related papers (2022-08-17T17:52:13Z) - Contrastive Learning for Cross-Domain Open World Recognition [17.660958043781154]
The ability to evolve is fundamental for any valuable autonomous agent whose knowledge cannot remain limited to that injected by the manufacturer.
We show how it learns a feature space perfectly suitable to incrementally include new classes and is able to capture knowledge which generalizes across a variety of visual domains.
Our method is endowed with a tailored effective stopping criterion for each learning episode and exploits a novel self-paced thresholding strategy.
arXiv Detail & Related papers (2022-03-17T11:23:53Z) - Continual Learning from Demonstration of Robotics Skills [5.573543601558405]
Methods for teaching motion skills to robots focus on training for a single skill at a time.
We propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers.
arXiv Detail & Related papers (2022-02-14T16:26:52Z) - What Can I Do Here? Learning New Skills by Imagining Visual Affordances [128.65223577406587]
We show how generative models of possible outcomes can allow a robot to learn visual representations of affordances.
In effect, prior data is used to learn what kinds of outcomes may be possible, such that when the robot encounters an unfamiliar setting, it can sample potential outcomes from its model.
We show that visuomotor affordance learning (VAL) can be used to train goal-conditioned policies that operate on raw image inputs.
arXiv Detail & Related papers (2021-06-01T17:58:02Z) - Backdoor Attacks on Self-Supervised Learning [22.24046752858929]
We show that self-supervised learning methods are vulnerable to backdoor attacks.
An attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images.
We propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack.
arXiv Detail & Related papers (2021-05-21T04:22:05Z) - Adversarial Training is Not Ready for Robot Learning [55.493354071227174]
Adversarial training is an effective method to train deep learning models that are resilient to norm-bounded perturbations.
We show theoretically and experimentally that neural controllers obtained via adversarial training are subjected to three types of defects.
Our results suggest that adversarial training is not yet ready for robot learning.
arXiv Detail & Related papers (2021-03-15T07:51:31Z) - Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
arXiv Detail & Related papers (2021-02-24T09:07:52Z) - Learning the Noise of Failure: Intelligent System Tests for Robots [1.713291434132985]
We propose a simulated noise estimate for the detection of failures in automated system tests of robots.
The technique may empower real-world automated system tests without human evaluation of success or failure.
arXiv Detail & Related papers (2021-02-16T11:06:45Z) - Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z) - Dos and Don'ts of Machine Learning in Computer Security [74.1816306998445]
Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance.
We identify common pitfalls in the design, implementation, and evaluation of learning-based security systems.
We propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible.
arXiv Detail & Related papers (2020-10-19T13:09:31Z) - Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z) - SQUIRL: Robust and Efficient Learning from Video Demonstration of
Long-Horizon Robotic Manipulation Tasks [8.756012472587601]
Deep reinforcement learning (RL) can be used to learn complex manipulation tasks.
RL requires the robot to collect a large amount of real-world experience.
S SQUIRL performs a new but related long-horizon task robustly given only a single video demonstration.
arXiv Detail & Related papers (2020-03-10T20:26:26Z) - Exploiting Ergonomic Priors in Human-to-Robot Task Transfer [3.60953887026184]
A method based on programming by demonstration is proposed to learn null space policies from constrained motion data.
The effectiveness of the method has been demonstrated in a 3-link simulation and a real world experiment using a human subject as the demonstrator.
The approach is shown to outperform the current state-of-the-art approach in a simulated 3DoF robot manipulator control problem.
arXiv Detail & Related papers (2020-03-01T18:30:57Z) - Exploratory Machine Learning with Unknown Unknowns [60.78953456742171]
We study a new problem setting in which there are unknown classes in the training data misperceived as other labels.
We propose the exploratory machine learning, which examines and investigates training data by actively augmenting the feature space to discover potentially hidden classes.
arXiv Detail & Related papers (2020-02-05T02:06:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.