Interactive incremental learning of generalizable skills with local trajectory modulation
- URL: http://arxiv.org/abs/2409.05655v1
- Date: Mon, 9 Sep 2024 14:22:19 GMT
- Title: Interactive incremental learning of generalizable skills with local trajectory modulation
- Authors: Markus Knauer, Alin Albu-Schäffer, Freek Stulp, João Silvério
- Abstract summary: We propose an interactive imitation learning framework that simultaneously leverages local and global modulations of trajectory distributions.
Our approach exploits the concept of via-points to incrementally and interactively 1) improve the model accuracy locally, 2) add new objects to the task during execution and 3) extend the skill into regions where demonstrations were not provided.
- Score: 14.416251854298409
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The problem of generalization in learning from demonstration (LfD) has received considerable attention over the years, particularly within the context of movement primitives, where a number of approaches have emerged. Recently, two important approaches have gained recognition. While one leverages via-points to adapt skills locally by modulating demonstrated trajectories, another relies on so-called task-parameterized models that encode movements with respect to different coordinate systems, using a product of probabilities for generalization. While the former are well-suited to precise, local modulations, the latter aim at generalizing over large regions of the workspace and often involve multiple objects. Addressing the quality of generalization by leveraging both approaches simultaneously has received little attention. In this work, we propose an interactive imitation learning framework that simultaneously leverages local and global modulations of trajectory distributions. Building on the kernelized movement primitives (KMP) framework, we introduce novel mechanisms for skill modulation from direct human corrective feedback. Our approach particularly exploits the concept of via-points to incrementally and interactively 1) improve the model accuracy locally, 2) add new objects to the task during execution and 3) extend the skill into regions where demonstrations were not provided. We evaluate our method on a bearing ring-loading task using a torque-controlled, 7-DoF, DLR SARA robot.
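The abstract combines two well-known probabilistic ingredients: local via-point modulation of a trajectory distribution (as in KMP-style approaches) and task-parameterized generalization via a product of probabilities across object-attached frames. The sketch below is a minimal illustration of those two mechanics only, not the authors' implementation; the discretized Gaussian trajectory prior, the function names, and all numerical values are illustrative assumptions.

```python
import numpy as np

def condition_on_via_point(mu, sigma, t_idx, y_via, via_cov):
    """Condition a discretized trajectory distribution N(mu, sigma), stacked
    over T timesteps of dimension D, on a via-point y_via at timestep t_idx
    observed with uncertainty via_cov (standard Gaussian conditioning)."""
    D = y_via.shape[0]
    idx = slice(t_idx * D, (t_idx + 1) * D)
    gain = sigma[:, idx] @ np.linalg.inv(sigma[idx, idx] + via_cov)
    mu_new = mu + gain @ (y_via - mu[idx])
    sigma_new = sigma - gain @ sigma[idx, :]
    return mu_new, sigma_new

def product_of_gaussians(mus, sigmas):
    """Fuse Gaussian predictions expressed in several task frames
    (task-parameterized style) via a product of Gaussians."""
    precisions = [np.linalg.inv(s) for s in sigmas]
    sigma = np.linalg.inv(sum(precisions))
    mu = sigma @ sum(p @ m for p, m in zip(precisions, mus))
    return mu, sigma

# Toy 1-D trajectory prior over T timesteps (values are illustrative only).
T, D = 50, 1
t = np.linspace(0.0, 1.0, T)
mu0 = np.sin(np.pi * t)                                      # nominal motion
sigma0 = 0.05 * np.exp(-10.0 * (t[:, None] - t[None, :])**2) + 1e-6 * np.eye(T)

# Local modulation: force the motion through y = 0.2 at timestep 25,
# e.g. after a human corrective input.
mu_c, sigma_c = condition_on_via_point(
    mu0, sigma0, t_idx=25, y_via=np.array([0.2]), via_cov=1e-4 * np.eye(D))

# Global modulation: fuse two (toy) per-frame predictions of a single point.
mu_f, sigma_f = product_of_gaussians(
    mus=[np.array([0.1]), np.array([0.3])],
    sigmas=[1e-2 * np.eye(D), 4e-2 * np.eye(D)])
```

In the paper's framework these operations are embedded in KMP and driven by interactive human corrections; the sketch only shows the underlying probabilistic mechanics of local (via-point) and global (product-of-Gaussians) modulation.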
Related papers
- Local-Global Attention: An Adaptive Mechanism for Multi-Scale Feature Integration [0.9790236766474198]
Local-Global Attention is designed to better integrate both local and global contextual features.
We have thoroughly evaluated the Local-Global Attention mechanism on several widely used object detection and classification datasets.
arXiv Detail & Related papers (2024-11-14T17:22:16Z)
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z)
- RING#: PR-by-PE Global Localization with Roto-translation Equivariant Gram Learning [20.688641105430467]
Global localization is crucial in autonomous driving and robotics applications when GPS signals are unreliable.
Most approaches achieve global localization by sequential place recognition (PR) and pose estimation (PE).
We introduce a new paradigm, PR-by-PE localization, which bypasses the need for separate place recognition by directly deriving it from pose estimation.
We propose RING#, an end-to-end PR-by-PE localization network that operates in the bird's-eye-view (BEV) space, compatible with both vision and LiDAR sensors.
arXiv Detail & Related papers (2024-08-30T18:42:53Z)
- Learning Where to Look: Self-supervised Viewpoint Selection for Active Localization using Geometrical Information [68.10033984296247]
This paper explores the domain of active localization, emphasizing the importance of viewpoint selection to enhance localization accuracy.
Our contributions involve using a data-driven approach with a simple architecture designed for real-time operation, a self-supervised data training method, and the capability to consistently integrate our map into a planning framework tailored for real-world robotics applications.
arXiv Detail & Related papers (2024-07-22T12:32:09Z)
- Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation [52.87672306545577]
Existing motion generation methods primarily focus on the direct synthesis of global motions.
We propose the local action-guided motion diffusion model, which facilitates global motion generation by utilizing local actions as fine-grained control signals.
Our method provides flexibility in seamlessly combining various local actions and continuous guiding weight adjustment.
arXiv Detail & Related papers (2024-07-15T08:35:00Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z)
- Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning [23.051184131833292]
We propose a new transformer model for the task of unsupervised learning of skeleton motion sequences.
The proposed model successfully learns local dynamics of the joints and captures global context from the motion sequences.
arXiv Detail & Related papers (2022-07-13T10:18:07Z)
- Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization [74.34699679568818]
Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to localize action instances in the given video with video-level categorical supervision.
We propose a cross-modal consensus network (CO2-Net) to tackle this problem.
arXiv Detail & Related papers (2021-07-27T04:21:01Z)