Multimodal and Force-Matched Imitation Learning with a See-Through Visuotactile Sensor
- URL: http://arxiv.org/abs/2311.01248v3
- Date: Wed, 26 Jun 2024 17:40:14 GMT
- Title: Multimodal and Force-Matched Imitation Learning with a See-Through Visuotactile Sensor
- Authors: Trevor Ablett, Oliver Limoyo, Adam Sigal, Affan Jilani, Jonathan Kelly, Kaleem Siddiqi, Francois Hogan, Gregory Dudek
- Abstract summary: We leverage a multimodal visuotactile sensor within the framework of imitation learning (IL) to perform contact-rich tasks.
We introduce two algorithmic contributions, tactile force matching and learned mode switching, as complementary methods for improving IL.
Our results show that the inclusion of force matching raises average policy success rates by 62.5%, visuotactile mode switching by 30.3%, and visuotactile data as a policy input by 42.5%.
- Score: 14.492202828369127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contact-rich tasks continue to present a variety of challenges for robotic manipulation. In this work, we leverage a multimodal visuotactile sensor within the framework of imitation learning (IL) to perform contact-rich tasks that involve relative motion (slipping/sliding) between the end-effector and object. We introduce two algorithmic contributions, tactile force matching and learned mode switching, as complementary methods for improving IL. Tactile force matching enhances kinesthetic teaching by reading approximate forces during the demonstration and generating an adapted robot trajectory that recreates the recorded forces. Learned mode switching uses IL to couple visual and tactile sensor modes with the learned motion policy, simplifying the transition from reaching to contacting. We perform robotic manipulation experiments on four door opening tasks with a variety of observation and method configurations to study the utility of our proposed improvements and multimodal visuotactile sensing. Our results show that the inclusion of force matching raises average policy success rates by 62.5%, visuotactile mode switching by 30.3%, and visuotactile data as a policy input by 42.5%, emphasizing the value of see-through tactile sensing for IL, both for data collection to allow force matching, and for policy execution to allow accurate task feedback.
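The abstract describes the two contributions only at a high level. As a rough, non-authoritative illustration, the sketch below shows one plausible shape such components could take: a force-matching step that offsets a kinesthetic demonstration so that replay under an assumed Cartesian stiffness controller approximately reproduces the recorded contact forces, and a behavioral-cloning policy with an auxiliary discrete head for the sensor mode. The stiffness value, network sizes, and all names here are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch only: gains, shapes, and names are assumed,
# not taken from the paper's implementation.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

def force_matched_trajectory(demo_poses, demo_forces, stiffness=500.0):
    """Offset a kinesthetically taught trajectory so that replaying it under
    an assumed Cartesian stiffness controller roughly recreates the contact
    forces read by the tactile sensor during the demonstration.

    demo_poses:  (T, 3) recorded end-effector positions
    demo_forces: (T, 3) approximate contact forces from the sensor [N]
    stiffness:   assumed controller stiffness [N/m]
    """
    # Under a proportional controller, commanding a set-point displaced by
    # roughly f / k into the contact produces a contact force of about f.
    return np.asarray(demo_poses) + np.asarray(demo_forces) / stiffness

class VisuotactilePolicy(nn.Module):
    """Behavioral-cloning policy with an auxiliary discrete head that picks
    the sensor mode (e.g. visual vs. tactile) at every step."""
    def __init__(self, obs_dim, act_dim, n_modes=2, hidden=256):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.action_head = nn.Linear(hidden, act_dim)  # continuous motion
        self.mode_head = nn.Linear(hidden, n_modes)    # sensor-mode logits

    def forward(self, obs):
        h = self.trunk(obs)
        return self.action_head(h), self.mode_head(h)

def bc_loss(policy, obs, expert_action, expert_mode):
    """Joint imitation loss: regress the demonstrated motion and classify
    the demonstrated sensor mode (expert_mode: integer class labels)."""
    pred_action, mode_logits = policy(obs)
    return (F.mse_loss(pred_action, expert_action)
            + F.cross_entropy(mode_logits, expert_mode))
```

Predicting the mode from the same trunk as the motion mirrors the abstract's idea of letting the learned policy itself decide when to switch from reaching (visual mode) to contact (tactile mode).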
Related papers
- Learning Precise, Contact-Rich Manipulation through Uncalibrated Tactile Skins [17.412763585521688]
We present the Visuo-Skin (ViSk) framework, a simple approach that uses a transformer-based policy and treats skin sensor data as additional tokens alongside visual information.
ViSk significantly outperforms both vision-only policies and policies based on optical tactile sensing.
Further analysis reveals that combining tactile and visual modalities enhances policy performance and spatial generalization, achieving an average improvement of 27.5% across tasks. A rough sketch of the token treatment follows this entry.
arXiv Detail & Related papers (2024-10-22T17:59:49Z)
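The ViSk entry above only states that skin readings are treated as additional tokens next to visual tokens in a transformer policy. The snippet below is a minimal, assumed illustration of that idea; the dimensions, projections, and pooling are guesses, not the ViSk code.

```python
# Minimal sketch of skin readings as extra transformer tokens alongside
# visual tokens; all dimensions and names are assumptions.
import torch
import torch.nn as nn

class TokenizedVisuoSkinPolicy(nn.Module):
    def __init__(self, img_feat_dim=512, skin_dim=30, d_model=256, act_dim=7):
        super().__init__()
        self.img_proj = nn.Linear(img_feat_dim, d_model)  # visual tokens
        self.skin_proj = nn.Linear(skin_dim, d_model)     # skin reading -> one token
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.action_head = nn.Linear(d_model, act_dim)

    def forward(self, img_feats, skin_reading):
        # img_feats: (B, N, img_feat_dim); skin_reading: (B, skin_dim)
        tokens = torch.cat(
            [self.img_proj(img_feats),
             self.skin_proj(skin_reading).unsqueeze(1)], dim=1)
        return self.action_head(self.encoder(tokens).mean(dim=1))
```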
- Learning Manipulation by Predicting Interaction [85.57297574510507]
We propose a general pre-training pipeline that learns Manipulation by Predicting the Interaction (MPI).
The experimental results demonstrate that MPI improves on the previous state of the art by 10% to 64% on real-world robot platforms.
arXiv Detail & Related papers (2024-06-01T13:28:31Z)
- Learning Visuotactile Skills with Two Multifingered Hands [80.99370364907278]
We explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data.
Our results mark a promising step forward in bimanual multifingered manipulation from visuotactile data.
arXiv Detail & Related papers (2024-04-25T17:59:41Z)
- Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training [0.850206009406913]
MViTac is a novel methodology that leverages contrastive learning to integrate vision and touch sensations in a self-supervised fashion.
By drawing on both sensory inputs, MViTac uses intra- and inter-modality losses to learn representations, resulting in enhanced material property classification and more adept grasping prediction. A generic sketch of these two loss terms follows this entry.
arXiv Detail & Related papers (2024-01-22T15:11:57Z)
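The MViTac entry above names intra- and inter-modality contrastive losses without defining them. The sketch below shows a generic InfoNCE formulation of those two terms under assumed encoders, augmentations, and batch pairing; it is not the MViTac code.

```python
# Generic InfoNCE sketch of intra- and inter-modality contrastive terms;
# encoders, temperature, and pairing are assumptions, not MViTac's code.
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    """Paired rows of z_a and z_b are positives; all other rows are negatives."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature            # (B, B) similarity matrix
    labels = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, labels)

def visuotactile_contrastive_loss(vision_enc, touch_enc, img, img_aug, tac, tac_aug):
    """Intra-modality: two augmentations of the same sample agree.
    Inter-modality: the visual and tactile views of the same sample agree."""
    zv, zv2 = vision_enc(img), vision_enc(img_aug)
    zt, zt2 = touch_enc(tac), touch_enc(tac_aug)
    intra = info_nce(zv, zv2) + info_nce(zt, zt2)
    inter = info_nce(zv, zt)
    return intra + inter
```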
- The Power of the Senses: Generalizable Manipulation from Vision and Touch through Masked Multimodal Learning [60.91637862768949]
We propose Masked Multimodal Learning (M3L) to fuse visual and tactile information in a reinforcement learning setting.
M3L learns a policy and visual-tactile representations based on masked autoencoding.
We evaluate M3L on three simulated environments with both visual and tactile observations. A simplified masked-autoencoding sketch follows this entry.
arXiv Detail & Related papers (2023-11-02T01:33:00Z)
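The M3L entry above only says that visual-tactile representations are learned with masked autoencoding. The snippet below is a deliberately simplified, assumed sketch of that idea, in which a learned mask token replaces random positions of the concatenated visual and tactile token sequence and only masked positions are reconstructed; it is not the M3L architecture.

```python
# Simplified masked-autoencoding sketch over concatenated visual and tactile
# tokens; shapes, masking ratio, and names are assumptions, not M3L's code.
import torch
import torch.nn as nn

class MaskedVisuoTactileAE(nn.Module):
    def __init__(self, token_dim=128, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.mask_token = nn.Parameter(torch.zeros(token_dim))
        layer = nn.TransformerEncoderLayer(token_dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.Linear(token_dim, token_dim)

    def forward(self, visual_tokens, tactile_tokens):
        # Concatenate both modalities into one token sequence: (B, N, D).
        tokens = torch.cat([visual_tokens, tactile_tokens], dim=1)
        # Randomly mask positions and swap in the learned mask token.
        mask = torch.rand(tokens.shape[:2], device=tokens.device) < self.mask_ratio
        corrupted = torch.where(mask.unsqueeze(-1), self.mask_token, tokens)
        recon = self.decoder(self.encoder(corrupted))
        # Reconstruction loss on the masked positions only.
        return ((recon - tokens) ** 2)[mask].mean()
```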
- Tactile-Filter: Interactive Tactile Perception for Part Mating [54.46221808805662]
Humans rely on touch and tactile sensing for many dexterous manipulation tasks.
Vision-based tactile sensors are now widely used for various robotic perception and control tasks.
We present a method for interactive perception using vision-based tactile sensors for a part mating task.
arXiv Detail & Related papers (2023-03-10T16:27:37Z)
- Multi-dataset Training of Transformers for Robust Action Recognition [75.5695991766902]
We study the task of robust feature representations, aiming to generalize well on multiple datasets for action recognition.
Here, we propose a novel multi-dataset training paradigm, MultiTrain, with two new loss terms, an informative loss and a projection loss.
We verify the effectiveness of our method on five challenging datasets: Kinetics-400, Kinetics-700, Moments-in-Time, ActivityNet, and Something-Something-V2.
arXiv Detail & Related papers (2022-09-26T01:30:43Z)
- Visual-Tactile Multimodality for Following Deformable Linear Objects Using Reinforcement Learning [15.758583731036007]
We study the problem of using vision and tactile inputs together to complete the task of following deformable linear objects.
We create a Reinforcement Learning agent using different sensing modalities and investigate how its behaviour can be improved.
Our experiments show that the use of both vision and tactile inputs, together with proprioception, allows the agent to complete the task in up to 92% of cases.
arXiv Detail & Related papers (2022-03-31T21:59:08Z)
- TANDEM: Learning Joint Exploration and Decision Making with Tactile Sensors [15.418884994244996]
We focus on the process of guiding tactile exploration, and its interplay with task-related decision making.
We propose TANDEM, an architecture to learn efficient exploration strategies in conjunction with decision making.
We demonstrate this method on a tactile object recognition task, where a robot equipped with a touch sensor must explore and identify an object from a known set based on tactile feedback alone.
arXiv Detail & Related papers (2022-03-01T23:55:09Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.