Combining Vision and Tactile Sensation for Video Prediction
- URL: http://arxiv.org/abs/2304.11193v1
- Date: Fri, 21 Apr 2023 18:02:15 GMT
- Title: Combining Vision and Tactile Sensation for Video Prediction
- Authors: Willow Mandil and Amir Ghalamzan-E
- Abstract summary: We investigate the impact of integrating tactile feedback into video prediction models for physical robot interactions.
We introduce two new datasets of robot pushing that use a magnetic-based tactile sensor for unsupervised learning.
Our results demonstrate that incorporating tactile feedback into video prediction models improves scene prediction accuracy and enhances the agent's perception of physical interactions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we explore the impact of adding tactile sensation to video
prediction models for physical robot interactions. Predicting the impact of
robotic actions on the environment is a fundamental challenge in robotics.
Current methods leverage visual and robot action data to generate video
predictions over a given time period, which can then be used to adjust robot
actions. However, humans rely on both visual and tactile feedback to develop
and maintain a mental model of their physical surroundings. Motivated by this, we
investigate the impact of integrating tactile feedback into video prediction
models for physical robot interactions. We propose three multi-modal
integration approaches and compare the performance of these tactile-enhanced
video prediction models. Additionally, we introduce two new datasets of robot
pushing that use a magnetic-based tactile sensor for unsupervised learning. The
first dataset contains visually identical objects with different physical
properties, while the second dataset mimics existing robot-pushing datasets of
household object clusters. Our results demonstrate that incorporating tactile
feedback into video prediction models improves scene prediction accuracy and
enhances the agent's perception of physical interactions and understanding of
cause-effect relationships during physical robot interactions.
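
The abstract does not describe the three integration approaches in detail, so the snippet below is only a minimal, hypothetical sketch of one way tactile readings could be fused with visual frames and robot actions inside a recurrent video prediction model. All module names (e.g. TactileConditionedPredictor), tensor shapes (64x64 RGB frames, a flattened tactile vector, a low-dimensional action), and hyperparameters are assumptions for illustration, not the authors' architecture.

```python
# Minimal illustrative sketch (PyTorch), assuming simple feature-level fusion.
# Shapes and module names are hypothetical and not taken from the paper.
import torch
import torch.nn as nn


class TactileConditionedPredictor(nn.Module):
    """Hypothetical fusion model: vision + tactile + action -> predicted frames."""

    def __init__(self, tactile_dim=48, action_dim=7, hidden=256):
        super().__init__()
        # Visual encoder: 3x64x64 frame -> feature vector
        self.visual_enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),    # -> 32x32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),   # -> 64x16x16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),  # -> 128x8x8
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, hidden),
        )
        # Tactile encoder: flattened taxel readings -> feature vector
        self.tactile_enc = nn.Sequential(
            nn.Linear(tactile_dim, 64), nn.ReLU(), nn.Linear(64, 64),
        )
        # Recurrent core fuses visual, tactile, and action features over time
        self.rnn = nn.GRUCell(hidden + 64 + action_dim, hidden)
        # Decoder: hidden state -> predicted next RGB frame
        self.decoder = nn.Sequential(
            nn.Linear(hidden, 128 * 8 * 8), nn.Unflatten(1, (128, 8, 8)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, frames, tactile, actions):
        # frames: (B, T, 3, 64, 64); tactile: (B, T, tactile_dim); actions: (B, T, action_dim)
        B, T = frames.shape[:2]
        h = frames.new_zeros(B, self.rnn.hidden_size)
        preds = []
        for t in range(T):
            fused = torch.cat(
                [self.visual_enc(frames[:, t]),
                 self.tactile_enc(tactile[:, t]),
                 actions[:, t]], dim=-1)
            h = self.rnn(fused, h)
            preds.append(self.decoder(h))
        return torch.stack(preds, dim=1)  # (B, T, 3, 64, 64)


# Dummy usage on a toy pushing sequence
model = TactileConditionedPredictor()
frames = torch.rand(2, 5, 3, 64, 64)   # placeholder RGB frames
tactile = torch.rand(2, 5, 48)         # placeholder tactile-sensor readings
actions = torch.rand(2, 5, 7)          # placeholder robot actions
pred = model(frames, tactile, actions) # (2, 5, 3, 64, 64)
```

The sketch fuses modalities by concatenating encoded features before the recurrent core; the paper compares several fusion strategies, so treat this only as a schematic of the general idea.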
Related papers
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing [38.97168020979433]
We introduce an approach that combines visual and tactile sensing for robotic manipulation by learning a neural, tactile-informed dynamics model.
Our proposed framework, RoboPack, employs a recurrent graph neural network to estimate object states.
We demonstrate our approach on a real robot equipped with a compliant Soft-Bubble tactile sensor on non-prehensile manipulation and dense packing tasks.
arXiv Detail & Related papers (2024-07-01T16:08:37Z)
- Predicting Human Impressions of Robot Performance During Navigation Tasks [8.01980632893357]
We investigate the possibility of predicting people's impressions of robot behavior using non-verbal behavioral cues and machine learning techniques.
Results suggest that facial expressions alone provide useful information about human impressions of robot performance.
Supervised learning techniques showed promise because they outperformed humans' predictions of robot performance in most cases.
arXiv Detail & Related papers (2023-10-17T21:12:32Z)
- Action-conditioned Deep Visual Prediction with RoAM, a new Indoor Human Motion Dataset for Autonomous Robots [1.7778609937758327]
We introduce the Robot Autonomous Motion (RoAM) video dataset.
It was collected with a custom-made TurtleBot3 Burger robot in a variety of indoor environments, recording various human motions from the robot's ego-vision.
The dataset also includes synchronized records of the LiDAR scan and all control actions taken by the robot as it navigates around static and moving human agents.
arXiv Detail & Related papers (2023-06-28T00:58:44Z)
- Robot Learning with Sensorimotor Pre-training [98.7755895548928]
We present a self-supervised sensorimotor pre-training approach for robotics.
Our model, called RPT, is a Transformer that operates on sequences of sensorimotor tokens.
We find that sensorimotor pre-training consistently outperforms training from scratch, has favorable scaling properties, and enables transfer across different tasks, environments, and robots.
arXiv Detail & Related papers (2023-06-16T17:58:10Z)
- Tactile-Filter: Interactive Tactile Perception for Part Mating [54.46221808805662]
Humans rely on touch and tactile sensing for many dexterous manipulation tasks.
Vision-based tactile sensors are now widely used for various robotic perception and control tasks.
We present a method for interactive perception using vision-based tactile sensors for a part mating task.
arXiv Detail & Related papers (2023-03-10T16:27:37Z)
- Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects.
We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model.
This work takes a step toward dynamics modeling of hand-object interactions from dense tactile sensing.
arXiv Detail & Related papers (2021-09-09T16:04:14Z)
- Future Frame Prediction for Robot-assisted Surgery [57.18185972461453]
We propose a ternary prior guided variational autoencoder (TPG-VAE) model for future frame prediction in robotic surgical video sequences.
Besides the content distribution, our model learns a motion distribution, a novel component for handling the small movements of surgical tools.
arXiv Detail & Related papers (2021-03-18T15:12:06Z)
- Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works.
However, learning a model that captures the dynamics of complex skills represents a major challenge.
We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.