RA-Touch: Retrieval-Augmented Touch Understanding with Enriched Visual Data
- URL: http://arxiv.org/abs/2505.14270v1
- Date: Tue, 20 May 2025 12:23:21 GMT
- Title: RA-Touch: Retrieval-Augmented Touch Understanding with Enriched Visual Data
- Authors: Yoorhim Cho, Hongyeob Kim, Semin Kim, Youjia Zhang, Yunseok Choi, Sungeun Hong
- Abstract summary: Visuo-tactile perception aims to understand an object's tactile properties, such as texture, softness, and rigidity. We introduce RA-Touch, a retrieval-augmented framework that improves visuo-tactile perception by leveraging visual data enriched with tactile semantics.
- Score: 10.059624183053499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visuo-tactile perception aims to understand an object's tactile properties, such as texture, softness, and rigidity. However, the field remains underexplored because collecting tactile data is costly and labor-intensive. We observe that visually distinct objects can exhibit similar surface textures or material properties. For example, a leather sofa and a leather jacket have different appearances but share similar tactile properties. This implies that tactile understanding can be guided by material cues in visual data, even without direct tactile supervision. In this paper, we introduce RA-Touch, a retrieval-augmented framework that improves visuo-tactile perception by leveraging visual data enriched with tactile semantics. We carefully recaption a large-scale visual dataset with tactile-focused descriptions, enabling the model to access tactile semantics typically absent from conventional visual datasets. A key challenge remains in effectively utilizing these tactile-aware external descriptions. RA-Touch addresses this by retrieving visual-textual representations aligned with tactile inputs and integrating them to focus on relevant textural and material properties. By outperforming prior methods on the TVL benchmark, our method demonstrates the potential of retrieval-based visual reuse for tactile understanding. Code is available at https://aim-skku.github.io/RA-Touch
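The abstract describes a retrieve-then-integrate pipeline: tactile-focused captions are retrieved from the recaptioned visual corpus and fused with the tactile input. The sketch below illustrates one minimal way such a retrieval-and-fusion step could look; the function name, embedding dimensions, and softmax-weighted fusion are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn.functional as F

def retrieve_and_fuse(tactile_query, corpus_embeddings, k=5, temperature=0.07):
    """Retrieve the top-k corpus entries most similar to a tactile query
    embedding and fuse them into a single external context vector.

    tactile_query:     (d,)  embedding of the tactile input
    corpus_embeddings: (N, d) embeddings of tactile-focused captions
    """
    q = F.normalize(tactile_query, dim=-1)
    c = F.normalize(corpus_embeddings, dim=-1)

    sims = c @ q                                  # (N,) cosine similarities
    top_sims, top_idx = sims.topk(k)              # nearest tactile-aware captions
    weights = F.softmax(top_sims / temperature, dim=-1)

    # Weighted sum of the retrieved embeddings = external tactile context
    context = (weights.unsqueeze(-1) * corpus_embeddings[top_idx]).sum(dim=0)

    # Simple fusion: concatenate the tactile query with the retrieved context
    return torch.cat([tactile_query, context], dim=-1)

# Toy usage with random embeddings
d = 512
fused = retrieve_and_fuse(torch.randn(d), torch.randn(10_000, d))
print(fused.shape)  # torch.Size([1024])
```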
Related papers
- RETRO: REthinking Tactile Representation Learning with Material PriOrs [4.938177645099319]
We introduce material-aware priors into the tactile representation learning process. These priors represent pre-learned characteristics specific to different materials, allowing models to better capture and generalize the nuances of surface texture. Our method enables more accurate, contextually rich tactile feedback across diverse materials and textures, improving performance in real-world applications such as robotics, haptic feedback systems, and material editing.
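As a rough illustration of the material-prior idea above, the following hypothetical sketch fuses a per-material prior vector with a tactile feature; the module name, dimensions, and concatenation-based fusion are assumptions, not RETRO's actual design.

```python
import torch
import torch.nn as nn

class MaterialPriorFusion(nn.Module):
    """Hypothetical sketch: fuse a pre-learned per-material prior vector
    with a tactile feature before a downstream head."""

    def __init__(self, num_materials=20, feat_dim=256):
        super().__init__()
        # One prior vector per material class (e.g. leather, metal, fabric);
        # in practice these would be pre-learned, here they are random.
        self.priors = nn.Embedding(num_materials, feat_dim)
        self.proj = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, tactile_feat, material_id):
        prior = self.priors(material_id)                  # (B, feat_dim)
        fused = torch.cat([tactile_feat, prior], dim=-1)  # (B, 2*feat_dim)
        return self.proj(fused)

# Toy usage
m = MaterialPriorFusion()
out = m(torch.randn(4, 256), torch.tensor([0, 3, 3, 7]))
print(out.shape)  # torch.Size([4, 256])
```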
arXiv Detail & Related papers (2025-05-20T13:06:19Z)
- Temporal Binding Foundation Model for Material Property Recognition via Tactile Sequence Perception [2.3724852180691025]
This letter presents a novel approach leveraging a temporal binding foundation model for tactile sequence understanding. The proposed system captures the sequential nature of tactile interactions, similar to human fingertip perception.
arXiv Detail & Related papers (2025-01-24T21:47:38Z)
- Controllable Visual-Tactile Synthesis [28.03469909285511]
We develop a conditional generative model that synthesizes both visual and tactile outputs from a single sketch.
We then introduce a pipeline to render high-quality visual and tactile outputs on an electroadhesion-based haptic device.
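A minimal, hypothetical sketch of a sketch-conditioned generator with separate visual and tactile heads is shown below; the layer choices and output formats are assumptions for illustration and do not reproduce the paper's pipeline or the electroadhesion rendering step.

```python
import torch
import torch.nn as nn

class SketchToVisuoTactile(nn.Module):
    """Hypothetical sketch: one encoder over the input sketch, two decoder
    heads producing a visual image and a tactile (e.g. height) map."""

    def __init__(self, latent_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_dim, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.visual_head = nn.ConvTranspose2d(latent_dim, 3, 4, stride=4)
        self.tactile_head = nn.ConvTranspose2d(latent_dim, 1, 4, stride=4)

    def forward(self, sketch):
        z = self.encoder(sketch)  # shared latent feature map
        return self.visual_head(z), self.tactile_head(z)

# Toy usage on a 64x64 single-channel sketch
model = SketchToVisuoTactile()
rgb, tactile = model(torch.randn(1, 1, 64, 64))
print(rgb.shape, tactile.shape)  # (1, 3, 64, 64) (1, 1, 64, 64)
```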
arXiv Detail & Related papers (2023-05-04T17:59:51Z)
- Tactile-Filter: Interactive Tactile Perception for Part Mating [54.46221808805662]
Humans rely on touch and tactile sensing for a lot of dexterous manipulation tasks.
Vision-based tactile sensors are being widely used for various robotic perception and control tasks.
We present a method for interactive perception using vision-based tactile sensors for a part mating task.
arXiv Detail & Related papers (2023-03-10T16:27:37Z)
- Touch and Go: Learning from Human-Collected Vision and Touch [16.139106833276]
We propose a dataset with paired visual and tactile data called Touch and Go.
Human data collectors probe objects in natural environments using tactile sensors.
Our dataset spans a large number of "in the wild" objects and scenes.
arXiv Detail & Related papers (2022-11-22T18:59:32Z)
- ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and Tactile Representations [52.226947570070784]
We present ObjectFolder, a dataset of 100 objects that addresses both challenges with two key innovations.
First, ObjectFolder encodes the visual, auditory, and tactile sensory data for all objects, enabling a number of multisensory object recognition tasks.
Second, ObjectFolder employs a uniform, object-centric, and implicit representation for each object's visual textures, acoustic simulations, and tactile readings, making the dataset flexible to use and easy to share.
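Below is a toy sketch of what an object-centric implicit representation could look like for the tactile modality: a small MLP conditioned on a per-object latent code maps a query point to a predicted reading. The network shape and inputs are assumptions, not ObjectFolder's actual implementation.

```python
import torch
import torch.nn as nn

class ImplicitTactileField(nn.Module):
    """Hypothetical sketch of an implicit, object-centric representation:
    an MLP maps a per-object latent code plus a query point on the surface
    to a predicted tactile reading at that point."""

    def __init__(self, latent_dim=64, out_dim=1):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim + 3, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, out_dim),
        )

    def forward(self, object_latent, xyz):
        # object_latent: (B, latent_dim), xyz: (B, 3) contact locations
        return self.mlp(torch.cat([object_latent, xyz], dim=-1))

# Toy usage
field = ImplicitTactileField()
reading = field(torch.randn(8, 64), torch.rand(8, 3))
print(reading.shape)  # torch.Size([8, 1])
```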
arXiv Detail & Related papers (2021-09-16T14:00:59Z)
- Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects.
We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model.
This work takes a step toward dynamics modeling of hand-object interactions from dense tactile sensing.
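The cross-modal supervision idea (vision generates labels, touch learns to predict them) can be sketched as follows; the stand-in models, dimensions, and loss are illustrative assumptions rather than the paper's training setup.

```python
import torch
import torch.nn as nn

def cross_modal_training_step(tactile_model, visual_labeler, tactile_input, frame_features):
    """Hypothetical sketch of cross-modal supervision: a frozen visual
    pipeline produces pseudo-labels from synchronized camera data, and the
    tactile model is trained to predict those labels from glove readings."""
    with torch.no_grad():
        pseudo_labels = visual_labeler(frame_features).argmax(dim=-1)  # labels from vision
    logits = tactile_model(tactile_input)                              # predictions from touch
    return nn.functional.cross_entropy(logits, pseudo_labels)

# Toy usage with stand-in linear models and arbitrary dimensions
tactile_model = nn.Linear(512, 10)    # flattened glove readings -> 10 classes
visual_labeler = nn.Linear(2048, 10)  # pooled frame features    -> 10 classes
loss = cross_modal_training_step(
    tactile_model, visual_labeler,
    torch.randn(16, 512), torch.randn(16, 2048),
)
loss.backward()
```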
arXiv Detail & Related papers (2021-09-09T16:04:14Z)
- Elastic Tactile Simulation Towards Tactile-Visual Perception [58.44106915440858]
We propose Elastic Interaction of Particles (EIP) for tactile simulation.
EIP models the tactile sensor as a group of coordinated particles, and the elastic property is applied to regulate the deformation of particles during contact.
We further propose a tactile-visual perception network that enables information fusion between tactile data and visual images.
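A toy sketch of the elastic-particle idea (contact pushes sensor particles while an elastic term pulls them back toward rest) is given below; it covers only a simplified particle update with made-up forces and parameters, and is not the paper's EIP formulation or its tactile-visual fusion network.

```python
import numpy as np

def eip_step(positions, rest_positions, indenter_z, k_elastic=0.5, dt=0.1):
    """Toy elastic-particle update: particles above the indenter surface are
    pushed down in proportion to the penetration depth, while an elastic term
    pulls each particle back toward its rest position, regulating deformation.
    (Simplified illustration only.)"""
    penetration = np.maximum(positions[:, 2] - indenter_z, 0.0)   # contact depth per particle
    contact_force = np.zeros_like(positions)
    contact_force[:, 2] = -penetration                            # push indented particles down
    elastic_force = k_elastic * (rest_positions - positions)      # restoring (elastic) pull
    return positions + dt * (contact_force + elastic_force)

# Toy usage: a 10x10 grid of sensor particles pressed by a flat indenter
grid = np.stack(np.meshgrid(np.linspace(0, 1, 10), np.linspace(0, 1, 10)), axis=-1)
rest = np.concatenate([grid.reshape(-1, 2), np.zeros((100, 1))], axis=-1)
deformed = eip_step(rest.copy(), rest, indenter_z=-0.05)
print(deformed[:, 2].mean())  # particles deflected slightly below the rest plane
```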
arXiv Detail & Related papers (2021-08-11T03:49:59Z)
- Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile priors to guide the shape exploration; and 3) a set of data-driven solutions with either tactile or visuotactile priors.
arXiv Detail & Related papers (2021-07-20T15:56:52Z)
- PyTouch: A Machine Learning Library for Touch Processing [68.32055581488557]
We present PyTouch, the first machine learning library dedicated to the processing of touch sensing signals.
PyTouch is designed to be modular, easy-to-use and provides state-of-the-art touch processing capabilities as a service.
We evaluate PyTouch on real-world data from several tactile sensors on touch processing tasks such as touch detection, slip and object pose estimations.
arXiv Detail & Related papers (2021-05-26T18:55:18Z)
- Teaching Cameras to Feel: Estimating Tactile Physical Properties of Surfaces From Images [4.666400601228301]
We introduce the challenging task of estimating a set of tactile physical properties from visual information.
We construct a first of its kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties.
We develop a cross-modal framework comprised of an adversarial objective and a novel visuo-tactile joint classification loss.
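One minimal way a visuo-tactile joint classification loss could be realized is sketched below: a shared classifier scores both modality embeddings against the same material label. The module and dimensions are assumptions; the paper's adversarial objective is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointVisuoTactileClassifier(nn.Module):
    """Hypothetical sketch of a visuo-tactile joint classification loss:
    a shared classifier scores both the image embedding and the tactile
    embedding, and both are pushed toward the same label."""

    def __init__(self, embed_dim=256, num_classes=15):
        super().__init__()
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, image_embed, tactile_embed, labels):
        loss_img = F.cross_entropy(self.classifier(image_embed), labels)
        loss_tac = F.cross_entropy(self.classifier(tactile_embed), labels)
        return loss_img + loss_tac

# Toy usage
head = JointVisuoTactileClassifier()
loss = head(torch.randn(8, 256), torch.randn(8, 256), torch.randint(0, 15, (8,)))
loss.backward()
```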
arXiv Detail & Related papers (2020-04-29T21:27:26Z)