Related papers: SemanticFeels: Semantic Labeling during In-Hand Manipulation

SemanticFeels: Semantic Labeling during In-Hand Manipulation

URL: http://arxiv.org/abs/2602.14099v1
Date: Sun, 15 Feb 2026 11:22:05 GMT
Title: SemanticFeels: Semantic Labeling during In-Hand Manipulation
Authors: Anas Al Shikh Khalil, Haozhi Qi, Roberto Calandra,
Abstract summary: We present SemanticFeels, an extension of the NeuralFeels framework that integrates semantic labeling with neural implicit shape representation.<n>We show that the system achieves a high correspondence between predicted and actual materials on both single- and multi-material objects.
Score: 4.054377831053792
License: http://creativecommons.org/licenses/by/4.0/
Abstract: As robots become increasingly integrated into everyday tasks, their ability to perceive both the shape and properties of objects during in-hand manipulation becomes critical for adaptive and intelligent behavior. We present SemanticFeels, an extension of the NeuralFeels framework that integrates semantic labeling with neural implicit shape representation, from vision and touch. To illustrate its application, we focus on material classification: high-resolution Digit tactile readings are processed by a fine-tuned EfficientNet-B0 convolutional neural network (CNN) to generate local material predictions, which are then embedded into an augmented signed distance field (SDF) network that jointly predicts geometry and continuous material regions. Experimental results show that the system achieves a high correspondence between predicted and actual materials on both single- and multi-material objects, with an average matching accuracy of 79.87% across multiple manipulation trials on a multi-material object.

Related papers

Noise-Aware Training of Neuromorphic Dynamic Device Networks [2.2691986670431197]
We propose a novel, noise-aware methodology for training device networks. Our approach employs backpropagation through time and cascade learning, allowing networks to effectively exploit the temporal properties of physical devices.
arXiv Detail & Related papers (2024-01-14T22:46:53Z)
Bayesian and Neural Inference on LSTM-based Object Recognition from Tactile and Kinesthetic Information [0.0]
Haptic perception encompasses the sensing modalities encountered in the sense of touch (e.g., tactile and kinesthetic sensations) This letter focuses on multimodal object recognition and proposes analytical and data-driven methodologies to fuse tactile- and kinesthetic-based classification results.
arXiv Detail & Related papers (2023-06-10T12:29:23Z)
Learn to Predict How Humans Manipulate Large-sized Objects from Interactive Motions [82.90906153293585]
We propose a graph neural network, HO-GCN, to fuse motion data and dynamic descriptors for the prediction task. We show the proposed network that consumes dynamic descriptors can achieve state-of-the-art prediction results and help the network better generalize to unseen objects.
arXiv Detail & Related papers (2022-06-25T09:55:39Z)
A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction [1.1816942730023883]
We introduce a deep graph autoencoder to jointly learn recognition and prediction of manipulation tasks from symbolic scene graphs. Our network has a variational autoencoder structure with two branches: one for identifying the input graph type and one for predicting the future graphs. We benchmark our new model against different state-of-the-art methods on two different datasets, MANIAC and MSRC-9, and show that our proposed model can achieve better performance.
arXiv Detail & Related papers (2021-10-25T21:40:42Z)
A Framework for Multisensory Foresight for Embodied Agents [11.351546861334292]
Predicting future sensory states is crucial for learning agents such as robots, drones, and autonomous vehicles. In this paper, we couple multiple sensory modalities with exploratory actions and propose a predictive neural network architecture to address this problem. The framework was tested and validated with a dataset containing 4 sensory modalities (vision, haptic, audio, and tactile) on a humanoid robot performing 9 behaviors multiple times on a large set of objects.
arXiv Detail & Related papers (2021-09-15T20:20:04Z)
Dynamic Modeling of Hand-Object Interactions via Tactile Sensing [133.52375730875696]
In this work, we employ a high-resolution tactile glove to perform four different interactive activities on a diversified set of objects. We build our model on a cross-modal learning framework and generate the labels using a visual processing pipeline to supervise the tactile model. This work takes a step on dynamics modeling in hand-object interactions from dense tactile sensing.
arXiv Detail & Related papers (2021-09-09T16:04:14Z)
Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes. Our method combines the strength of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)
TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks. To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame. Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
Learning Intuitive Physics with Multimodal Generative Models [24.342994226226786]
This paper presents a perception framework that fuses visual and tactile feedback to make predictions about the expected motion of objects in dynamic scenes. We use a novel See-Through-your-Skin (STS) sensor that provides high resolution multimodal sensing of contact surfaces. We validate through simulated and real-world experiments in which the resting state of an object is predicted from given initial conditions.
arXiv Detail & Related papers (2021-01-12T12:55:53Z)
Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild. We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short term-memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos) The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z)
TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition [17.37142241982902]
New advances in flexible, event-driven, electronic skins may soon endow robots with touch perception capabilities similar to humans. These unique features may render current deep learning approaches such as convolutional feature extractors unsuitable for tactile learning. We propose a novel spiking graph neural network for event-based tactile object recognition.
arXiv Detail & Related papers (2020-08-01T03:35:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.