Monocular Reconstruction of Neural Tactile Fields
- URL: http://arxiv.org/abs/2602.12508v1
- Date: Fri, 13 Feb 2026 01:25:19 GMT
- Title: Monocular Reconstruction of Neural Tactile Fields
- Authors: Pavan Mantripragada, Siddhanth Deshmukh, Eadom Dessalene, Manas Desai, Yiannis Aloimonos,
- Abstract summary: We introduce neural tactile fields, a novel 3D representation that maps spatial locations to the expected tactile response upon contact.<n>Our model predicts these neural tactile fields from a single monocular RGB image -- the first method to do so.<n>When integrated with off-the-shelf path planners, neural tactile fields enable robots to generate paths that avoid high-resistance objects.
- Score: 14.002981599280787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robots operating in the real world must plan through environments that deform, yield, and reconfigure under contact, requiring interaction-aware 3D representations that extend beyond static geometric occupancy. To address this, we introduce neural tactile fields, a novel 3D representation that maps spatial locations to the expected tactile response upon contact. Our model predicts these neural tactile fields from a single monocular RGB image -- the first method to do so. When integrated with off-the-shelf path planners, neural tactile fields enable robots to generate paths that avoid high-resistance objects while deliberately routing through low-resistance regions (e.g. foliage), rather than treating all occupied space as equally impassable. Empirically, our learning framework improves volumetric 3D reconstruction by $85.8\%$ and surface reconstruction by $26.7\%$ compared to state-of-the-art monocular 3D reconstruction methods (LRM and Direct3D).
Related papers
- MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction [54.36564144414704]
MeshMimic is an innovative framework that bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn coupled "motion-terrain" interactions directly from video.<n>By leveraging state-of-the-art 3D vision models, our framework precisely segments and reconstructs both human trajectories and the underlying 3D geometry of terrains and objects.
arXiv Detail & Related papers (2026-02-17T17:09:45Z) - Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation [54.04601077224252]
Embodied scene understanding requires not only comprehending visual-spatial information but also determining where to explore next in the 3D physical world.<n>underlinetextbf3D vision-language learning enables embodied agents to effectively explore and understand their environment.<n>model's versatility enables navigation using diverse input modalities, including categories, language descriptions, and reference images.
arXiv Detail & Related papers (2025-07-05T14:15:52Z) - MonoGSDF: Exploring Monocular Geometric Cues for Gaussian Splatting-Guided Implicit Surface Reconstruction [86.87464903285208]
We introduce MonoGSDF, a novel method that couples primitives with a neural Signed Distance Field (SDF) for high-quality reconstruction.<n>To handle arbitrary-scale scenes, we propose a scaling strategy for robust generalization.<n>Experiments on real-world datasets outperforms prior methods while maintaining efficiency.
arXiv Detail & Related papers (2024-11-25T20:07:07Z) - SMORE: Simultaneous Map and Object REconstruction [66.66729715211642]
We present a method for dynamic surface reconstruction of large-scale urban scenes from LiDAR.<n>We take a holistic perspective and optimize a compositional model of a dynamic scene that decomposes the world into rigidly-moving objects and the background.
arXiv Detail & Related papers (2024-06-19T23:53:31Z) - TouchSDF: A DeepSDF Approach for 3D Shape Reconstruction using
Vision-Based Tactile Sensing [29.691786688595762]
Humans rely on their visual and tactile senses to develop a comprehensive 3D understanding of their physical environment.
We propose TouchSDF, a Deep Learning approach for tactile 3D shape reconstruction.
Our technique consists of two components: (1) a Convolutional Neural Network that maps tactile images into local meshes representing the surface at the touch location, and (2) an implicit neural function that predicts a signed distance function to extract the desired 3D shape.
arXiv Detail & Related papers (2023-11-21T13:43:06Z) - Neural Poisson: Indicator Functions for Neural Fields [25.41908065938424]
Implicit neural field generating signed distance field representations (SDFs) of 3D shapes have shown remarkable progress.
We introduce a new paradigm for neural field representations of 3D scenes.
We show that our approach demonstrates state-of-the-art reconstruction performance on both synthetic and real scanned 3D scene data.
arXiv Detail & Related papers (2022-11-25T17:28:22Z) - OmniNeRF: Hybriding Omnidirectional Distance and Radiance fields for
Neural Surface Reconstruction [22.994952933576684]
Ground-breaking research in the neural radiance field (NeRF) has dramatically improved the representation quality of 3D objects.
Some later studies improved NeRF by building truncated signed distance fields (TSDFs) but still suffer from the problem of blurred surfaces in 3D reconstruction.
In this work, this surface ambiguity is addressed by proposing a novel way of 3D shape representation, OmniNeRF.
arXiv Detail & Related papers (2022-09-27T14:39:23Z) - Uncertainty Guided Policy for Active Robotic 3D Reconstruction using
Neural Radiance Fields [82.21033337949757]
This paper introduces a ray-based volumetric uncertainty estimator, which computes the entropy of the weight distribution of the color samples along each ray of the object's implicit neural representation.
We show that it is possible to infer the uncertainty of the underlying 3D geometry given a novel view with the proposed estimator.
We present a next-best-view selection policy guided by the ray-based volumetric uncertainty in neural radiance fields-based representations.
arXiv Detail & Related papers (2022-09-17T21:28:57Z) - Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile priors to guide the shape exploration; and 3) a set of data-driven solutions with either tactile or visuo
arXiv Detail & Related papers (2021-07-20T15:56:52Z) - Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model
Alignments [81.38641691636847]
We rethink the problem of scene reconstruction from an embodied agent's perspective.
We reconstruct an interactive scene using RGB-D data stream.
This reconstructed scene replaces the object meshes in the dense panoptic map with part-based articulated CAD models.
arXiv Detail & Related papers (2021-03-30T05:56:58Z) - Amodal 3D Reconstruction for Robotic Manipulation via Stability and
Connectivity [3.359622001455893]
Learning-based 3D object reconstruction enables single- or few-shot estimation of 3D object models.
Existing 3D reconstruction techniques optimize for visual reconstruction fidelity, typically measured by chamfer distance or voxel IOU.
We propose ARM, an amodal 3D reconstruction system that introduces a stability prior over object shapes, (2) a connectivity prior, and (3) a multi-channel input representation.
arXiv Detail & Related papers (2020-09-28T08:52:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.