MoSS: Monocular Shape Sensing for Continuum Robots
- URL: http://arxiv.org/abs/2303.00891v2
- Date: Tue, 27 Jun 2023 22:40:45 GMT
- Title: MoSS: Monocular Shape Sensing for Continuum Robots
- Authors: Chengnan Shentu, Enxu Li, Chaojun Chen, Puspita Triana Dewi, David B.
Lindell, Jessica Burgner-Kahrs
- Abstract summary: This paper proposes the first eye-to-hand monocular approach to continuum robot shape sensing.
MoSSNet eliminates the cost of stereo matching and reduces requirements on sensing hardware.
A two-segment tendon-driven continuum robot is used for data collection and testing.
- Score: 11.377027568901038
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuum robots are promising candidates for interactive tasks in medical
and industrial applications due to their unique shape, compliance, and
miniaturization capability. Accurate and real-time shape sensing is essential
for such tasks yet remains a challenge. Embedded shape sensing has high
hardware complexity and cost, while vision-based methods require a stereo setup
and struggle to achieve real-time performance. This paper proposes the first
eye-to-hand monocular approach to continuum robot shape sensing. Utilizing a
deep encoder-decoder network, our method, MoSSNet, eliminates the computational
cost of stereo matching and reduces requirements on sensing hardware. In
particular, MoSSNet comprises an encoder and three parallel decoders to uncover
spatial, length, and contour information from a single RGB image, and then
obtains the 3D shape through curve fitting. A two-segment tendon-driven
continuum robot is used for data collection and testing, demonstrating accurate
(mean shape error of 0.91 mm, or 0.36% of robot length) and real-time (70 fps)
shape sensing on real-world data. Additionally, the method is optimized
end-to-end and does not require fiducial markers, manual segmentation, or
camera calibration. Code and datasets will be made available at
https://github.com/ContinuumRoboticsLab/MoSSNet.
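The abstract describes the architecture only at a high level: a shared encoder feeds three parallel decoders, and the decoder outputs are turned into a 3D shape by curve fitting. As a rough illustration of that pipeline, here is a minimal PyTorch sketch; the layer sizes, head output semantics, and the per-axis polynomial curve model are all assumptions for illustration, not the authors' implementation (see the linked repository for the real code).

```python
# Minimal sketch of the described pipeline: a shared encoder, three
# parallel decoder heads, and a curve-fitting step that turns predicted
# centerline points into a continuous 3D shape. All layer sizes, head
# semantics, and the polynomial curve model are illustrative assumptions.
import numpy as np
import torch
import torch.nn as nn

class MoSSNetSketch(nn.Module):
    def __init__(self, num_points: int = 64):
        super().__init__()
        self.num_points = num_points
        # Shared encoder: single RGB image -> global feature vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Three parallel decoders (hypothetical output semantics):
        # spatial -> 3D coordinates of sampled backbone points,
        # length  -> arc-length parameter of each sampled point,
        # contour -> per-point confidence of lying on the robot body.
        self.spatial_head = nn.Linear(128, num_points * 3)
        self.length_head = nn.Linear(128, num_points)
        self.contour_head = nn.Linear(128, num_points)

    def forward(self, rgb: torch.Tensor):
        feat = self.encoder(rgb)  # (B, 128)
        points = self.spatial_head(feat).view(-1, self.num_points, 3)
        return points, self.length_head(feat), self.contour_head(feat)

def fit_curve(points: np.ndarray, degree: int = 5, samples: int = 200) -> np.ndarray:
    """Fit one polynomial per axis over a [0, 1] parameter and resample it,
    giving a smooth 3D curve through the predicted points."""
    t = np.linspace(0.0, 1.0, len(points))
    coeffs = [np.polyfit(t, points[:, axis], degree) for axis in range(3)]
    ts = np.linspace(0.0, 1.0, samples)
    return np.stack([np.polyval(c, ts) for c in coeffs], axis=1)  # (samples, 3)

net = MoSSNetSketch()
frame = torch.rand(1, 3, 256, 256)                # dummy RGB frame
points, lengths, contour = net(frame)
shape_3d = fit_curve(points[0].detach().numpy())  # (200, 3) curve samples
```

Fitting a low-degree curve to the decoder outputs keeps the recovered shape smooth and lets the network predict a fixed, small number of points regardless of the robot's configuration.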
Related papers
- Efficient Slice Anomaly Detection Network for 3D Brain MRI Volume [2.3633885460047765]
Current anomaly detection methods excel with benchmark industrial data but struggle with medical data due to varying definitions of 'normal' and 'abnormal'.
We propose a framework called Simple Slice-based Network (SimpleSliceNet), which utilizes a model pre-trained on ImageNet and fine-tuned on a separate MRI dataset as a 2D slice feature extractor to reduce computational cost.
arXiv Detail & Related papers (2024-08-28T17:20:56Z)
- EmbodiedSAM: Online Segment Any 3D Thing in Real Time [61.2321497708998]
Embodied tasks require the agent to fully understand the 3D scene while simultaneously exploring it.
An online, real-time, fine-grained and highly-generalized 3D perception model is desperately needed.
arXiv Detail & Related papers (2024-08-21T17:57:06Z)
- DatasetNeRF: Efficient 3D-aware Data Factory with Generative Radiance Fields [68.94868475824575]
This paper introduces a novel approach capable of generating infinite, high-quality 3D-consistent 2D annotations alongside 3D point cloud segmentations.
We leverage the strong semantic prior within a 3D generative model to train a semantic decoder.
Once trained, the decoder efficiently generalizes across the latent space, enabling the generation of infinite data.
arXiv Detail & Related papers (2023-11-18T21:58:28Z)
- HEDNet: A Hierarchical Encoder-Decoder Network for 3D Object Detection in Point Clouds [19.1921315424192]
3D object detection in point clouds is important for autonomous driving systems.
A primary challenge in 3D object detection stems from the sparse distribution of points within the 3D scene.
We propose HEDNet, a hierarchical encoder-decoder network for 3D object detection.
arXiv Detail & Related papers (2023-10-31T07:32:08Z)
- Joint-MAE: 2D-3D Joint Masked Autoencoders for 3D Point Cloud Pre-training [65.75399500494343]
Masked Autoencoders (MAE) have shown promising performance in self-supervised learning for 2D and 3D computer vision.
We propose Joint-MAE, a 2D-3D joint MAE framework for self-supervised 3D point cloud pre-training.
arXiv Detail & Related papers (2023-02-27T17:56:18Z)
- StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks [32.7826524859756]
Obstacle detection is a safety-critical problem in robot navigation, where stereo matching is a popular vision-based approach.
This paper proposes a computationally efficient method that leverages a deep neural network to detect occupancy from stereo images directly.
Our approach detects obstacles accurately within a range of 32 meters and achieves better IoU (Intersection over Union) and CD (Chamfer Distance) scores at only 2% of the computational cost of the state-of-the-art stereo model.
arXiv Detail & Related papers (2022-09-18T03:32:38Z)
- MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z)
- Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified, learning-based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)
- Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolutional Neural Network (CNN) to segment the robot hand from an image in an egocentric view.
We fine-tuned the Mask R-CNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
- A data-set of piercing needle through deformable objects for Deep Learning from Demonstrations [0.21096737598952847]
This paper presents a dataset of inserting/piercing a needle in/through soft tissues with the two arms of the da Vinci Research Kit.
We implement several deep RLfD architectures, including simple feed-forward CNNs and different Recurrent Convolutional Networks (RCNs).
Our study indicates that RCNs improve the prediction accuracy of the model, even though the baseline feed-forward CNNs successfully learn the relationship between the visual information and the robot's next-step control actions.
arXiv Detail & Related papers (2020-12-04T08:27:06Z)
- Event-based Robotic Grasping Detection with Neuromorphic Vision Sensor and Event-Stream Dataset [8.030163836902299]
Compared to traditional frame-based computer vision, neuromorphic vision is a small and young research community.
We construct a robotic grasping dataset named Event-Stream dataset with 91 objects.
As the LEDs blink at high frequency, the Event-Stream dataset is annotated at a high frequency of 1 kHz.
We develop a deep neural network for grasping detection that treats the angle learning problem as classification instead of regression (see the sketch after this list).
arXiv Detail & Related papers (2020-04-28T16:55:19Z)
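The classification-instead-of-regression idea in the last entry is straightforward to sketch: discretize the grasp angle into bins and train with cross-entropy rather than predicting a continuous value. The snippet below is a generic illustration, not that paper's network; the bin count and feature size are assumptions.

```python
# Generic sketch of angle-as-classification: discretize the grasp
# orientation into bins and predict a class instead of a scalar.
# The bin count and feature size are illustrative assumptions.
import torch
import torch.nn as nn

NUM_BINS = 18  # 180 degrees in 10-degree bins (grasp angle is symmetric mod pi)

def angle_to_bin(angle_rad: torch.Tensor) -> torch.Tensor:
    """Map angles in radians to bin indices in [0, NUM_BINS)."""
    frac = torch.remainder(angle_rad, torch.pi) / torch.pi  # [0, 1)
    return (frac * NUM_BINS).long().clamp(max=NUM_BINS - 1)

angle_head = nn.Linear(128, NUM_BINS)      # on top of some 128-d backbone feature
criterion = nn.CrossEntropyLoss()

features = torch.randn(8, 128)             # dummy features from a backbone
target_angles = torch.rand(8) * torch.pi   # dummy ground-truth angles
loss = criterion(angle_head(features), angle_to_bin(target_angles))
loss.backward()
```

Classification sidesteps the wrap-around of angles (0 and pi describe the same grasp), which is awkward for a plain regression target.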
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.