Achieving RGB-D level Segmentation Performance from a Single ToF Camera
- URL: http://arxiv.org/abs/2306.17636v1
- Date: Fri, 30 Jun 2023 13:14:27 GMT
- Title: Achieving RGB-D level Segmentation Performance from a Single ToF Camera
- Authors: Pranav Sharma, Jigyasa Singh Katrolia, Jason Rambach, Bruno Mirbach,
Didier Stricker, Juergen Seiler
- Abstract summary: We show that it is possible to obtain the same level of accuracy as RGB-D cameras on a semantic segmentation task using infrared (IR) and depth images from a single Time-of-Flight (ToF) camera.
- Score: 9.99197786343155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Depth is a very important modality in computer vision, typically used as
complementary information to RGB, provided by RGB-D cameras. In this work, we
show that it is possible to obtain the same level of accuracy as RGB-D cameras
on a semantic segmentation task using infrared (IR) and depth images from a
single Time-of-Flight (ToF) camera. In order to fuse the IR and depth
modalities of the ToF camera, we introduce a method utilizing depth-specific
convolutions in a multi-task learning framework. In our evaluation on an in-car
segmentation dataset, we demonstrate the competitiveness of our method against
the more costly RGB-D approaches.
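The abstract's key idea, depth-specific convolutions, can be illustrated with a toy sketch. The paper does not publish this exact formulation; below is a hypothetical simplification in which each neighbor's contribution to a 3x3 convolution is down-weighted by its depth difference to the center pixel, so filters respect geometric boundaries when aggregating IR features. The function name, `alpha` parameter, and weighting scheme are assumptions for illustration only.

```python
import numpy as np

def depth_aware_conv(feat, depth, kernel, alpha=8.3):
    """Toy depth-aware 3x3 convolution (hypothetical sketch, not the
    paper's implementation): neighbors at a very different depth than
    the center pixel contribute almost nothing to the output."""
    H, W = feat.shape
    out = np.zeros_like(feat, dtype=float)
    pad_f = np.pad(feat, 1, mode="edge")
    pad_d = np.pad(depth, 1, mode="edge")
    for i in range(H):
        for j in range(W):
            f_patch = pad_f[i:i + 3, j:j + 3]
            d_patch = pad_d[i:i + 3, j:j + 3]
            # depth-similarity weights: 1 at equal depth, ~0 across a depth edge
            sim = np.exp(-alpha * np.abs(d_patch - depth[i, j]))
            out[i, j] = np.sum(kernel * sim * f_patch)
    return out
```

With a box kernel, a pixel just left of a depth discontinuity averages only over same-depth neighbors instead of blurring across the object boundary, which is the intuition behind letting depth modulate the IR feature aggregation.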
Related papers
- Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer [10.982521876026281]
We introduce a diffusion-based framework to address the RGB-D semantic segmentation problem.
We demonstrate that utilizing a Deformable Attention Transformer as the encoder to extract features from depth images effectively captures the characteristics of invalid regions in depth measurements.
arXiv Detail & Related papers (2024-09-23T15:23:01Z)
- Depth-based Privileged Information for Boosting 3D Human Pose Estimation on RGB [48.31210455404533]
A heatmap-based 3D pose estimator is able to hallucinate depth information from the RGB frames given at inference time.
Depth information is used exclusively during training by enforcing the RGB-based hallucination network to learn features similar to a backbone pre-trained only on depth data.
arXiv Detail & Related papers (2024-09-17T11:59:34Z)
- RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods [30.34690112905212]
Integrating an RGB camera into a ToF imaging system has become a significant technique for perceiving the real world.
This paper comprehensively reviews the works related to RGB guided ToF imaging, including network structures, learning strategies, evaluation metrics, benchmark datasets, and objective functions.
arXiv Detail & Related papers (2024-05-16T17:59:58Z)
- Complementing Event Streams and RGB Frames for Hand Mesh Reconstruction [51.87279764576998]
We propose EvRGBHand -- the first approach for 3D hand mesh reconstruction with an event camera and an RGB camera compensating for each other.
EvRGBHand can tackle overexposure and motion blur issues in RGB-based HMR and foreground scarcity and background overflow issues in event-based HMR.
arXiv Detail & Related papers (2024-03-12T06:04:50Z)
- AGG-Net: Attention Guided Gated-convolutional Network for Depth Image Completion [1.8820731605557168]
We propose a new model for depth image completion based on the Attention Guided Gated-convolutional Network (AGG-Net).
In the encoding stage, an Attention Guided Gated-Convolution (AG-GConv) module is proposed to realize the fusion of depth and color features at different scales.
In the decoding stage, an Attention Guided Skip Connection (AG-SC) module is presented to avoid introducing too many depth-irrelevant features to the reconstruction.
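The gated fusion idea behind modules like AG-GConv can be sketched minimally. This is a hypothetical simplification, not AGG-Net's actual module: a sigmoid gate computed from both modalities decides, per pixel, how much of the depth feature versus the color feature to pass on. The `w_gate` and `b_gate` parameters stand in for learned convolution weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(depth_feat, color_feat, w_gate, b_gate=0.0):
    """Minimal gated-fusion sketch (hypothetical simplification of
    attention-guided gated convolution): a per-pixel gate blends the
    depth and color feature maps."""
    stacked = np.stack([depth_feat, color_feat], axis=0)  # (2, H, W)
    # gate is a learned per-pixel mixture weight in [0, 1]
    gate = sigmoid(np.tensordot(w_gate, stacked, axes=([0], [0])) + b_gate)
    return gate * depth_feat + (1.0 - gate) * color_feat
```

A zero gate input yields an even 50/50 blend; a strongly positive bias makes the module pass the depth feature through almost unchanged, which is how a learned gate can suppress unreliable depth regions in favor of color.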
arXiv Detail & Related papers (2023-09-04T14:16:08Z)
- FloatingFusion: Depth from ToF and Image-stabilized Stereo Cameras [37.812681878193914]
Smartphones now have multimodal camera systems with time-of-flight (ToF) depth sensors and multiple color cameras.
Producing accurate high-resolution depth is still challenging due to the low resolution and limited active illumination power of ToF sensors.
We propose an automatic calibration technique based on dense 2D/3D matching that can estimate camera parameters from a single snapshot.
arXiv Detail & Related papers (2022-10-06T09:57:09Z)
- Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images [89.81919625224103]
Training deep models for RGB-D salient object detection (SOD) often requires a large number of labeled RGB-D images.
We present a Dual-Semi RGB-D Salient Object Detection Network (DS-Net) to leverage unlabeled RGB images for boosting RGB-D saliency detection.
arXiv Detail & Related papers (2022-01-01T03:02:27Z)
- Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation [59.94819184452694]
Depth information has proven to be a useful cue in the semantic segmentation of RGB-D images, providing a geometric counterpart to the RGB representation.
Most existing works simply assume that depth measurements are accurate and well-aligned with the RGB pixels, and model the problem as cross-modal feature fusion.
In this paper, we propose a unified and efficient Cross-modality Guided Encoder that not only effectively recalibrates RGB feature responses, but also distills accurate depth information via multiple stages and aggregates the two recalibrated representations alternately.
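The recalibrate-then-aggregate idea can be illustrated with a toy channel-attention sketch. This is a hypothetical simplification of cross-modality guidance, not the paper's encoder: each modality's channels are re-weighted using the other modality's global statistics, and the recalibrated maps are then summed.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modality_recalibrate(rgb, depth):
    """Toy cross-modality recalibration (hypothetical sketch): each
    modality's (C, H, W) feature map is scaled by channel attention
    derived from the other modality, then the two are aggregated."""
    # channel-wise global average pooling: (C, H, W) -> (C, 1, 1)
    rgb_stats = rgb.mean(axis=(1, 2), keepdims=True)
    depth_stats = depth.mean(axis=(1, 2), keepdims=True)
    rgb_recal = rgb * sigmoid(depth_stats)    # depth guides RGB responses
    depth_recal = depth * sigmoid(rgb_stats)  # RGB helps filter noisy depth
    return rgb_recal + depth_recal            # aggregate the two modalities
```

The point of the design is symmetry: neither modality is trusted unconditionally; each one's contribution is modulated by evidence from the other before fusion.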
arXiv Detail & Related papers (2020-07-17T18:35:24Z)
- Synergistic saliency and depth prediction for RGB-D saliency detection [76.27406945671379]
Existing RGB-D saliency datasets are small, which may lead to overfitting and limited generalization for diverse scenarios.
We propose a semi-supervised system for RGB-D saliency detection that can be trained on smaller RGB-D saliency datasets without saliency ground truth.
arXiv Detail & Related papers (2020-07-03T14:24:41Z)
- Is Depth Really Necessary for Salient Object Detection? [50.10888549190576]
We make the first attempt at realizing a unified depth-aware framework with only RGB information as input for inference.
Our method not only surpasses state-of-the-art performance on five public RGB SOD benchmarks, but also surpasses RGB-D-based methods on five benchmarks by a large margin.
arXiv Detail & Related papers (2020-05-30T13:40:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.