High-speed object detection with a single-photon time-of-flight image
sensor
- URL: http://arxiv.org/abs/2107.13407v1
- Date: Wed, 28 Jul 2021 14:53:44 GMT
- Title: High-speed object detection with a single-photon time-of-flight image
sensor
- Authors: Germán Mora-Martín, Alex Turpin, Alice Ruget, Abderrahim Halimi,
Robert Henderson, Jonathan Leach and Istvan Gyongy
- Abstract summary: We present results from a portable SPAD camera system that outputs 16-bin photon timing histograms with 64x32 spatial resolution.
The results are relevant for safety-critical computer vision applications which would benefit from faster than human reaction times.
- Score: 2.648554238948439
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D time-of-flight (ToF) imaging is used in a variety of applications such as
augmented reality (AR), computer interfaces, robotics and autonomous systems.
Single-photon avalanche diodes (SPADs) are one of the enabling technologies
providing accurate depth data even over long ranges. By developing SPADs in
array format with integrated processing combined with pulsed, flood-type
illumination, high-speed 3D capture is possible. However, array sizes tend to
be relatively small, limiting the lateral resolution of the resulting depth
maps, and, consequently, the information that can be extracted from the image
for applications such as object detection. In this paper, we demonstrate that
these limitations can be overcome through the use of convolutional neural
networks (CNNs) for high-performance object detection. We present outdoor
results from a portable SPAD camera system that outputs 16-bin photon timing
histograms with 64x32 spatial resolution. The results, obtained with exposure
times down to 2 ms (equivalent to 500 FPS) and in signal-to-background (SBR)
ratios as low as 0.05, point to the advantages of providing the CNN with full
histogram data rather than point clouds alone. Alternatively, a combination of
point cloud and active intensity data may be used as input, for a similar level
of performance. In either case, the GPU-accelerated processing time is less
than 1 ms per frame, leading to an overall latency (image acquisition plus
processing) in the millisecond range, making the results relevant for
safety-critical computer vision applications which would benefit from faster
than human reaction times.
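The abstract describes two candidate CNN input formats: the full 16-bin photon timing histogram cube at 64x32 resolution, or a reduced depth-plus-active-intensity representation. The sketch below illustrates the shapes involved; the array dimensions come from the paper, but the Poisson simulation, the peak-bin depth estimate, and the bin width are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

# Simulated SPAD output: one frame of 16-bin photon timing histograms
# at 64x32 spatial resolution (dimensions from the paper).
rng = np.random.default_rng(0)
histograms = rng.poisson(lam=2.0, size=(32, 64, 16))  # rows, cols, time bins

# Option A (favoured in the paper): feed the CNN the full histogram cube.
cnn_input_full = histograms.astype(np.float32)         # shape (32, 64, 16)

# Option B: reduce each per-pixel histogram to a depth value (peak bin)
# plus an "active intensity" value (total photon count) -- a common ToF
# reduction, used here only to show the alternative input's shape.
BIN_WIDTH_M = 0.6  # hypothetical range extent of one time bin, in metres
depth = histograms.argmax(axis=-1) * BIN_WIDTH_M       # shape (32, 64)
intensity = histograms.sum(axis=-1)                    # shape (32, 64)
cnn_input_reduced = np.stack([depth, intensity], axis=-1).astype(np.float32)

print(cnn_input_full.shape, cnn_input_reduced.shape)
```

Both representations carry the same photon data, but the full histogram preserves multi-peak and background information that a single depth value discards, which is consistent with the paper's finding that histogram input performs best at low SBR.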
Related papers
- Practical cross-sensor color constancy using a dual-mapping strategy [0.0]
The proposed method uses a dual-mapping strategy and only requires a simple white point from a test sensor under a D65 condition.
In the second mapping phase, we transform the re-constructed image data into sparse features, which are then optimized with a lightweight multi-layer perceptron (MLP) model.
This approach effectively reduces sensor discrepancies and delivers performance on par with leading cross-sensor methods.
arXiv Detail & Related papers (2023-11-20T13:58:59Z) - Shakes on a Plane: Unsupervised Depth Estimation from Unstabilized
Photography [54.36608424943729]
We show that in a "long-burst", forty-two 12-megapixel RAW frames captured in a two-second sequence, there is enough parallax information from natural hand tremor alone to recover high-quality scene depth.
We devise a test-time optimization approach that fits a neural RGB-D representation to long-burst data and simultaneously estimates scene depth and camera motion.
arXiv Detail & Related papers (2022-12-22T18:54:34Z) - Video super-resolution for single-photon LIDAR [0.0]
3D Time-of-Flight (ToF) image sensors are used widely in applications such as self-driving cars, Augmented Reality (AR) and robotics.
In this paper, we use synthetic depth sequences to train a 3D Convolutional Neural Network (CNN) for denoising and upscaling (x4) depth data.
With GPU acceleration, frames are processed at >30 frames per second, making the approach suitable for low-latency imaging, as required for obstacle avoidance.
arXiv Detail & Related papers (2022-10-19T11:33:29Z) - A direct time-of-flight image sensor with in-pixel surface detection and
dynamic vision [0.0]
3D flash LIDAR is an alternative to the traditional scanning LIDAR systems, promising precise depth imaging in a compact form factor.
We present a 64x32 pixel (256x128 SPAD) dToF imager that overcomes these limitations by using pixels with embedded histogramming.
This reduces the size of output data frames considerably, enabling maximum frame rates in the 10 kFPS range or 100 kFPS for direct depth readings.
arXiv Detail & Related papers (2022-09-23T14:38:00Z) - Single-Photon Structured Light [31.614032717665832]
"Single-Photon Structured Light" works by sensing binary images that indicates the presence or absence of photon arrivals during each exposure.
We develop novel temporal sequences using error correction codes that are designed to be robust to short-range effects like projector and camera defocus.
Our lab prototype is capable of 3D imaging in challenging scenarios involving objects with extremely low albedo or undergoing fast motion.
arXiv Detail & Related papers (2022-04-11T17:57:04Z) - SALISA: Saliency-based Input Sampling for Efficient Video Object
Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z) - VPFNet: Improving 3D Object Detection with Virtual Point based LiDAR and
Stereo Data Fusion [62.24001258298076]
VPFNet is a new architecture that cleverly aligns and aggregates the point cloud and image data at the "virtual" points.
Our VPFNet achieves 83.21% moderate 3D AP and 91.86% moderate BEV AP on the KITTI test set, ranking 1st since May 21st, 2021.
arXiv Detail & Related papers (2021-11-29T08:51:20Z) - Expandable YOLO: 3D Object Detection from RGB-D Images [64.14512458954344]
This paper aims at constructing a light-weight object detector that inputs a depth and a color image from a stereo camera.
By extending the network architecture of YOLOv3 to 3D in the middle, it is possible to output in the depth direction.
Intersection over Union (IoU) in 3D space is introduced to confirm the accuracy of region extraction results.
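The 3D IoU metric mentioned above generalises the usual 2D box overlap to volumes. A minimal sketch for axis-aligned boxes follows; this is a generic illustration of the metric, not the paper's implementation, and real detectors often also handle rotated boxes.

```python
def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    # Overlap extent along each axis (zero if the boxes are disjoint there)
    dx = max(0.0, min(a[3], b[3]) - max(a[0], b[0]))
    dy = max(0.0, min(a[4], b[4]) - max(a[1], b[1]))
    dz = max(0.0, min(a[5], b[5]) - max(a[2], b[2]))
    inter = dx * dy * dz
    vol_a = (a[3] - a[0]) * (a[4] - a[1]) * (a[5] - a[2])
    vol_b = (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    return inter / (vol_a + vol_b - inter)

# Two unit cubes offset by 0.5 along x: intersection 0.5, union 1.5
print(iou_3d((0, 0, 0, 1, 1, 1), (0.5, 0, 0, 1.5, 1, 1)))  # 0.333...
```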
arXiv Detail & Related papers (2020-06-26T07:32:30Z) - Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled
Representation [57.11299763566534]
We present a solution to recover 3D pose from multi-view images captured with spatially calibrated cameras.
We exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera view-points.
Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2d detections.
arXiv Detail & Related papers (2020-04-05T12:52:29Z) - Real-Time High-Performance Semantic Image Segmentation of Urban Street
Scenes [98.65457534223539]
We propose a real-time high-performance DCNN-based method for robust semantic segmentation of urban street scenes.
The proposed method achieves the accuracy of 73.6% and 68.0% mean Intersection over Union (mIoU) with the inference speed of 51.0 fps and 39.3 fps.
arXiv Detail & Related papers (2020-03-11T08:45:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.