DPDnet: A Robust People Detector using Deep Learning with an Overhead Depth Camera
- URL: http://arxiv.org/abs/2006.01053v1
- Date: Mon, 1 Jun 2020 16:28:25 GMT
- Title: DPDnet: A Robust People Detector using Deep Learning with an Overhead Depth Camera
- Authors: David Fuentes-Jimenez, Roberto Martin-Lopez, Cristina Losada-Gutierrez, David Casillas-Perez, Javier Macias-Guarasa, Daniel Pizarro, Carlos A. Luna
- Abstract summary: We propose a method that detects multiple people from a single overhead depth image with high reliability.
Our neural network, called DPDnet, consists of two fully-convolutional encoder-decoder blocks built on residual layers.
The experimental work shows that DPDnet outperforms state-of-the-art methods, with accuracies greater than 99% on three different publicly available datasets.
- Score: 9.376814409561726
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we propose a method based on deep learning that detects multiple people from a single overhead depth image with high reliability. Our neural network, called DPDnet, consists of two fully-convolutional encoder-decoder blocks built on residual layers. The main block takes a depth image as input and generates a pixel-wise confidence map, where each detected person in the image is represented by a Gaussian-like distribution. The refinement block combines the depth image and the output of the main block to refine the confidence map. Both blocks are trained simultaneously and end-to-end using depth images and head position labels. The experimental work shows that DPDnet outperforms state-of-the-art methods, with accuracies greater than 99% on three different publicly available datasets, without retraining or fine-tuning. In addition, the computational cost of our proposal is independent of the number of people in the scene, and the method runs in real time on conventional GPUs.
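The paper's code is not reproduced here; the following is a minimal PyTorch sketch, under stated assumptions, of the two-block design the abstract describes: a main encoder-decoder that maps a depth image to a confidence map, and a refinement block that re-processes the depth image concatenated with that map. Layer counts, channel widths, and the Gaussian spread sigma are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

class EncoderDecoder(nn.Module):
    """One fully-convolutional encoder-decoder block with residual layers."""
    def __init__(self, in_channels, width=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, width, 3, stride=2, padding=1),   # downsample H/2
            ResidualBlock(width),
            nn.Conv2d(width, 2 * width, 3, stride=2, padding=1),     # downsample H/4
            ResidualBlock(2 * width),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(2 * width, width, 4, stride=2, padding=1),  # upsample
            ResidualBlock(width),
            nn.ConvTranspose2d(width, width, 4, stride=2, padding=1),
            nn.Conv2d(width, 1, 3, padding=1),
            nn.Sigmoid(),  # per-pixel confidence in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

class DPDnetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.main_block = EncoderDecoder(in_channels=1)    # depth -> coarse confidence
        self.refine_block = EncoderDecoder(in_channels=2)  # depth + coarse -> refined

    def forward(self, depth):
        coarse = self.main_block(depth)
        refined = self.refine_block(torch.cat([depth, coarse], dim=1))
        return coarse, refined

def gaussian_target(head_xy, height, width, sigma=8.0):
    """Render the training target: one Gaussian-like bump per labeled head position."""
    ys = torch.arange(height).view(-1, 1).float()
    xs = torch.arange(width).view(1, -1).float()
    target = torch.zeros(height, width)
    for x, y in head_xy:
        bump = torch.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
        target = torch.maximum(target, bump)
    return target  # shape (H, W), one peak per person
```

In this sketch, both outputs would be trained against gaussian_target with, e.g., an MSE loss, matching the end-to-end scheme described above; detections are then local maxima of the refined map. Input sizes divisible by 4 are an assumption of this sketch.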
Related papers
- Pixel-Aligned Multi-View Generation with Depth Guided Decoder [86.1813201212539]
We propose a novel method for pixel-level image-to-multi-view generation.
Unlike prior work, we incorporate attention layers across multi-view images in the VAE decoder of a latent video diffusion model.
Our model enables better pixel alignment across multi-view images (see the cross-view attention sketch after this list).
arXiv Detail & Related papers (2024-08-26T04:56:41Z)
- Depth Is All You Need for Monocular 3D Detection [29.403235118234747]
We propose to align the depth representation with the target domain in an unsupervised fashion.
Our method leverages commonly available LiDAR or RGB videos at training time to fine-tune the depth representation, which leads to improved 3D detectors.
arXiv Detail & Related papers (2022-10-05T18:12:30Z)
- P3Depth: Monocular Depth Estimation with a Piecewise Planarity Prior [133.76192155312182]
We propose a method that learns to selectively leverage information from coplanar pixels to improve the predicted depth.
An extensive evaluation of our method shows that we set the new state of the art in supervised monocular depth estimation.
arXiv Detail & Related papers (2022-04-05T10:03:52Z)
- Least Square Estimation Network for Depth Completion [11.840223815711004]
In this paper, we propose an effective image representation method for depth completion tasks.
The input of our system is a monocular camera frame and a synchronized sparse depth map.
Experiments show that our results beat the state of the art on the NYU-Depth-V2 dataset in both accuracy and runtime.
arXiv Detail & Related papers (2022-03-07T11:52:57Z)
- IB-MVS: An Iterative Algorithm for Deep Multi-View Stereo based on Binary Decisions [0.0]
We present a novel deep-learning-based method for Multi-View Stereo.
Our method estimates high-resolution and highly precise depth maps iteratively, by traversing the continuous space of feasible depth values at each pixel in a binary decision fashion (see the binary-search sketch after this list).
We compare our method with state-of-the-art Multi-View Stereo methods on the DTU, Tanks and Temples and the challenging ETH3D benchmarks and show competitive results.
arXiv Detail & Related papers (2021-11-29T10:04:24Z)
- Learning Joint 2D-3D Representations for Depth Completion [90.62843376586216]
We design a simple yet effective neural network block that learns to extract joint 2D and 3D features.
Specifically, the block consists of two domain-specific sub-networks that apply 2D convolution on image pixels and continuous convolution on 3D points (see the continuous-convolution sketch after this list).
arXiv Detail & Related papers (2020-12-22T22:58:29Z)
- Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the dual-pixel (DP) pair, which links the blur and the depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z)
- Defocus Blur Detection via Depth Distillation [64.78779830554731]
We introduce depth information into defocus blur detection (DBD) for the first time.
In detail, we learn the defocus blur from ground truth and distill depth from a well-trained depth estimation network (see the distillation-loss sketch after this list).
Our approach outperforms 11 other state-of-the-art methods on two popular datasets.
arXiv Detail & Related papers (2020-07-16T04:58:09Z)
- Towards Dense People Detection with Deep Learning and Depth images [9.376814409561726]
This paper proposes a DNN-based system that detects multiple people from a single depth image.
Our neural network processes a depth image and outputs a likelihood map in image coordinates.
We show this strategy to be effective, producing networks that generalize to work with scenes different from those used during training.
arXiv Detail & Related papers (2020-07-14T16:43:02Z)
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely on depth-from-defocus cues instead of different views.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
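For "Pixel-Aligned Multi-View Generation with Depth Guided Decoder", the sketch below shows one way attention across multi-view images can be added to decoder features: the views attend to each other at every spatial position, which is what keeps the outputs pixel-aligned. The tensor layout and the use of nn.MultiheadAttention are assumptions; the paper's integration into the VAE decoder of a latent video diffusion model is more involved.

```python
import torch
import torch.nn as nn

class CrossViewAttention(nn.Module):
    """Attend across the V views at every spatial position (hypothetical sketch)."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feats):
        # feats: (B, V, C, H, W) decoder features for V views of the same scene
        B, V, C, H, W = feats.shape
        x = feats.permute(0, 3, 4, 1, 2).reshape(B * H * W, V, C)  # views as a sequence
        y, _ = self.attn(x, x, x)          # each view attends to all views at this pixel
        x = self.norm(x + y)               # residual + norm, as in standard transformers
        return x.reshape(B, H, W, V, C).permute(0, 3, 4, 1, 2)

# Usage: insert between decoder stages so multi-view features stay pixel-aligned.
feats = torch.randn(2, 4, 64, 32, 32)      # B=2 scenes, V=4 views, C=64, 32x32 features
out = CrossViewAttention(64)(feats)
print(out.shape)                            # torch.Size([2, 4, 64, 32, 32])
```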
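The IB-MVS entry above describes traversing the feasible depth range with per-pixel binary decisions. The sketch below reduces that idea to a plain binary search, with a toy oracle standing in for the paper's learned decision network; the function names and convergence setup are illustrative assumptions.

```python
import torch

def binary_depth_search(decide, depth_min, depth_max, shape, iterations=10):
    """Per-pixel binary search over [depth_min, depth_max] (simplified IB-MVS sketch).

    decide(depth_hypothesis) must return a bool tensor of `shape`, True where the
    true depth lies beyond the hypothesis. In IB-MVS this decision is produced by
    a learned network from multi-view image features.
    """
    lo = torch.full(shape, depth_min)
    hi = torch.full(shape, depth_max)
    for _ in range(iterations):                 # the interval halves each step
        mid = 0.5 * (lo + hi)
        go_deeper = decide(mid)
        lo = torch.where(go_deeper, mid, lo)    # true depth is behind mid
        hi = torch.where(go_deeper, hi, mid)    # true depth is in front of mid
    return 0.5 * (lo + hi)

# Toy oracle: pretend the true depth map is known, just to show convergence.
true_depth = torch.rand(4, 4) * 9.0 + 1.0       # depths in [1, 10]
est = binary_depth_search(lambda d: true_depth > d, 1.0, 10.0, (4, 4))
print((est - true_depth).abs().max())           # ~ (10 - 1) / 2**11
```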
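For "Learning Joint 2D-3D Representations for Depth Completion", here is a minimal sketch of the continuous-convolution half of the block: each 3D point aggregates neighbor features through an MLP conditioned on the relative offset. The kNN construction via torch.cdist and the MLP shape are assumptions, not the paper's exact operator.

```python
import torch
import torch.nn as nn

class ContinuousConv(nn.Module):
    """Continuous convolution over a point cloud (minimal sketch).

    For each point, the features of its k nearest neighbors are transformed
    by an MLP conditioned on the relative 3D offset, then summed.
    """
    def __init__(self, in_dim, out_dim, k=8):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Linear(in_dim + 3, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, xyz, feats):
        # xyz: (N, 3) point coordinates, feats: (N, C) point features
        idx = torch.cdist(xyz, xyz).topk(self.k, largest=False).indices  # (N, k) neighbors
        offsets = xyz[idx] - xyz.unsqueeze(1)          # (N, k, 3) relative positions
        neighbor_feats = feats[idx]                    # (N, k, C)
        msg = self.mlp(torch.cat([neighbor_feats, offsets], dim=-1))
        return msg.sum(dim=1)                          # (N, out_dim)

xyz = torch.randn(100, 3)
feats = torch.randn(100, 16)
out = ContinuousConv(16, 32)(xyz, feats)
print(out.shape)  # torch.Size([100, 32])
```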
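For "Defocus Blur Detection via Depth Distillation", one plausible reading of the training objective is sketched below: blur is supervised with ground truth while depth is distilled from a frozen, pre-trained depth network. The specific losses (BCE, L1) and the weight alpha are assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(pred_blur, gt_blur, pred_depth, teacher_depth, alpha=0.5):
    """Joint loss: supervised blur detection + depth distilled from a frozen teacher."""
    blur_loss = F.binary_cross_entropy_with_logits(pred_blur, gt_blur)
    depth_loss = F.l1_loss(pred_depth, teacher_depth.detach())  # teacher is not updated
    return blur_loss + alpha * depth_loss

# Toy shapes: per-pixel blur logits and depth maps.
pred_blur = torch.randn(2, 1, 64, 64)
gt_blur = torch.randint(0, 2, (2, 1, 64, 64)).float()
pred_depth = torch.rand(2, 1, 64, 64)
teacher_depth = torch.rand(2, 1, 64, 64)
print(distillation_loss(pred_blur, gt_blur, pred_depth, teacher_depth))
```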