Related papers: DR-SPAAM: A Spatial-Attention and Auto-regressive Model for Person Detection in 2D Range Data

DR-SPAAM: A Spatial-Attention and Auto-regressive Model for Person Detection in 2D Range Data

URL: http://arxiv.org/abs/2004.14079v2
Date: Fri, 31 Jul 2020 16:43:53 GMT
Title: DR-SPAAM: A Spatial-Attention and Auto-regressive Model for Person Detection in 2D Range Data
Authors: Dan Jia, Alexander Hermans, and Bastian Leibe
Abstract summary: We propose a person detection network which uses an alternative strategy to combine scans obtained at different times. DR-SPAAM keeps the intermediate features from the backbone network as a template and recurrently updates the template when a new scan becomes available. On the DROW dataset, our method outperforms the existing state-of-the-art, while being approximately four times faster.
Score: 81.06749792332641
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Detecting persons using a 2D LiDAR is a challenging task due to the low information content of 2D range data. To alleviate the problem caused by the sparsity of the LiDAR points, current state-of-the-art methods fuse multiple previous scans and perform detection using the combined scans. The downside of such a backward looking fusion is that all the scans need to be aligned explicitly, and the necessary alignment operation makes the whole pipeline more expensive -- often too expensive for real-world applications. In this paper, we propose a person detection network which uses an alternative strategy to combine scans obtained at different times. Our method, Distance Robust SPatial Attention and Auto-regressive Model (DR-SPAAM), follows a forward looking paradigm. It keeps the intermediate features from the backbone network as a template and recurrently updates the template when a new scan becomes available. The updated feature template is in turn used for detecting persons currently in the scene. On the DROW dataset, our method outperforms the existing state-of-the-art, while being approximately four times faster, running at 87.2 FPS on a laptop with a dedicated GPU and at 22.6 FPS on an NVIDIA Jetson AGX embedded GPU. We release our code in PyTorch and a ROS node including pre-trained models.

Related papers

What Matters in Range View 3D Object Detection [15.147558647138629]
Lidar-based perception pipelines rely on 3D object detection models to interpret complex scenes. We achieve state-of-the-art amongst range-view 3D object detection models without using multiple techniques proposed in past range-view literature.
arXiv Detail & Related papers (2024-07-23T18:42:37Z)
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images. We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image. We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z)
Fully Sparse Fusion for 3D Object Detection [69.32694845027927]
Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View feature maps. Fully sparse architecture is gaining attention as they are highly efficient in long-range perception. In this paper, we study how to effectively leverage image modality in the emerging fully sparse architecture.
arXiv Detail & Related papers (2023-04-24T17:57:43Z)
MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer for Autonomous Driving [0.0]
We propose MSF3DDETR: Multi-Sensor Fusion 3D Detection Transformer architecture to fuse image and LiDAR features to improve the detection accuracy. Our end-to-end single-stage, anchor-free and NMS-free network takes in multi-view images and LiDAR point clouds and predicts 3D bounding boxes. MSF3DDETR network is trained end-to-end on the nuScenes dataset using Hungarian algorithm based bipartite matching and set-to-set loss inspired by DETR.
arXiv Detail & Related papers (2022-10-27T10:55:15Z)
P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation [78.83305967085413]
This paper introduces a novel Pre-trained Spatial Temporal Many-to-One (P-STMO) model for 2D-to-3D human pose estimation task. Our method outperforms state-of-the-art methods with fewer parameters and less computational overhead.
arXiv Detail & Related papers (2022-03-15T04:00:59Z)
Embracing Single Stride 3D Object Detector with Sparse Transformer [63.179720817019096]
In LiDAR-based 3D object detection for autonomous driving, the ratio of the object size to input scene size is significantly smaller compared to 2D detection cases. Many 3D detectors directly follow the common practice of 2D detectors, which downsample the feature maps even after quantizing the point clouds. We propose Single-stride Sparse Transformer (SST) to maintain the original resolution from the beginning to the end of the network.
arXiv Detail & Related papers (2021-12-13T02:12:02Z)
2nd Place Solution for Waymo Open Dataset Challenge - Real-time 2D Object Detection [26.086623067939605]
In this report, we introduce a real-time method to detect the 2D objects from images. We leverage accelerationRT to optimize the inference time of our detection pipeline. Our framework achieves the latency of 45.8ms/frame on an Nvidia Tesla V100 GPU.
arXiv Detail & Related papers (2021-06-16T11:32:03Z)
Temp-Frustum Net: 3D Object Detection with Temporal Fusion [0.0]
3D object detection is a core component of automated driving systems. Frame-by-frame 3D object detection suffers from noise, field-of-view obstruction, and sparsity. We propose a novel Temporal Fusion Module to mitigate these problems.
arXiv Detail & Related papers (2021-04-25T09:08:14Z)
Self-Supervised Person Detection in 2D Range Data using a Calibrated Camera [83.31666463259849]
We propose a method to automatically generate training labels (called pseudo-labels) for 2D LiDAR-based person detectors. We show that self-supervised detectors, trained or fine-tuned with pseudo-labels, outperform detectors trained using manual annotations. Our method is an effective way to improve person detectors during deployment without any additional labeling effort.
arXiv Detail & Related papers (2020-12-16T12:10:04Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.