Semantic and Articulated Pedestrian Sensing Onboard a Moving Vehicle
- URL: http://arxiv.org/abs/2309.06313v1
- Date: Tue, 12 Sep 2023 15:24:26 GMT
- Title: Semantic and Articulated Pedestrian Sensing Onboard a Moving Vehicle
- Authors: Maria Priisalu
- Abstract summary: It is difficult to perform 3D reconstruction from video gathered on a vehicle due to the vehicle's large forward motion.
Recently, Light Detection And Ranging (LiDAR) sensors have become popular for estimating depth directly, without the need for 3D reconstruction.
We hypothesize that benchmarks targeted at articulated human sensing from LiDAR data could spur research in human sensing and prediction in traffic.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: It is difficult to perform 3D reconstruction from video gathered on a vehicle
due to the vehicle's large forward motion. Even object detection and human
sensing models perform significantly worse on onboard video than on standard
benchmarks: objects often appear farther from the camera than in standard
object detection benchmarks, image quality is often degraded by motion blur,
and occlusions are frequent. This has led to the popularisation of
traffic-data-specific benchmarks. Recently, Light Detection And Ranging (LiDAR)
sensors have become popular because they estimate depth directly, without the
need for 3D reconstruction. However, LiDAR-based methods still lag behind
image-based methods in articulated human detection at a distance. We
hypothesize that benchmarks targeted at articulated human sensing from LiDAR
data could spur research in human sensing and prediction in traffic and could
lead to improved traffic safety for pedestrians.
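The contrast the abstract draws between reconstruction and direct ranging can be made concrete: each LiDAR return already carries a range measurement, so a single scan converts to 3D points with simple trigonometry, whereas camera pixels require multi-view correspondence before depth is known. A minimal Python sketch, assuming one common spherical convention (actual frames and conventions vary by sensor):

```python
import numpy as np

def lidar_return_to_point(range_m: float, azimuth_rad: float,
                          elevation_rad: float) -> np.ndarray:
    """Convert one LiDAR return from spherical sensor coordinates to a
    Cartesian 3D point. Unlike a camera pixel, each return carries its
    own range measurement, so no multi-view reconstruction is needed."""
    x = range_m * np.cos(elevation_rad) * np.cos(azimuth_rad)
    y = range_m * np.cos(elevation_rad) * np.sin(azimuth_rad)
    z = range_m * np.sin(elevation_rad)
    return np.array([x, y, z])

# Example: a return 25 m away, 10 degrees to the left, level with the sensor.
print(lidar_return_to_point(25.0, np.deg2rad(10.0), 0.0))
# ~[24.62, 4.34, 0.0] in the sensor frame
```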
Related papers
- Vision-based Lifting of 2D Object Detections for Automated Driving [8.321333802704446]
We propose a pipeline which lifts the results of existing vision-based 2D algorithms to 3D detections using only cameras. To the best of our knowledge, we are the first to use a 2D CNN to process the point cloud for each 2D detection in order to keep the computational effort as low as possible. (A frustum-cropping sketch of the lifting step follows this entry.)
arXiv Detail & Related papers (2025-06-13T14:40:12Z)
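The summary does not spell out the authors' pipeline, but a common first step when lifting a 2D detection into 3D is to crop the points whose image projection falls inside the detected box (a frustum crop). A minimal sketch of that step, assuming a pinhole camera and points already expressed in the camera frame; `frustum_crop` and the toy intrinsics are illustrative, not the paper's code:

```python
import numpy as np

def frustum_crop(points_cam: np.ndarray, box_2d: tuple,
                 K: np.ndarray) -> np.ndarray:
    """Keep the 3D points (camera frame, z forward) whose pinhole
    projection lands inside a 2D box (u_min, v_min, u_max, v_max)."""
    u_min, v_min, u_max, v_max = box_2d
    z = points_cam[:, 2]
    in_front = z > 0  # only points in front of the camera project validly
    safe_z = np.where(in_front, z, 1.0)  # avoid division by zero
    u = K[0, 0] * points_cam[:, 0] / safe_z + K[0, 2]
    v = K[1, 1] * points_cam[:, 1] / safe_z + K[1, 2]
    mask = in_front & (u >= u_min) & (u <= u_max) & (v >= v_min) & (v <= v_max)
    return points_cam[mask]

# Toy example: random points in front of the camera, one detection box.
K = np.array([[700.0, 0.0, 640.0], [0.0, 700.0, 360.0], [0.0, 0.0, 1.0]])
pts = np.random.uniform([-10, -2, 1], [10, 2, 40], size=(1000, 3))
print(frustum_crop(pts, (600, 300, 700, 420), K).shape)
```

The cropped points can then be fed to a small per-detection network, as the summary describes.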
- Learning better representations for crowded pedestrians in offboard LiDAR-camera 3D tracking-by-detection [14.56852056332248]
We build an offboard auto-labeling system that reconstructs pedestrian trajectories from LiDAR point clouds and multi-view images. Our approach significantly improves 3D pedestrian tracking performance, enabling higher auto-labeling efficiency.
arXiv Detail & Related papers (2025-05-21T21:18:26Z)
- Street Gaussians without 3D Object Tracker [86.62329193275916]
Existing methods rely on labor-intensive manual labeling of object poses to reconstruct dynamic objects in canonical space.
We propose a stable object tracking module by leveraging associations from 2D deep trackers within a 3D object fusion strategy.
We address inevitable tracking errors by further introducing a motion learning strategy in an implicit feature space that autonomously corrects trajectory errors and recovers missed detections.
arXiv Detail & Related papers (2024-12-07T05:49:42Z)
- Better Monocular 3D Detectors with LiDAR from the Past [64.6759926054061]
Camera-based 3D detectors often suffer inferior performance compared to LiDAR-based counterparts due to inherent depth ambiguities in images.
In this work, we seek to improve monocular 3D detectors by leveraging unlabeled historical LiDAR data.
We show consistent and significant performance gains across multiple state-of-the-art models and datasets, with a negligible additional latency of 9.66 ms and a small storage cost.
arXiv Detail & Related papers (2024-04-08T01:38:43Z)
- CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking [40.630532348405595]
Camera-RADAR 3D Detection and Tracking (CR3DT) is a camera-RADAR fusion model for 3D object detection and Multi-Object Tracking (MOT).
Building upon the foundations of the State-of-the-Art (SotA) camera-only BEVDet architecture, CR3DT demonstrates substantial improvements in both detection and tracking capabilities.
arXiv Detail & Related papers (2024-03-22T16:06:05Z)
- LET-3D-AP: Longitudinal Error Tolerant 3D Average Precision for Camera-Only 3D Detection [26.278496981844317]
We propose variants of the 3D AP metric to be more permissive with respect to depth estimation errors.
Specifically, our novel longitudinal error tolerant metrics, LET-3D-AP and LET-3D-APL, allow longitudinal localization errors up to a given tolerance.
Under the new metrics, we find that state-of-the-art camera-based detectors can outperform popular LiDAR-based detectors once the depth error tolerance exceeds 10%. (A toy sketch of longitudinal-error-tolerant matching follows this entry.)
arXiv Detail & Related papers (2022-06-15T17:57:41Z)
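To make the longitudinal tolerance concrete, here is a toy match predicate in the spirit of LET-3D-AP: the center error is decomposed into a component along the sensor-to-object line of sight (longitudinal) and the remainder (lateral), and only the longitudinal part receives the relaxed, range-proportional threshold. The thresholds and the lateral test below are illustrative; the official metric (including LET-3D-APL's alignment-based scoring) is defined in the paper:

```python
import numpy as np

def let_match(pred_center: np.ndarray, gt_center: np.ndarray,
              tol_frac: float = 0.10, lateral_tol_m: float = 0.5) -> bool:
    """Accept a prediction if its longitudinal center error is within
    tol_frac of the ground-truth range and its lateral error is small."""
    range_gt = np.linalg.norm(gt_center)
    los = gt_center / range_gt                # unit line-of-sight vector
    err = pred_center - gt_center
    longitudinal = np.dot(err, los)           # signed error along the ray
    lateral = np.linalg.norm(err - longitudinal * los)
    return abs(longitudinal) <= tol_frac * range_gt and lateral <= lateral_tol_m

# A prediction 3 m short of an object 40 m away, well aligned laterally:
gt = np.array([0.0, 40.0, 0.0])
pred = np.array([0.1, 37.0, 0.0])
print(let_match(pred, gt))  # True: 3 m longitudinal error < 10% of 40 m
```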
- Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection [58.81316192862618]
Two critical sensors for 3D perception in autonomous driving are the camera and the LiDAR.
Fusing these two modalities can significantly boost the performance of 3D perception models. We benchmark the robustness of state-of-the-art fusion methods for the first time.
arXiv Detail & Related papers (2022-05-30T09:35:37Z)
- Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception [59.2014692323323]
Small, far-away, or highly occluded objects are particularly challenging because there is limited information in the LiDAR point clouds for detecting them.
We propose a novel, end-to-end trainable Hindsight framework to extract contextual information from past data.
We show that this framework is compatible with most modern 3D detection architectures and can substantially improve their average precision on multiple autonomous driving datasets.
arXiv Detail & Related papers (2022-03-22T00:58:27Z)
- Real-Time Human Pose Estimation on a Smart Walker using Convolutional Neural Networks [4.076099054649463]
We present a novel approach to patient monitoring and data-driven human-in-the-loop control in the context of smart walkers.
It extracts a complete and compact body representation in real time from inexpensive sensors.
Despite promising results, more data should be collected on users with impairments to assess its performance as a rehabilitation tool in real-world scenarios.
arXiv Detail & Related papers (2021-06-28T14:11:48Z)
- Exploiting Playbacks in Unsupervised Domain Adaptation for 3D Object Detection [55.12894776039135]
State-of-the-art 3D object detectors, based on deep learning, have shown promising accuracy but are prone to over-fit to domain idiosyncrasies.
We propose a novel learning approach that drastically reduces this gap by fine-tuning the detector on pseudo-labels in the target domain.
We show, on five autonomous driving datasets, that fine-tuning the detector on these pseudo-labels substantially reduces the domain gap to new driving environments. (A minimal pseudo-label filtering sketch follows this entry.)
arXiv Detail & Related papers (2021-03-26T01:18:11Z)
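The core of pseudo-label fine-tuning is filtering the detector's own target-domain predictions by confidence before treating them as training labels. A minimal, self-contained sketch of that filtering step; the data structures and the 0.7 threshold are illustrative, and the paper additionally refines labels across repeated traversals ("playbacks"), which this omits:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Detection:
    box: tuple    # (x, y, z, l, w, h, yaw) in the sensor frame
    score: float  # detector confidence in [0, 1]

def make_pseudo_labels(detections: List[Detection],
                       conf_threshold: float = 0.7) -> List[Detection]:
    """Keep only confident detections as target-domain training labels;
    low-confidence boxes would inject label noise into fine-tuning."""
    return [d for d in detections if d.score >= conf_threshold]

# Example: two confident boxes survive, one uncertain box is dropped.
dets = [Detection((2, 10, 0, 4.5, 1.8, 1.5, 0.0), 0.91),
        Detection((-3, 25, 0, 4.2, 1.7, 1.4, 0.1), 0.78),
        Detection((7, 40, 0, 4.0, 1.6, 1.4, 3.1), 0.35)]
print(len(make_pseudo_labels(dets)))  # 2
```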
- Monocular Quasi-Dense 3D Object Tracking [99.51683944057191]
A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving.
We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform.
arXiv Detail & Related papers (2021-03-12T15:30:02Z)
- Recovering and Simulating Pedestrians in the Wild [81.38135735146015]
We propose to recover the shape and motion of pedestrians from sensor readings captured in the wild by a self-driving car.
We incorporate the reconstructed pedestrian assets bank in a realistic 3D simulation system.
We show that the simulated LiDAR data can be used to significantly reduce the amount of real-world data required for visual perception tasks.
arXiv Detail & Related papers (2020-11-16T17:16:32Z)
- SemanticVoxels: Sequential Fusion for 3D Pedestrian Detection using LiDAR Point Cloud and Semantic Segmentation [4.350338899049983]
We propose a generalization of PointPainting that can apply fusion at different levels. (The base PointPainting decoration step is sketched after this entry.)
We show that SemanticVoxels achieves state-of-the-art performance in both 3D and bird's eye view pedestrian detection benchmarks.
arXiv Detail & Related papers (2020-09-25T14:52:32Z)
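PointPainting, which SemanticVoxels generalizes, decorates each LiDAR point with the semantic class scores of the image pixel it projects onto. A minimal sketch of that decoration step, assuming a pinhole camera and points already transformed into the camera frame; it shows the base technique, not the authors' multi-level fusion:

```python
import numpy as np

def paint_points(points_cam: np.ndarray, seg_scores: np.ndarray,
                 K: np.ndarray) -> np.ndarray:
    """Append per-pixel semantic scores to each point that projects into
    the image. points_cam: (N, 3); seg_scores: (H, W, C); returns (M, 3+C)."""
    H, W, _ = seg_scores.shape
    z = points_cam[:, 2]
    valid = z > 0
    safe_z = np.where(valid, z, 1.0)  # avoid division by zero
    u = np.round(K[0, 0] * points_cam[:, 0] / safe_z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * points_cam[:, 1] / safe_z + K[1, 2]).astype(int)
    valid &= (u >= 0) & (u < W) & (v >= 0) & (v < H)
    return np.hstack([points_cam[valid], seg_scores[v[valid], u[valid]]])

# Toy example: 3 semantic classes, random scores, points ahead of the camera.
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
scores = np.random.rand(480, 640, 3)
pts = np.random.uniform([-3, -1, 2], [3, 1, 30], size=(500, 3))
print(paint_points(pts, scores, K).shape)  # (M, 6)
```

The painted points can then be voxelized and passed to a LiDAR detector, which is where SemanticVoxels' sequential, multi-level fusion comes in.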
- Cityscapes 3D: Dataset and Benchmark for 9 DoF Vehicle Detection [7.531596091318718]
We propose Cityscapes 3D, extending the original Cityscapes dataset with 3D bounding box annotations for all types of vehicles.
In contrast to existing datasets, our 3D annotations were labeled using stereo RGB images only and capture all nine degrees of freedom.
In addition, we complement the Cityscapes benchmark suite with 3D vehicle detection based on the new annotations, as well as metrics presented in this work. (A 9-DoF box layout is sketched after this entry.)
arXiv Detail & Related papers (2020-06-14T10:56:27Z)
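Nine degrees of freedom means three for position, three for dimensions, and three for orientation; most driving datasets fix pitch and roll to zero and annotate only yaw, i.e. seven DoF. A minimal sketch of a 9-DoF box as a data structure (field names are illustrative, not the dataset's schema):

```python
from dataclasses import dataclass

@dataclass
class Box9DoF:
    x: float       # center position (m)
    y: float
    z: float
    length: float  # box dimensions (m)
    width: float
    height: float
    yaw: float     # full 3D orientation (rad); most datasets label yaw only
    pitch: float
    roll: float

car = Box9DoF(x=4.2, y=1.1, z=12.7, length=4.3, width=1.8, height=1.5,
              yaw=0.12, pitch=0.02, roll=0.0)
print(car)
```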
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and is not responsible for any consequences of its use.