RGB-D Railway Platform Monitoring and Scene Understanding for Enhanced
Passenger Safety
- URL: http://arxiv.org/abs/2102.11730v1
- Date: Tue, 23 Feb 2021 14:44:34 GMT
- Title: RGB-D Railway Platform Monitoring and Scene Understanding for Enhanced
Passenger Safety
- Authors: Marco Wallner, Daniel Steininger, Verena Widhalm, Matthias
Schörghuber, Csaba Beleznai
- Abstract summary: This paper proposes a flexible analysis scheme to detect and track humans on a ground plane.
We consider multiple combinations within a set of RGB- and depth-based detection and tracking modalities.
Results indicate that the combined use of depth-based spatial information and learned representations yields substantially enhanced detection and tracking accuracies.
- Score: 3.4298729855744026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated monitoring and analysis of passenger movement in safety-critical
parts of transport infrastructures represent a relevant visual surveillance
task. Recent breakthroughs in visual representation learning and spatial
sensing opened up new possibilities for detecting and tracking humans and
objects within a 3D spatial context. This paper proposes a flexible analysis
scheme and a thorough evaluation of various processing pipelines to detect and
track humans on a ground plane, calibrated automatically via stereo depth and
pedestrian detection. We consider multiple combinations within a set of RGB-
and depth-based detection and tracking modalities. We exploit the modular
concepts of Meshroom [2] and demonstrate its use as a generic vision processing
pipeline and scalable evaluation framework. Furthermore, we introduce a novel
open RGB-D railway platform dataset with annotations to support research
activities in automated RGB-D surveillance. We present quantitative results for
multiple object detection and tracking for various algorithmic combinations on
our dataset. Results indicate that the combined use of depth-based spatial
information and learned representations yields substantially enhanced detection
and tracking accuracies. As demonstrated, these enhancements are especially
pronounced in adverse situations when occlusions and objects not captured by
learned representations are present.
Related papers
- InScope: A New Real-world 3D Infrastructure-side Collaborative Perception Dataset for Open Traffic Scenarios [13.821143687548494]
This paper introduces a new 3D infrastructure-side collaborative perception dataset, abbreviated as InScope.
InScope encapsulates a 20-day capture duration with 303 tracking trajectories and 187,787 3D bounding boxes annotated by experts.
arXiv Detail & Related papers (2024-07-31T13:11:14Z)
- Realtime Dynamic Gaze Target Tracking and Depth-Level Estimation [6.435984242701043]
The use of Transparent Displays (TD) in various applications, such as Heads-Up Displays (HUDs) in vehicles, is a burgeoning field poised to revolutionize user experiences.
This innovation brings forth significant challenges in realtime human-device interaction, particularly in accurately identifying and tracking a user's gaze on dynamically changing TDs.
We present a two-fold robust and efficient systematic solution for realtime gaze monitoring, comprised of: (1) a tree-based algorithm for identifying and dynamically tracking gaze targets; and (2) a multi-stream self-attention architecture to estimate the depth-level of human gaze from eye tracking data.
arXiv Detail & Related papers (2024-06-09T20:52:47Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- Recent Advances in Embedding Methods for Multi-Object Tracking: A Survey [71.10448142010422]
Multi-object tracking (MOT) aims to associate target objects across video frames in order to obtain entire moving trajectories.
Embedding methods play an essential role in object location estimation and temporal identity association in MOT.
We first conduct a comprehensive overview, with in-depth analysis, of embedding methods in MOT from seven different perspectives.
arXiv Detail & Related papers (2022-05-22T06:54:33Z)
- Comparative study of 3D object detection frameworks based on LiDAR data
and sensor fusion techniques [0.0]
The perception system plays a significant role in providing an accurate interpretation of a vehicle's environment in real-time.
Deep learning techniques transform the huge amount of data from the sensors into semantic information.
3D object detection methods, by utilizing the additional pose data from sensors such as LiDARs and stereo cameras, provide information on the size and location of the object.
arXiv Detail & Related papers (2022-02-05T09:34:58Z)
- Deep Feature Tracker: A Novel Application for Deep Convolutional Neural
Networks [0.0]
We propose a novel and unified deep learning-based approach that can learn how to track features reliably.
The proposed network, dubbed Deep-PT, consists of a tracker network, which is a convolutional neural network cross-correlation.
The network is trained using multiple datasets due to the lack of a specialized dataset for feature tracking.
arXiv Detail & Related papers (2021-07-30T23:24:29Z)
- Artificial Intelligence Enabled Traffic Monitoring System [3.085453921856008]
This article presents a novel approach to automatically monitor real-time traffic footage using deep convolutional neural networks.
The proposed system deploys several state-of-the-art deep learning algorithms to automate different traffic monitoring needs.
arXiv Detail & Related papers (2020-10-02T22:28:02Z)
- Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for
Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties.
Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates.
The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)
- Visual Tracking by TridentAlign and Context Embedding [71.60159881028432]
We propose novel TridentAlign and context embedding modules for Siamese network-based visual tracking methods.
The performance of the proposed tracker is comparable to that of state-of-the-art trackers, while the proposed tracker runs at real-time speed.
arXiv Detail & Related papers (2020-07-14T08:00:26Z)
- Benchmarking Unsupervised Object Representations for Video Sequences [111.81492107649889]
We compare the perceptual abilities of four object-centric approaches: ViMON, OP3, TBA and SCALOR.
Our results suggest that the architectures with unconstrained latent representations learn more powerful representations in terms of object detection, segmentation and tracking.
Our benchmark may provide fruitful guidance towards learning more robust object-centric video representations.
arXiv Detail & Related papers (2020-06-12T09:37:24Z)
- Training-free Monocular 3D Event Detection System for Traffic
Surveillance [93.65240041833319]
Existing event detection systems are mostly learning-based and have achieved convincing performance when a large amount of training data is available.
In real-world scenarios, collecting sufficient labeled training data is expensive and sometimes impossible.
We propose a training-free monocular 3D event detection system for traffic surveillance.
arXiv Detail & Related papers (2020-02-01T04:42:57Z)