Phase Space Reconstruction Network for Lane Intrusion Action Recognition
- URL: http://arxiv.org/abs/2102.11149v1
- Date: Mon, 22 Feb 2021 16:18:35 GMT
- Title: Phase Space Reconstruction Network for Lane Intrusion Action Recognition
- Authors: Ruiwen Zhang and Zhidong Deng and Hongsen Lin and Hongchao Lu
- Abstract summary: In this paper, we propose a novel object-level phase space reconstruction network (PSRNet) for motion time series classification.
Our PSRNet achieves a best accuracy of 98.0%, remarkably exceeding existing action recognition approaches by more than 30%.
- Score: 9.351931162958465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a complex road traffic scene, illegal lane intrusion of pedestrians or
cyclists constitutes one of the main safety challenges in autonomous driving
application. In this paper, we propose a novel object-level phase space
reconstruction network (PSRNet) for motion time series classification, aiming
to recognize lane intrusion actions occurring 150 m ahead through a monocular
camera fixed on a moving vehicle. In the PSRNet, the movement of pedestrians and
cyclists, specifically viewed as an observable object-level dynamic process,
can be reconstructed as trajectories of state vectors in a latent phase space
and further characterized by a learnable Lyapunov exponent-like classifier that
indicates discrimination in terms of average exponential divergence of state
trajectories. Additionally, to first transform video inputs into a
one-dimensional motion time series for each object, a lane-width normalization
based on visual object tracking-by-detection is presented. Extensive
experiments are conducted on the THU-IntrudBehavior dataset collected from real
urban roads. The results show that our PSRNet achieves a best accuracy of
98.0%, remarkably exceeding existing action recognition approaches by more
than 30%.
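The core idea of the abstract, reconstructing a one-dimensional motion series as state-vector trajectories in a phase space and scoring their average exponential divergence, can be sketched with a classical time-delay (Takens) embedding plus a crude Lyapunov-exponent-like statistic. This is only an illustrative classical analogue with assumed parameters (`dim`, `tau`, `horizon`); the actual PSRNet learns both the reconstruction and the divergence-based classifier end to end.

```python
import numpy as np

def delay_embed(series, dim=3, tau=2):
    """Classical time-delay (Takens) embedding: map a 1D motion series into
    a sequence of dim-dimensional state vectors. PSRNet learns a latent
    reconstruction; this is the hand-crafted classical analogue."""
    n = len(series) - (dim - 1) * tau
    return np.stack([series[i : i + n] for i in range(0, dim * tau, tau)], axis=1)

def avg_exponential_divergence(traj, horizon=5):
    """Crude Lyapunov-exponent-like statistic: for each state, find its
    nearest neighbour (excluding temporally close points) and measure how
    fast the two trajectories separate over `horizon` steps."""
    n = len(traj)
    rates = []
    for i in range(n - horizon):
        d0 = np.linalg.norm(traj[i] - traj, axis=1)
        d0[max(0, i - 3) : i + 4] = np.inf  # mask temporal neighbours
        j = int(np.argmin(d0))
        if j + horizon >= n or not np.isfinite(d0[j]) or d0[j] == 0.0:
            continue
        d1 = np.linalg.norm(traj[i + horizon] - traj[j + horizon])
        if d1 == 0.0:
            continue
        rates.append(np.log(d1 / d0[j]) / horizon)
    return float(np.mean(rates)) if rates else 0.0
```

A classifier in this spirit would threshold (or, in PSRNet's case, learn from) the divergence statistic, since intruding and non-intruding motion patterns separate in how quickly nearby state trajectories diverge.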
Related papers
- Instantaneous Perception of Moving Objects in 3D [86.38144604783207]
The perception of 3D motion of surrounding traffic participants is crucial for driving safety.
We propose to leverage local occupancy completion of object point clouds to densify the shape cue, and mitigate the impact of swimming artifacts.
Extensive experiments demonstrate superior performance compared to standard 3D motion estimation approaches.
arXiv Detail & Related papers (2024-05-05T01:07:24Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction for the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- A Memory-Augmented Multi-Task Collaborative Framework for Unsupervised Traffic Accident Detection in Driving Videos [22.553356096143734]
We propose a novel memory-augmented multi-task collaborative framework (MAMTCF) for unsupervised traffic accident detection in driving videos.
Our method can more accurately detect both ego-involved and non-ego accidents by simultaneously modeling appearance changes and object motions in video frames.
arXiv Detail & Related papers (2023-07-27T01:45:13Z)
- MotionTrack: Learning Motion Predictor for Multiple Object Tracking [68.68339102749358]
We introduce a novel motion-based tracker, MotionTrack, centered around a learnable motion predictor.
Our experimental results demonstrate that MotionTrack yields state-of-the-art performance on datasets such as DanceTrack and SportsMOT.
arXiv Detail & Related papers (2023-06-05T04:24:11Z)
- M$^2$DAR: Multi-View Multi-Scale Driver Action Recognition with Vision Transformer [5.082919518353888]
We present a multi-view, multi-scale framework for naturalistic driving action recognition and localization in untrimmed videos.
Our system features a weight-sharing, multi-scale Transformer-based action recognition network that learns robust hierarchical representations.
arXiv Detail & Related papers (2023-05-13T02:38:15Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- Real-Time Accident Detection in Traffic Surveillance Using Deep Learning [0.8808993671472349]
This paper presents a new efficient framework for accident detection at intersections for traffic surveillance applications.
The proposed framework consists of three hierarchical steps, including efficient and accurate object detection based on the state-of-the-art YOLOv4 method.
The robustness of the proposed framework is evaluated using video sequences collected from YouTube with diverse illumination conditions.
arXiv Detail & Related papers (2022-08-12T19:07:20Z)
- Traffic-Net: 3D Traffic Monitoring Using a Single Camera [1.1602089225841632]
We provide a practical platform for real-time traffic monitoring using a single CCTV traffic camera.
We adapt a custom YOLOv5 deep neural network model for vehicle/pedestrian detection and an enhanced SORT tracking algorithm.
We also develop a hierarchical traffic modelling solution based on short- and long-term temporal video data streams.
arXiv Detail & Related papers (2021-09-19T16:59:01Z)
- Real Time Monocular Vehicle Velocity Estimation using Synthetic Data [78.85123603488664]
We look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car.
We propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity.
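The two-step recipe above, an off-the-shelf tracker followed by a small network that regresses velocity from the track, can be illustrated with a minimal feature-extraction step. The feature design below (per-frame centre displacements plus log scale change of the bounding box) is a hypothetical stand-in, not the paper's actual input representation:

```python
import numpy as np

def box_motion_features(boxes):
    """Turn a short track of bounding boxes, one (x, y, w, h) row per frame,
    into a fixed-length feature vector that a small regressor could map to
    metric velocity. Illustrative feature design, not the paper's own."""
    boxes = np.asarray(boxes, dtype=float)
    cx = boxes[:, 0] + boxes[:, 2] / 2.0        # box centre x per frame
    cy = boxes[:, 1] + boxes[:, 3] / 2.0        # box centre y per frame
    scale = np.sqrt(boxes[:, 2] * boxes[:, 3])  # box size as a depth proxy
    # Per-frame displacements and relative scale change correlate with
    # image-plane motion and approach speed.
    return np.concatenate([np.diff(cx), np.diff(cy), np.diff(np.log(scale))])
```

A small MLP (or even a linear regressor) trained on synthetic data, as the paper proposes, would then map such a vector to a velocity estimate.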
arXiv Detail & Related papers (2021-09-16T13:10:27Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
- Any Motion Detector: Learning Class-agnostic Scene Dynamics from a Sequence of LiDAR Point Clouds [4.640835690336654]
We propose a novel real-time approach of temporal context aggregation for motion detection and motion parameters estimation.
We introduce an ego-motion compensation layer to achieve real-time inference with performance comparable to a naive odometric transform of the original point cloud sequence.
arXiv Detail & Related papers (2020-04-24T10:40:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.