Fusion-GRU: A Deep Learning Model for Future Bounding Box Prediction of Traffic Agents in Risky Driving Videos
- URL: http://arxiv.org/abs/2308.06628v1
- Date: Sat, 12 Aug 2023 18:35:59 GMT
- Title: Fusion-GRU: A Deep Learning Model for Future Bounding Box Prediction of Traffic Agents in Risky Driving Videos
- Authors: Muhammad Monjurul Karim, Ruwen Qin, Yinhai Wang
- Abstract summary: Fusion-Gated Recurrent Unit (Fusion-GRU) is a novel encoder-decoder architecture for future bounding box localization.
The proposed method is evaluated on two publicly available datasets, ROL and HEV-I.
- Score: 20.923004256768635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To ensure the safe and efficient navigation of autonomous vehicles and
advanced driving assistance systems in complex traffic scenarios, predicting
the future bounding boxes of surrounding traffic agents is crucial. However,
simultaneously predicting the future location and scale of target traffic
agents from the egocentric view poses challenges due to the vehicle's egomotion
causing considerable field-of-view changes. Moreover, in anomalous or risky
situations, tracking loss or abrupt motion changes limit the available
observation time, requiring learning of cues within a short time window.
Existing methods typically use a simple concatenation operation to combine
different cues, overlooking their dynamics over time. To address this, this
paper introduces the Fusion-Gated Recurrent Unit (Fusion-GRU) network, a novel
encoder-decoder architecture for future bounding box localization. Unlike
traditional GRUs, Fusion-GRU accounts for mutual and complex interactions among
input features. Moreover, an intermediary estimator coupled with a
self-attention aggregation layer is also introduced to learn sequential
dependencies for long range prediction. Finally, a GRU decoder is employed to
predict the future bounding boxes. The proposed method is evaluated on two
publicly available datasets, ROL and HEV-I. The experimental results showcase
the promising performance of the Fusion-GRU, demonstrating its effectiveness in
predicting future bounding boxes of traffic agents.
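The abstract describes a recurrence in which multiple input cues are fused through learned gating rather than plain concatenation, before a GRU decoder emits future boxes. The sketch below is a hypothetical, minimal NumPy illustration of that idea, not the authors' implementation: the fusion rule, feature names (`motion`, `visual`), dimensions, and the linear box head are all assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse(a, b, Wf):
    """Gated fusion: blend two cue vectors with a learned gate,
    instead of simply concatenating them."""
    z = sigmoid(Wf @ np.concatenate([a, b]))
    return z * a + (1.0 - z) * b

def gru_step(x, h, Wz, Wr, Wh):
    """One standard GRU update (update gate, reset gate, candidate state)."""
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                 # update gate
    r = sigmoid(Wr @ xh)                 # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h]))
    return (1.0 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
d, T = 8, 5                              # hidden size and observation window (assumed)
Wf = rng.normal(scale=0.1, size=(d, 2 * d))
Wz = rng.normal(scale=0.1, size=(d, 2 * d))
Wr = rng.normal(scale=0.1, size=(d, 2 * d))
Wh = rng.normal(scale=0.1, size=(d, 2 * d))
Wo = rng.normal(scale=0.1, size=(4, d))  # head emitting a (cx, cy, w, h) box

h = np.zeros(d)
for t in range(T):
    motion = rng.normal(size=d)          # stand-in for past-box / ego-motion cues
    visual = rng.normal(size=d)          # stand-in for appearance cues
    h = gru_step(fuse(motion, visual, Wf), h, Wz, Wr, Wh)

box = Wo @ h                             # predicted future bounding box, shape (4,)
```

In a trained model the weights would of course be learned, and the paper additionally inserts an intermediary estimator and a self-attention aggregation layer between encoder and decoder, which this toy loop omits.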
Related papers
- CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions [13.981748780317329]
Accurately and promptly predicting accidents among surrounding traffic agents from camera footage is crucial for the safety of autonomous vehicles (AVs).
This study introduces a novel accident anticipation framework for AVs, termed CRASH.
It seamlessly integrates five components: object detector, feature extractor, object-aware module, context-aware module, and multi-layer fusion.
Our model surpasses existing top baselines in critical evaluation metrics like Average Precision (AP) and mean Time-To-Accident (mTTA).
arXiv Detail & Related papers (2024-07-25T04:12:49Z)
- SocialFormer: Social Interaction Modeling with Edge-enhanced Heterogeneous Graph Transformers for Trajectory Prediction [3.733790302392792]
SocialFormer is an agent interaction-aware trajectory prediction method.
We present a temporal encoder based on gated recurrent units (GRU) to model the temporal social behavior of agent movements.
We evaluate SocialFormer for the trajectory prediction task on the popular nuScenes benchmark and achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-05-06T19:47:23Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Graph-Based Interaction-Aware Multimodal 2D Vehicle Trajectory Prediction using Diffusion Graph Convolutional Networks [17.989423104706397]
This study presents the Graph-based Interaction-aware Multi-modal Trajectory Prediction framework.
Within this framework, vehicles' motions are conceptualized as nodes in a time-varying graph, and the traffic interactions are represented by a dynamic adjacency matrix.
We employ a driving intention-specific feature fusion, enabling the adaptive integration of historical and future embeddings.
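The framework above represents vehicles as nodes of a time-varying graph with a dynamic adjacency matrix. As a hedged illustration of that representation only (the positions, distance threshold, normalisation, and weights below are invented for the sketch, not taken from the paper), one timestep might look like:

```python
import numpy as np

def dynamic_adjacency(pos, radius):
    """A[i, j] = 1 when vehicles i and j are within `radius` metres."""
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    A = (dist < radius).astype(float)
    np.fill_diagonal(A, 1.0)                 # keep self-loops
    return A

def graph_conv(X, A, W):
    """Row-normalised neighbourhood aggregation: H = D^-1 A X W."""
    D_inv = 1.0 / A.sum(axis=1, keepdims=True)
    return np.tanh(D_inv * (A @ X) @ W)

pos = np.array([[0.0, 0.0], [5.0, 0.0], [50.0, 0.0]])  # 3 vehicles (metres)
A = dynamic_adjacency(pos, radius=10.0)
X = np.eye(3)                                # toy node features
W = np.full((3, 2), 0.5)                     # toy weight matrix
H = graph_conv(X, A, W)
print(A[0, 1], A[0, 2])  # 1.0 0.0 — the first two vehicles interact, the third is too far
```

Because the positions change every frame, the adjacency matrix is recomputed per timestep, which is what makes the interaction graph "dynamic".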
arXiv Detail & Related papers (2023-09-05T06:28:13Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network [7.150832716115448]
In traffic scenes, pedestrians encountering oncoming people may turn suddenly or stop immediately.
Predicting such abrupt trajectories requires insight into the interactions between pedestrians.
We present a novel generative method named Spatial Interaction Transformer (SIT), which learns the correlation of pedestrian trajectories through attention mechanisms.
arXiv Detail & Related papers (2021-12-13T13:08:04Z)
- Decoder Fusion RNN: Context and Interaction Aware Decoders for Trajectory Prediction [53.473846742702854]
We propose a recurrent, attention-based approach for motion forecasting.
Decoder Fusion RNN (DF-RNN) is composed of a recurrent behavior encoder, an inter-agent multi-headed attention module, and a context-aware decoder.
We demonstrate the efficacy of our method by testing it on the Argoverse motion forecasting dataset and show its state-of-the-art performance on the public benchmark.
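DF-RNN's inter-agent multi-headed attention lets each agent's encoded behaviour attend to every other agent's. A minimal sketch of that attention pattern, under stated assumptions (identity Q/K/V projections per head, invented agent count and embedding size; real implementations use learned projection matrices):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, n_heads):
    """Self-attention across agents. Each head works on a slice of the
    embedding; Q = K = V = slice here, purely for brevity."""
    n, d = X.shape
    dh = d // n_heads
    heads = []
    for h in range(n_heads):
        Q = K = V = X[:, h * dh:(h + 1) * dh]
        scores = softmax(Q @ K.T / np.sqrt(dh))  # (n, n) attention weights
        heads.append(scores @ V)                 # weighted mix of other agents
    return np.concatenate(heads, axis=1)

rng = np.random.default_rng(1)
agents = rng.normal(size=(4, 8))   # 4 agents, 8-dim encoded behaviour (assumed)
out = multi_head_attention(agents, n_heads=2)
print(out.shape)  # (4, 8)
```

Each output row is a convex combination of the other agents' features, which is how interaction information flows between agents before decoding.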
arXiv Detail & Related papers (2021-08-12T15:53:37Z)
- End-to-end Contextual Perception and Prediction with Interaction Transformer [79.14001602890417]
We tackle the problem of detecting objects in 3D and forecasting their future motion in the context of self-driving.
To capture their spatial-temporal dependencies, we propose a recurrent neural network with a novel Transformer architecture.
Our model can be trained end-to-end, and runs in real-time.
arXiv Detail & Related papers (2020-08-13T14:30:12Z)
- Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z)
- Probabilistic Crowd GAN: Multimodal Pedestrian Trajectory Prediction using a Graph Vehicle-Pedestrian Attention Network [12.070251470948772]
We show how Probabilistic Crowd GAN can output probabilistic multimodal predictions.
We also propose the use of Graph Vehicle-Pedestrian Attention Network (GVAT), which models social interactions.
We demonstrate improvements on the existing state of the art methods for trajectory prediction and illustrate how the true multimodal and uncertain nature of crowd interactions can be directly modelled.
arXiv Detail & Related papers (2020-06-23T11:25:16Z)
- A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to extract the most useful and important information.
Second, we build a joint feature sequence from sequential and instantaneous state information so that the generated trajectories maintain spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.