Towards Accurate Vehicle Behaviour Classification With Multi-Relational
Graph Convolutional Networks
- URL: http://arxiv.org/abs/2002.00786v3
- Date: Tue, 12 May 2020 17:49:11 GMT
- Title: Towards Accurate Vehicle Behaviour Classification With Multi-Relational
Graph Convolutional Networks
- Authors: Sravan Mylavarapu, Mahtab Sandhu, Priyesh Vijayan, K Madhava Krishna,
Balaraman Ravindran, Anoop Namboodiri
- Abstract summary: We propose a pipeline for understanding vehicle behaviour from a monocular image sequence or video.
Spatial information about the scene is encoded per frame by a Multi-Relational Graph Convolutional Network (MR-GCN), and the temporal sequence of these encodings is fed to a recurrent network to label vehicle behaviours.
The proposed framework classifies a variety of vehicle behaviours with high fidelity on diverse datasets.
- Score: 22.022759283770377
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding on-road vehicle behaviour from a temporal sequence of sensor
data is gaining in popularity. In this paper, we propose a pipeline for
understanding vehicle behaviour from a monocular image sequence or video. The
monocular sequence, along with scene semantics, optical flow and object labels,
is used to obtain spatial information about the object (vehicle) of interest and
other objects (semantically contiguous sets of locations) in the scene. This
spatial information is encoded by a Multi-Relational Graph Convolutional
Network (MR-GCN), and a temporal sequence of such encodings is fed to a
recurrent network to label vehicle behaviours. The proposed framework can
classify a variety of vehicle behaviours with high fidelity on diverse datasets
that include European, Chinese and Indian on-road scenes. The framework also
allows seamless transfer of models across datasets without re-annotation,
retraining or even fine-tuning. We show a comparative performance gain over
baseline spatio-temporal classifiers and detail a variety of ablations to
showcase the efficacy of the framework.
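The pipeline above (per-frame multi-relational graph convolution followed by a recurrent network over the encodings) can be sketched in NumPy. Everything in this sketch is an illustrative assumption rather than the authors' implementation: the feature sizes, the number of relations, the random adjacencies, the mean-pooling of node encodings, and the plain tanh recurrent cell (the paper's recurrent network is only described at the architecture level) are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: nodes, input feats, hidden dim, relations, frames, classes.
N, F, H_DIM, R, T, C = 5, 8, 16, 3, 4, 4

# One weight matrix per relation type, as in a multi-relational GCN.
W_rel = [rng.standard_normal((F, H_DIM)) * 0.1 for _ in range(R)]

def mr_gcn_layer(H, adjs, weights):
    """Multi-relational graph convolution: sum the relation-specific
    propagations A_r @ H @ W_r over all relations, then apply a ReLU."""
    out = sum(A @ H @ W for A, W in zip(adjs, weights))
    return np.maximum(out, 0.0)

# Parameters for a minimal tanh recurrent cell and a linear classifier.
W_hh = rng.standard_normal((H_DIM, H_DIM)) * 0.1
W_xh = rng.standard_normal((H_DIM, H_DIM)) * 0.1
W_out = rng.standard_normal((H_DIM, C)) * 0.1

h = np.zeros(H_DIM)
for t in range(T):
    X_t = rng.standard_normal((N, F))                 # node features, frame t
    adjs = [(rng.random((N, N)) < 0.3).astype(float)  # one adjacency
            for _ in range(R)]                        # matrix per relation
    enc = mr_gcn_layer(X_t, adjs, W_rel)              # (N, H_DIM) encoding
    pooled = enc.mean(axis=0)                         # frame-level summary
    h = np.tanh(h @ W_hh + pooled @ W_xh)             # recurrent update

logits = h @ W_out
probs = np.exp(logits - logits.max())
probs /= probs.sum()                                  # behaviour-class probs
print(probs.shape)
```

A trained version would learn `W_rel`, `W_hh`, `W_xh` and `W_out` by backpropagating a classification loss through the whole unrolled sequence; the sketch only shows the forward data flow from per-frame graphs to a behaviour label distribution.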
Related papers
- Traffic Reconstruction and Analysis of Natural Driving Behaviors at
Unsignalized Intersections [1.7273380623090846]
This research involved recording traffic at various unsignalized intersections in Memphis, TN, during different times of the day.
After manually labeling video data to capture specific variables, we reconstructed traffic scenarios in the SUMO simulation environment.
The output data from these simulations offered a comprehensive analysis, including time-space diagrams for vehicle movement, travel time frequency distributions, and speed-position plots to identify bottleneck points.
arXiv Detail & Related papers (2023-12-22T09:38:06Z)
- Graph Convolutional Networks for Complex Traffic Scenario Classification [0.7919810878571297]
A scenario-based testing approach can reduce the time required to obtain statistically significant evidence of the safety of Automated Driving Systems.
Most methods on scenario classification do not work for complex scenarios with diverse environments.
We propose a method for complex traffic scenario classification that is able to model the interaction of a vehicle with the environment.
arXiv Detail & Related papers (2023-10-26T20:51:24Z)
- Temporal Embeddings: Scalable Self-Supervised Temporal Representation Learning from Spatiotemporal Data for Multimodal Computer Vision [1.4127889233510498]
A novel approach is proposed to stratify the landscape based on mobility activity time series.
The pixel-wise embeddings are converted to image-like channels that can be used for task-based, multimodal modeling.
arXiv Detail & Related papers (2023-10-16T02:53:29Z)
- Traffic Scene Parsing through the TSP6K Dataset [109.69836680564616]
We introduce a specialized traffic monitoring dataset, termed TSP6K, with high-quality pixel-level and instance-level annotations.
The dataset captures more crowded traffic scenes with several times more traffic participants than the existing driving scenes.
We propose a detail refining decoder for scene parsing, which recovers the details of different semantic regions in traffic scenes.
arXiv Detail & Related papers (2023-03-06T02:05:14Z)
- Self Supervised Clustering of Traffic Scenes using Graph Representations [2.658812114255374]
We present a data-driven method to cluster traffic scenes that is self-supervised, i.e. without manual labelling.
We leverage the semantic scene graph model to create a generic graph embedding of the traffic scene, which is then mapped to a low-dimensional embedding space using a Siamese network.
In the training process of our novel approach, we augment existing traffic scenes in the Cartesian space to generate positive similarity samples.
arXiv Detail & Related papers (2022-11-24T22:52:55Z)
- Wide and Narrow: Video Prediction from Context and Motion [54.21624227408727]
We propose a new framework to integrate these complementary attributes to predict complex pixel dynamics through deep networks.
We present global context propagation networks that aggregate the non-local neighboring representations to preserve the contextual information over the past frames.
We also devise local filter memory networks that generate adaptive filter kernels by storing the motion of moving objects in the memory.
arXiv Detail & Related papers (2021-10-22T04:35:58Z)
- Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representations by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Understanding Dynamic Scenes using Graph Convolution Networks [22.022759283770377]
We present a novel framework to model on-road vehicle behaviors from a sequence of temporally ordered frames as grabbed by a moving camera.
We show a seamless transfer of learning to multiple datasets without resorting to fine-tuning.
Such behavior prediction methods find immediate relevance in a variety of navigation tasks.
arXiv Detail & Related papers (2020-05-09T13:05:06Z)
- VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation [74.56282712099274]
This paper introduces VectorNet, a hierarchical graph neural network that exploits the spatial locality of individual road components represented by vectors.
By operating on the vectorized high definition (HD) maps and agent trajectories, we avoid lossy rendering and computationally intensive ConvNet encoding steps.
We evaluate VectorNet on our in-house behavior prediction benchmark and the recently released Argoverse forecasting dataset.
arXiv Detail & Related papers (2020-05-08T19:07:03Z)
- Parsing-based View-aware Embedding Network for Vehicle Re-Identification [138.11983486734576]
We propose a parsing-based view-aware embedding network (PVEN) to achieve the view-aware feature alignment and enhancement for vehicle ReID.
The experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-04-10T13:06:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.