Explainable Action Prediction through Self-Supervision on Scene Graphs
- URL: http://arxiv.org/abs/2302.03477v1
- Date: Tue, 7 Feb 2023 14:05:02 GMT
- Title: Explainable Action Prediction through Self-Supervision on Scene Graphs
- Authors: Pawit Kochakarn, Daniele De Martini, Daniel Omeiza, Lars Kunze
- Abstract summary: This work explores scene graphs as a distilled representation of high-level information for autonomous driving.
We propose a self-supervision pipeline to infer representative and well-separated embeddings.
We evaluate our system on the ROAD dataset against a fully-supervised approach, showing the superiority of our training regime.
- Score: 8.028093810066247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work explores scene graphs as a distilled representation of high-level
information for autonomous driving, applied to future driver-action prediction.
Given the scarcity and strong imbalance of data samples, we propose a
self-supervision pipeline to infer representative and well-separated
embeddings. Key aspects are interpretability and explainability; as such, we
embed in our architecture attention mechanisms that can create spatial and
temporal heatmaps on the scene graphs. We evaluate our system on the ROAD
dataset against a fully-supervised approach, showing the superiority of our
training regime.
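The abstract describes attention mechanisms that produce spatial heatmaps over scene-graph nodes. As a rough illustration only (not the paper's actual architecture; the function name, toy embeddings, and query vector are all hypothetical), a scaled dot-product attention over node embeddings yields exactly such a per-node heatmap:

```python
import math

def spatial_attention_heatmap(node_embeddings, query):
    """Score each scene-graph node against a query vector and
    normalise with softmax; the weights act as a spatial heatmap."""
    d = len(query)
    # Scaled dot-product score per node
    scores = [sum(e * q for e, q in zip(emb, query)) / math.sqrt(d)
              for emb in node_embeddings]
    # Numerically stable softmax
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy scene graph: 3 node embeddings (e.g. ego, pedestrian, traffic light)
nodes = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [0.0, 1.0]
heatmap = spatial_attention_heatmap(nodes, query)
```

The weights sum to one and can be rendered directly onto the graph; repeating the computation per time step gives the temporal counterpart mentioned in the abstract.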
Related papers
- Enhancing End-to-End Autonomous Driving with Latent World Model [78.22157677787239]
We propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels.
Our framework, LAW, uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame.
As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.
arXiv Detail & Related papers (2024-06-12T17:59:21Z) - JRDB-Traj: A Dataset and Benchmark for Trajectory Forecasting in Crowds [79.00975648564483]
Trajectory forecasting models, employed in fields such as robotics, autonomous vehicles, and navigation, face challenges in real-world scenarios.
This dataset provides comprehensive data, including the locations of all agents, scene images, and point clouds, all from the robot's perspective.
The objective is to predict the future positions of agents relative to the robot using raw sensory input data.
arXiv Detail & Related papers (2023-11-05T18:59:31Z) - Vehicle Motion Forecasting using Prior Information and Semantic-assisted Occupancy Grid Maps [6.99274104609965]
Motion forecasting is a challenging task for autonomous vehicles due to uncertainty in the sensor data, the non-deterministic nature of the future, and complex agent behavior.
In this paper, we tackle this problem by representing the scene as dynamic occupancy grid maps (DOGMs).
We propose a novel framework that combines deep-temporal and probabilistic approaches to predict vehicle behaviors.
arXiv Detail & Related papers (2023-08-08T14:49:44Z) - LOPR: Latent Occupancy PRediction using Generative Models [49.15687400958916]
LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye-view scene representation.
We propose a framework that decouples occupancy prediction into two stages: representation learning and prediction within the learned latent space.
arXiv Detail & Related papers (2022-10-03T22:04:00Z) - Conditioned Human Trajectory Prediction using Iterative Attention Blocks [70.36888514074022]
We present a simple yet effective model for predicting pedestrian positions in urban-like environments.
Our model is a neural-based architecture that can run several layers of attention blocks and transformers in an iterative sequential fashion.
We show that, without explicitly introducing social masks, dynamical models, social pooling layers, or complicated graph-like structures, it is possible to produce results on par with SoTA models.
arXiv Detail & Related papers (2022-06-29T07:49:48Z) - StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving [14.281088967734098]
We introduce a motion forecasting (behavior prediction) method that meets the latency requirements for autonomous driving in dense urban environments without sacrificing accuracy.
A whole-scene sparse input representation allows StopNet to scale to predicting trajectories for hundreds of road agents with reliable latency.
In addition to predicting trajectories, our scene encoder lends itself to predicting whole-scene probabilistic occupancy grids.
arXiv Detail & Related papers (2022-06-02T11:22:27Z) - Importance is in your attention: agent importance prediction for autonomous driving [4.176937532441124]
Trajectory prediction is an important task in autonomous driving.
We show that attention information can also be used to measure the importance of each agent with respect to the ego vehicle's future planned trajectory.
arXiv Detail & Related papers (2022-04-19T20:34:30Z) - Self-Supervised Action-Space Prediction for Automated Driving [0.0]
We present a novel learned multi-modal trajectory prediction architecture for automated driving.
It achieves kinematically feasible predictions by casting the learning problem into the space of accelerations and steering angles.
The proposed methods are evaluated on real-world datasets containing urban intersections and roundabouts.
arXiv Detail & Related papers (2021-09-21T08:27:56Z) - Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z) - Implicit Latent Variable Model for Scene-Consistent Motion Forecasting [78.74510891099395]
In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data.
We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene.
arXiv Detail & Related papers (2020-07-23T14:31:25Z) - Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network [29.289670231364788]
In this paper, we propose a generic generative neural system for multi-agent trajectory prediction.
We also employ an efficient kinematic constraint layer applied to vehicle trajectory prediction.
The proposed system is evaluated on three public benchmark datasets for trajectory prediction.
arXiv Detail & Related papers (2020-02-14T20:11:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.