Multimodal Future Localization and Emergence Prediction for Objects in
Egocentric View with a Reachability Prior
- URL: http://arxiv.org/abs/2006.04700v1
- Date: Mon, 8 Jun 2020 15:57:26 GMT
- Title: Multimodal Future Localization and Emergence Prediction for Objects in
Egocentric View with a Reachability Prior
- Authors: Osama Makansi, Özgün Cicek, Kevin Buchicchio, Thomas Brox
- Abstract summary: We investigate the problem of anticipating future dynamics, particularly the future location of other vehicles and pedestrians, in the view of a moving vehicle.
We estimate a reachability prior for certain classes of objects from the semantic map of the present image and propagate it into the future using the planned egomotion.
Experiments show that the reachability prior combined with multi-hypotheses learning improves multimodal prediction of the future location of tracked objects and, for the first time, the emergence of new objects.
- Score: 36.80686175878314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the problem of anticipating future dynamics,
particularly the future location of other vehicles and pedestrians, in the view
of a moving vehicle. We approach two fundamental challenges: (1) the partial
visibility due to the egocentric view with a single RGB camera and considerable
field-of-view change due to the egomotion of the vehicle; (2) the multimodality
of the distribution of future states. In contrast to many previous works, we do
not assume structural knowledge from maps. We rather estimate a reachability
prior for certain classes of objects from the semantic map of the present image
and propagate it into the future using the planned egomotion. Experiments show
that the reachability prior combined with multi-hypotheses learning improves
multimodal prediction of the future location of tracked objects and, for the
first time, the emergence of new objects. We also demonstrate promising
zero-shot transfer to unseen datasets. Source code is available at
https://github.com/lmb-freiburg/FLN-EPN-RPN.
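The abstract names two ingredients: a reachability prior that is propagated into the future with the planned egomotion, and multi-hypothesis learning for a multimodal future distribution. Below is a minimal sketch, assuming PyTorch, of what those two pieces could look like; the function names, tensor shapes, and the plain winner-takes-all loss are illustrative assumptions, not the authors' released code (see the repository linked above for that).

```python
import torch
import torch.nn.functional as F


def propagate_prior(prior, egomotion_grid):
    """Warp a per-pixel reachability prior (B, 1, H, W) into the future frame.

    egomotion_grid is a (B, H, W, 2) sampling grid in [-1, 1] coordinates
    derived from the planned egomotion, so the prior stays aligned with the
    camera after it has moved.
    """
    return F.grid_sample(prior, egomotion_grid, align_corners=False)


def wta_loss(hypotheses, target):
    """Winner-takes-all loss over K future-location hypotheses.

    hypotheses: (B, K, 2) predicted future (x, y) positions
    target:     (B, 2)    observed future position
    Only the hypothesis closest to the target is penalized, which lets the
    K outputs spread over the modes of a multimodal future distribution.
    """
    errors = torch.norm(hypotheses - target.unsqueeze(1), dim=-1)  # (B, K)
    return errors.min(dim=1).values.mean()
```

The warped prior could, for instance, be fed to the hypothesis network as an extra input channel or used to re-weight hypotheses; the linked repository contains the actual FLN-EPN-RPN implementation.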
Related papers
- Seeing the Future, Perceiving the Future: A Unified Driving World Model for Future Generation and Perception [47.65526944865586]
We present UniFuture, a driving world model that seamlessly integrates future scene generation and perception within a single framework.
Our approach jointly models future appearance (i.e., RGB image) and geometry (i.e., depth), ensuring coherent predictions.
arXiv Detail & Related papers (2025-03-17T17:59:50Z)
- Multi-Vehicle Trajectory Prediction at Intersections using State and Intention Information [50.40632021583213]
Traditional approaches to predicting the future trajectories of road agents rely on knowledge of their past trajectories.
This work instead relies on knowledge of each vehicle's current state and intended direction to make predictions for multiple vehicles at intersections.
Message passing of this information between the vehicles provides each of them with a more holistic overview of the environment (a rough sketch of such message passing appears after this list).
arXiv Detail & Related papers (2023-01-06T15:13:23Z)
- ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning [132.20119288212376]
We propose a spatial-temporal feature learning scheme towards a set of more representative features for perception, prediction and planning tasks simultaneously.
To the best of our knowledge, we are the first to systematically investigate each part of an interpretable end-to-end vision-based autonomous driving system.
arXiv Detail & Related papers (2022-07-15T16:57:43Z)
- Predicting Future Occupancy Grids in Dynamic Environment with Spatio-Temporal Learning [63.25627328308978]
We propose a spatio-temporal prediction network pipeline to generate future occupancy predictions.
Compared to current SOTA, our approach predicts occupancy for a longer horizon of 3 seconds.
We publicly release our grid occupancy dataset based on nuScenes to support further research.
arXiv Detail & Related papers (2022-05-06T13:45:32Z)
- Learning Future Object Prediction with a Spatiotemporal Detection Transformer [1.1543275835002982]
We train a detection transformer to directly output future objects.
We extend existing transformers in two ways to capture scene dynamics.
Our final approach learns to capture the dynamics and make predictions on par with an oracle for 100 ms prediction horizons.
arXiv Detail & Related papers (2022-04-21T17:58:36Z)
- Vision-Guided Forecasting -- Visual Context for Multi-Horizon Time Series Forecasting [0.6947442090579469]
We tackle multi-horizon forecasting of vehicle states by fusing the two modalities.
We design and experiment with 3D convolutions for visual feature extraction and 1D convolutions for extracting features from speed and steering-angle traces (a rough sketch of such a fusion appears after this list).
We show that we are able to forecast a vehicle's state to various horizons, while outperforming the current state-of-the-art results on the related task of driving state estimation.
arXiv Detail & Related papers (2021-07-27T08:52:40Z)
- FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras [33.08698074581615]
We present FIERY: a probabilistic future prediction model in bird's-eye view from monocular cameras.
Our approach combines the perception, sensor fusion and prediction components of a traditional autonomous driving stack.
We show that our model outperforms previous prediction baselines on the NuScenes and Lyft datasets.
arXiv Detail & Related papers (2021-04-21T12:21:40Z)
- Learning to Anticipate Egocentric Actions by Imagination [60.21323541219304]
We study the egocentric action anticipation task, which predicts a future action seconds before it is performed in egocentric videos.
Our method significantly outperforms previous methods on both the seen test set and the unseen test set of the EPIC Kitchens Action Anticipation Challenge.
arXiv Detail & Related papers (2021-01-13T08:04:10Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships between different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
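For the multi-vehicle intersection entry above, passing state and intention information between vehicles could be realized roughly as follows. This is a hedged sketch, assuming PyTorch; the module name, feature sizes, and mean aggregation over the other vehicles are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn


class VehicleMessagePassing(nn.Module):
    """Illustrative one-round message passing over N vehicles at an intersection."""

    def __init__(self, state_dim=4, intention_dim=3, hidden=64):
        super().__init__()
        # encode each vehicle's current state (e.g. position, speed) and
        # intended direction (e.g. left / straight / right) into a node feature
        self.encode = nn.Linear(state_dim + intention_dim, hidden)
        # combine a vehicle's own feature with the aggregated messages of the others
        self.update = nn.Linear(2 * hidden, hidden)
        self.predict = nn.Linear(hidden, 2)  # e.g. next (x, y) displacement

    def forward(self, states, intentions):
        # states: (N, state_dim), intentions: (N, intention_dim) for N vehicles
        h = torch.relu(self.encode(torch.cat([states, intentions], dim=-1)))
        # each vehicle receives the mean feature of all the other vehicles
        total = h.sum(dim=0, keepdim=True)
        others = (total - h) / max(h.shape[0] - 1, 1)
        h = torch.relu(self.update(torch.cat([h, others], dim=-1)))
        return self.predict(h)  # (N, 2) per-vehicle prediction
```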
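Likewise, the Vision-Guided Forecasting entry describes fusing 3D convolutions over video frames with 1D convolutions over speed and steering-angle traces. The sketch below shows one way such a fusion could be wired up, again assuming PyTorch; the module name, layer sizes, and prediction head are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn


class VisionStateFusion(nn.Module):
    """Illustrative fusion of a video clip with speed/steering traces."""

    def __init__(self, horizons=5):
        super().__init__()
        # 3D convolutions over (channels, time, height, width) video input
        self.visual = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        # 1D convolutions over the (speed, steering-angle) time series
        self.traces = nn.Sequential(
            nn.Conv1d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        # joint head: predict (speed, steering angle) for each future horizon
        self.head = nn.Linear(32 + 32, 2 * horizons)

    def forward(self, clip, traces):
        # clip:   (B, 3, T, H, W) RGB frames
        # traces: (B, 2, T) speed and steering-angle history
        v = self.visual(clip).flatten(1)    # (B, 32)
        s = self.traces(traces).flatten(1)  # (B, 32)
        return self.head(torch.cat([v, s], dim=1))  # (B, 2 * horizons)
```

A call such as VisionStateFusion()(clip, traces), with clip of shape (B, 3, T, H, W) and traces of shape (B, 2, T), would return one speed and steering prediction per future horizon.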