Deep Learning for Vision-based Prediction: A Survey
- URL: http://arxiv.org/abs/2007.00095v2
- Date: Wed, 22 Jul 2020 15:33:06 GMT
- Title: Deep Learning for Vision-based Prediction: A Survey
- Authors: Amir Rasouli
- Abstract summary: Vision-based prediction algorithms have a wide range of applications, including autonomous driving, surveillance, human-robot interaction, and weather prediction.
This paper provides an overview of the field in the past five years with a particular focus on deep learning approaches.
- Score: 6.840474688871695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-based prediction algorithms have a wide range of applications
including autonomous driving, surveillance, human-robot interaction, and weather
prediction. The objective of this paper is to provide an overview of the field
in the past five years with a particular focus on deep learning approaches. For
this purpose, we categorize these algorithms into video prediction, action
prediction, trajectory prediction, body motion prediction, and other prediction
applications. For each category, we highlight the common architectures,
training methods and types of data used. In addition, we discuss the common
evaluation metrics and datasets used for vision-based prediction tasks. A
database of all the information presented in this survey,
cross-referenced according to papers, datasets and metrics, can be found online
at https://github.com/aras62/vision-based-prediction.
Related papers
- Human Action Anticipation: A Survey [86.415721659234]
The literature on behavior prediction spans various tasks, including action anticipation, activity forecasting, intent prediction, goal prediction, and so on.
Our survey aims to tie together this fragmented literature, covering recent technical innovations as well as the development of new large-scale datasets for model training and evaluation.
arXiv Detail & Related papers (2024-10-17T21:37:40Z)
- A Control-Centric Benchmark for Video Prediction [69.22614362800692]
We propose a benchmark for action-conditioned video prediction in the form of a control benchmark.
Our benchmark includes simulated environments with 11 task categories and 310 task instance definitions.
We then leverage our benchmark to study the effects of scaling model size, quantity of training data, and model ensembling.
arXiv Detail & Related papers (2023-04-26T17:59:45Z)
- Multi-Vehicle Trajectory Prediction at Intersections using State and Intention Information [50.40632021583213]
Traditional approaches to predicting the future trajectories of road agents rely on knowledge of their past trajectories.
This work instead relies on having knowledge of the current state and intended direction to make predictions for multiple vehicles at intersections.
Message passing of this information between the vehicles provides each one of them a more holistic overview of the environment.
arXiv Detail & Related papers (2023-01-06T15:13:23Z)
- 3D Human Motion Prediction: A Survey [23.605334184939164]
3D human motion prediction, predicting future poses from a given sequence, is an issue of great significance and challenge in computer vision and machine intelligence.
A comprehensive survey on 3D human motion prediction is conducted to review and analyze relevant works from the existing literature.
arXiv Detail & Related papers (2022-03-03T09:46:43Z)
- SLPC: a VRNN-based approach for stochastic lidar prediction and completion in autonomous driving [63.87272273293804]
We propose a new LiDAR prediction framework based on generative models, namely Variational Recurrent Neural Networks (VRNNs).
Our algorithm is able to address the limitations of previous video prediction frameworks when dealing with sparse data by spatially inpainting the depth maps in the upcoming frames.
We present a sparse version of VRNNs and an effective self-supervised training method that does not require any labels.
arXiv Detail & Related papers (2021-02-19T11:56:44Z)
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- A Review on Deep Learning Techniques for Video Prediction [3.203688549673373]
The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems.
Deep learning-based video prediction has emerged as a promising research direction.
arXiv Detail & Related papers (2020-04-10T19:58:44Z)
- Deep Learning for Content-based Personalized Viewport Prediction of 360-Degree VR Videos [72.08072170033054]
In this paper, a deep learning network is introduced to leverage position data as well as video frame content to predict future head movement.
To optimize the data input to this neural network, the data sample rate, reduced data, and long-period prediction length are also explored.
arXiv Detail & Related papers (2020-03-01T07:31:50Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework that uncovers relationships among different objects in the scene in order to reason about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not crossing the street, is a crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)