Panoptic Segmentation Forecasting
- URL: http://arxiv.org/abs/2104.03962v1
- Date: Thu, 8 Apr 2021 17:59:16 GMT
- Title: Panoptic Segmentation Forecasting
- Authors: Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, and Alexander Schwing
- Abstract summary: Our goal is to forecast the near future given a set of recent observations.
We think this ability to forecast, i.e., to anticipate, is integral for the success of autonomous agents.
We develop a two-component model: one component learns the dynamics of the background stuff by anticipating odometry, the other one anticipates the dynamics of detected things.
- Score: 71.75275164959953
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Our goal is to forecast the near future given a set of recent observations.
We think this ability to forecast, i.e., to anticipate, is integral for the
success of autonomous agents, which must not only passively analyze an
observation but also react to it in real time. Importantly, accurate
forecasting hinges upon the chosen scene decomposition. We think that superior
forecasting can be achieved by decomposing a dynamic scene into individual
'things' and background 'stuff'. Background 'stuff' largely moves because of
camera motion, while foreground 'things' move because of both camera and
individual object motion. Following this decomposition, we introduce panoptic
segmentation forecasting. Panoptic segmentation forecasting opens up a
middle-ground between existing extremes, which either forecast instance
trajectories or predict the appearance of future image frames. To address this
task we develop a two-component model: one component learns the dynamics of the
background stuff by anticipating odometry, the other one anticipates the
dynamics of detected things. We establish a leaderboard for this novel task,
and validate a state-of-the-art model that outperforms available baselines.
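The 'things' versus 'stuff' decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' model: integer pixel translations stand in for both the anticipated odometry and the learned object dynamics, and every name here (`shift`, `forecast_panoptic`, `camera_shift`, `thing_motion`, the label values) is a hypothetical chosen for illustration. Background labels move with the camera only, each detected thing moves with the camera plus its own motion, and the moved things are pasted over the moved background to form the forecast panoptic map.

```python
from typing import Dict, List, Tuple

Grid = List[List[int]]
VOID = 0  # hypothetical label for pixels that receive no forecast


def shift(grid: Grid, dx: int, dy: int, fill: int = VOID) -> Grid:
    """Translate a 2D label grid by (dx, dy) pixels, filling uncovered cells."""
    h, w = len(grid), len(grid[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = grid[y][x]
    return out


def forecast_panoptic(
    stuff: Grid,
    things: Dict[int, Grid],
    camera_shift: Tuple[int, int],
    thing_motion: Dict[int, Tuple[int, int]],
) -> Grid:
    """Toy forecast: stuff follows the camera, things follow camera + own motion."""
    cdx, cdy = camera_shift
    # Background 'stuff' moves only because of (assumed) camera motion.
    panoptic = shift(stuff, cdx, cdy)
    for inst_id, mask in things.items():
        # Foreground 'things' combine camera motion with per-instance motion.
        odx, ody = thing_motion.get(inst_id, (0, 0))
        moved = shift(mask, cdx + odx, cdy + ody)
        for y, row in enumerate(moved):
            for x, on in enumerate(row):
                if on:
                    panoptic[y][x] = inst_id  # things are drawn over stuff
    return panoptic


# Two stuff classes (1 and 2) and one thing instance with id 10.
stuff = [[1, 1, 2, 2],
         [1, 1, 2, 2]]
things = {10: [[1, 0, 0, 0],
               [0, 0, 0, 0]]}
result = forecast_panoptic(stuff, things, camera_shift=(1, 0),
                           thing_motion={10: (1, 0)})
print(result)  # [[0, 1, 10, 2], [0, 1, 1, 2]]
```

In this example the background shifts one pixel to the right while the thing shifts two, which is the point of forecasting the two components separately.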
Related papers
- PREF: Predictability Regularized Neural Motion Fields [68.60019434498703]
Knowing 3D motions in a dynamic scene is essential to many vision applications.
We leverage a neural motion field for estimating the motion of all points in a multiview setting.
We propose to regularize the estimated motion to be predictable.
arXiv Detail & Related papers (2022-09-21T22:32:37Z)
- Unified Recurrence Modeling for Video Action Anticipation [16.240254363118016]
We propose unified recurrence modeling for video action anticipation via a message passing framework.
Our proposed method outperforms previous works on the large-scale EPIC-Kitchen dataset.
arXiv Detail & Related papers (2022-06-02T12:16:44Z)
- Learning Future Object Prediction with a Spatiotemporal Detection Transformer [1.1543275835002982]
We train a detection transformer to directly output future objects.
We extend existing transformers in two ways to capture scene dynamics.
Our final approach learns to capture the dynamics and make predictions on par with an oracle for 100 ms prediction horizons.
arXiv Detail & Related papers (2022-04-21T17:58:36Z)
- Joint Forecasting of Panoptic Segmentations with Difference Attention [72.03470153917189]
We study a new panoptic segmentation forecasting model that jointly forecasts all object instances in a scene.
We evaluate the proposed model on the Cityscapes and AIODrive datasets.
arXiv Detail & Related papers (2022-04-14T17:59:32Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study on various pose representations with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- RAIN: Reinforced Hybrid Attention Inference Network for Motion Forecasting [34.54878390622877]
We propose a generic motion forecasting framework with dynamic key information selection and ranking based on a hybrid attention mechanism.
The framework is instantiated to handle multi-agent trajectory prediction and human motion forecasting tasks.
We validate the framework on both synthetic simulations and motion forecasting benchmarks in different domains.
arXiv Detail & Related papers (2021-08-03T06:30:30Z)
- Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z)
- Motion Segmentation using Frequency Domain Transformer Networks [29.998917158604694]
We propose a novel end-to-end learnable architecture that predicts the next frame by modeling foreground and background separately.
Our approach can outperform some widely used video prediction methods like Video Ladder Network and Predictive Gated Pyramids on synthetic data.
arXiv Detail & Related papers (2020-04-18T15:05:11Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework that uncovers relationships between different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)