Distilling Knowledge for Short-to-Long Term Trajectory Prediction
- URL: http://arxiv.org/abs/2305.08553v4
- Date: Tue, 3 Sep 2024 14:56:48 GMT
- Title: Distilling Knowledge for Short-to-Long Term Trajectory Prediction
- Authors: Sourav Das, Guglielmo Camporese, Shaokang Cheng, Lamberto Ballan,
- Abstract summary: Long-term trajectory forecasting is an important problem in the fields of computer vision, machine learning, and robotics.
We propose Di-Long, a new method that distills a short-term trajectory forecaster (teacher) to guide a student network for long-term trajectory prediction.
Our experiments show that our proposed Di-Long method is effective for long-term forecasting and achieves state-of-the-art performance.
- Score: 9.626916225081613
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Long-term trajectory forecasting is an important and challenging problem in the fields of computer vision, machine learning, and robotics. One fundamental difficulty lies in the evolution of the trajectory, which becomes more and more uncertain and unpredictable as the time horizon grows, subsequently increasing the complexity of the problem. To overcome this issue, in this paper, we propose Di-Long, a new method that employs the distillation of a short-term trajectory forecaster to guide a student network for long-term trajectory prediction during the training process. Given a total sequence length that comprises the allowed observation for the student network and the complementary target sequence, we let the student and the teacher solve two different related tasks defined over the same full trajectory: the student observes a short sequence and predicts a long trajectory, whereas the teacher observes a longer sequence and predicts the remaining short target trajectory. The teacher's task is less uncertain, and we use its accurate predictions to guide the student through our knowledge distillation framework, reducing long-term future uncertainty. Our experiments show that our proposed Di-Long method is effective for long-term forecasting and achieves state-of-the-art performance on the Intersection Drone Dataset (inD) and the Stanford Drone Dataset (SDD).
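The short-vs-long window split described in the abstract can be sketched with a toy example. The constant-velocity forecaster, the window lengths, and the plain MSE terms below are illustrative assumptions standing in for the paper's actual networks and losses:

```python
def constant_velocity_forecast(obs, horizon):
    """Toy forecaster: extrapolate the last observed velocity."""
    (x0, y0), (x1, y1) = obs[-2], obs[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]

def mse(pred, target):
    """Mean squared error between two equal-length 2D point sequences."""
    return sum((px - tx) ** 2 + (py - ty) ** 2
               for (px, py), (tx, ty) in zip(pred, target)) / len(pred)

def di_long_style_losses(traj, student_obs_len, teacher_obs_len):
    """Split one full trajectory into the two related tasks:
    the student sees a short prefix and predicts the long remainder,
    the teacher sees a longer prefix and predicts the short remainder.
    The distillation term compares the two on the teacher's target window."""
    total_len = len(traj)
    student_pred = constant_velocity_forecast(
        traj[:student_obs_len], total_len - student_obs_len)
    teacher_pred = constant_velocity_forecast(
        traj[:teacher_obs_len], total_len - teacher_obs_len)
    task_loss = mse(student_pred, traj[student_obs_len:])       # vs. ground truth
    overlap = student_pred[teacher_obs_len - student_obs_len:]  # teacher's window
    distill_loss = mse(overlap, teacher_pred)                   # vs. teacher
    return task_loss, distill_loss
```

On a perfectly linear trajectory both terms vanish; during training one would minimize a weighted sum of the task and distillation losses.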
Related papers
- Progressive Pretext Task Learning for Human Trajectory Prediction [44.07301075351432]
We introduce a novel Progressive Pretext Task learning (PPT) framework, which progressively enhances the model's capacity to capture short-term dynamics and long-term dependencies.
We design a Transformer-based trajectory predictor, which is able to achieve highly efficient two-step reasoning.
arXiv Detail & Related papers (2024-07-16T10:48:18Z) - Self-Supervised Contrastive Learning for Long-term Forecasting [41.11757636744812]
Long-term forecasting presents unique challenges due to the time and memory complexity of processing long sequences.
Existing methods, which rely on sliding windows to process long sequences, struggle to effectively capture long-term variations.
We introduce a novel approach that overcomes this limitation by employing contrastive learning and enhanced decomposition architecture.
arXiv Detail & Related papers (2024-02-03T04:32:34Z) - Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction [15.696593695918844]
This paper introduces a novel self-supervised video strategy for enhancing action prediction, inspired by DINO (self-distillation with no labels).
The experimental results showcase significant improvements in prediction performance across 3D-ResNet, Transformer, and LSTM architectures.
These findings highlight the potential of our approach in diverse video-based tasks such as activity recognition, motion planning, and scene understanding.
arXiv Detail & Related papers (2023-08-08T21:18:23Z) - Multiscale Video Pretraining for Long-Term Activity Forecasting [67.06864386274736]
Multiscale Video Pretraining learns robust representations for forecasting by learning to predict contextualized representations of future video clips over multiple timescales.
MVP is based on our observation that actions in videos have a multiscale nature, where atomic actions typically occur at a short timescale and more complex actions may span longer timescales.
Our comprehensive experiments across the Ego4D and Epic-Kitchens-55/100 datasets demonstrate that MVP outperforms state-of-the-art methods by significant margins.
arXiv Detail & Related papers (2023-07-24T14:55:15Z) - Improving Long-Horizon Imitation Through Instruction Prediction [93.47416552953075]
In this work, we explore the use of an often unused source of auxiliary supervision: language.
Inspired by recent advances in transformer-based models, we train agents with an instruction prediction loss that encourages learning temporally extended representations that operate at a high level of abstraction.
In further analysis we find that instruction modeling is most important for tasks that require complex reasoning, while understandably offering smaller gains in environments that require simple plans.
arXiv Detail & Related papers (2023-06-21T20:47:23Z) - MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking [56.92165669843006]
We propose MotionTrack, which learns robust short-term and long-term motions in a unified framework to associate trajectories from a short to long range.
For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target.
For extreme occlusions, we build a novel Refind Module to learn reliable long-term motions from the target's history trajectory, which can link the interrupted trajectory with its corresponding detection.
arXiv Detail & Related papers (2023-03-18T12:38:33Z) - From Goals, Waypoints & Paths To Long Term Human Trajectory Forecasting [54.273455592965355]
Uncertainty in future trajectories stems from two sources: (a) sources known to the agent but unknown to the model, such as long-term goals, and (b) sources unknown to both the agent and the model, such as the intent of other agents and irreducible randomness in decisions.
We model the epistemic uncertainty through multimodality in long-term goals and the aleatoric uncertainty through multimodality in waypoints and paths.
To exemplify this dichotomy, we also propose a novel long-term trajectory forecasting setting, with prediction horizons up to a minute, an order of magnitude longer than prior works.
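The goal-vs-path decomposition above can be illustrated with a toy sampler. The discrete goal modes, straight-line interpolation, and Gaussian waypoint noise below are illustrative assumptions, not the paper's model:

```python
import random

def sample_trajectories(last_pos, goal_modes, n_goals, n_paths, horizon, seed=0):
    """Epistemic uncertainty: sample a long-term goal from a discrete set of modes.
    Aleatoric uncertainty: for each goal, sample several noisy paths toward it."""
    rng = random.Random(seed)
    x, y = last_pos
    samples = []
    for _ in range(n_goals):
        gx, gy = rng.choice(goal_modes)          # which goal? (epistemic)
        for _ in range(n_paths):
            path = []
            for k in range(1, horizon + 1):      # how to get there? (aleatoric)
                frac = k / horizon
                path.append((x + frac * (gx - x) + rng.gauss(0, 0.1),
                             y + frac * (gy - y) + rng.gauss(0, 0.1)))
            samples.append(((gx, gy), path))
    return samples
```

Each sample pairs a goal hypothesis with a noisy path toward it, so multimodality appears at both levels of the hierarchy.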
arXiv Detail & Related papers (2020-12-02T21:01:29Z) - LTN: Long-Term Network for Long-Term Motion Prediction [0.0]
We present a two-stage framework for long-term trajectory prediction, named Long-Term Network (LTN).
We first generate a set of proposed trajectories with a Conditional Variational Autoencoder (CVAE), then classify them with binary labels and output the trajectory with the highest score.
The results show that our method outperforms multiple state-of-the-art approaches in long-term trajectory prediction in terms of accuracy.
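The two-stage propose-then-classify pipeline can be sketched as follows. The noisy constant-velocity proposal generator and the caller-supplied scoring function are illustrative stand-ins for the paper's CVAE sampler and binary classifier:

```python
import random

def propose_and_rank(obs, horizon, n_proposals, score_fn, noise=0.5, seed=0):
    """Stage 1: generate candidate trajectories (stand-in for CVAE sampling).
    Stage 2: score each candidate and return the highest-scoring one
    (stand-in for the binary-label classifier)."""
    rng = random.Random(seed)
    (x0, y0), (x1, y1) = obs[-2], obs[-1]
    vx, vy = x1 - x0, y1 - y0
    proposals = []
    for _ in range(n_proposals):
        dvx, dvy = rng.gauss(0, noise), rng.gauss(0, noise)  # perturb the velocity
        proposals.append([(x1 + (vx + dvx) * k, y1 + (vy + dvy) * k)
                          for k in range(1, horizon + 1)])
    return max(proposals, key=score_fn)
```

A simple `score_fn` could reward proposals whose endpoint lies close to a hypothesized goal; drawing more proposals can only improve (or match) the best score found.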
arXiv Detail & Related papers (2020-10-15T17:59:09Z) - Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors [124.30562402952319]
The ability to predict and plan into the future is fundamental for agents acting in the world.
Current learning approaches for visual prediction and planning fail on long-horizon tasks.
We propose a framework for visual prediction and planning that is able to overcome these limitations.
arXiv Detail & Related papers (2020-06-23T17:58:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.