ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting
- URL: http://arxiv.org/abs/2508.07089v1
- Date: Sat, 09 Aug 2025 20:18:10 GMT
- Title: ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting
- Authors: Sandro Papais, Letian Wang, Brian Cheong, Steven L. Waslander
- Abstract summary: ForeSight is a novel joint detection and forecasting framework for vision-based 3D perception in autonomous vehicles. We show that ForeSight achieves state-of-the-art performance with an EPA of 54.9%, surpassing previous methods by 9.3%.
- Score: 7.401111319849394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce ForeSight, a novel joint detection and forecasting framework for vision-based 3D perception in autonomous vehicles. Traditional approaches treat detection and forecasting as separate sequential tasks, limiting their ability to leverage temporal cues. ForeSight addresses this limitation with a multi-task streaming and bidirectional learning approach, allowing detection and forecasting to share query memory and propagate information seamlessly. The forecast-aware detection transformer enhances spatial reasoning by integrating trajectory predictions from a multiple-hypothesis forecast memory queue, while the streaming forecast transformer improves temporal consistency using past forecasts and refined detections. Unlike tracking-based methods, ForeSight eliminates the need for explicit object association, reducing error propagation with a tracking-free model that efficiently scales across multi-frame sequences. Experiments on the nuScenes dataset show that ForeSight achieves state-of-the-art performance with an EPA of 54.9%, surpassing previous methods by 9.3%, while also attaining the best mAP and minADE among multi-view detection and forecasting models.
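The abstract reports minADE as one of its forecasting metrics. As a minimal sketch of how that metric is commonly defined (the minimum, over the predicted trajectory hypotheses, of the average Euclidean displacement from the ground-truth trajectory), the following may help. The function name `min_ade`, the 2D waypoint format, and the toy data are illustrative assumptions; the paper's actual evaluation protocol (forecast horizon, number of hypotheses, matching of forecasts to ground-truth objects) is not stated on this page.

```python
import math


def min_ade(pred_modes, gt):
    """Minimum average displacement error over trajectory hypotheses.

    pred_modes: list of K hypotheses, each a list of T (x, y) waypoints.
    gt: list of T (x, y) ground-truth waypoints.
    Hypothetical helper for illustration; not taken from the paper.
    """
    ades = []
    for mode in pred_modes:
        if len(mode) != len(gt):
            raise ValueError("each hypothesis must match the ground-truth length")
        # Euclidean displacement at every timestep for this hypothesis.
        errors = [math.hypot(px - gx, py - gy)
                  for (px, py), (gx, gy) in zip(mode, gt)]
        # Average over time gives the ADE of this hypothesis.
        ades.append(sum(errors) / len(errors))
    # Keep the best (lowest-error) hypothesis.
    return min(ades)


# Toy example: two hypotheses over three timesteps (assumed units: meters).
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred_modes = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # 1 m off at every step, ADE = 1.0
    [(0.0, 0.0), (1.0, 0.0), (2.0, 1.5)],  # exact, then 1.5 m off, ADE = 0.5
]
print(min_ade(pred_modes, gt))  # 0.5
```

Taking the minimum over hypotheses rewards a multi-modal forecaster when any one of its candidate trajectories is close to the truth, which is why the metric is usually reported together with the number of hypotheses allowed.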
Related papers
- Streaming Real-Time Trajectory Prediction Using Endpoint-Aware Modeling [54.94692733670454]
Future trajectories of neighboring traffic agents have a significant influence on the path planning and decision-making of autonomous vehicles. We propose a lightweight yet highly accurate streaming-based trajectory forecasting approach. Our approach significantly reduces inference latency, making it well-suited for real-world deployment.
arXiv Detail & Related papers (2026-03-02T13:44:23Z)
- MASAR: Motion-Appearance Synergy Refinement for Joint Detection and Trajectory Forecasting [2.681087131751672]
MASAR is a novel framework for joint 3D detection and trajectory forecasting, compatible with any transformer-based 3D detector. By predicting past trajectories and refining them using guidance from appearance cues, MASAR captures long-term temporal dependencies that enhance future trajectory forecasting.
arXiv Detail & Related papers (2026-02-13T15:11:50Z)
- Forward Consistency Learning with Gated Context Aggregation for Video Anomaly Detection [17.79982215633934]
Video anomaly detection (VAD) aims to measure deviations from normal patterns for various events in real-time surveillance systems. Most existing VAD methods rely on large-scale models to pursue extreme accuracy, limiting their feasibility on resource-limited edge devices. We introduce FoGA, a lightweight VAD model that performs Forward consistency learning with Gated context aggregation.
arXiv Detail & Related papers (2026-01-26T04:35:31Z)
- Dynamic Aware: Adaptive Multi-Mode Out-of-Distribution Detection for Trajectory Prediction in Autonomous Vehicles [8.920589816043298]
Trajectory prediction is central to the safe and seamless operation of autonomous vehicles. In deployment, prediction models inevitably face distribution shifts between training data and real-world conditions. We propose a new framework that introduces adaptive mechanisms to achieve robust detection in complex driving environments.
arXiv Detail & Related papers (2025-09-16T22:37:21Z)
- RealTraj: Towards Real-World Pedestrian Trajectory Forecasting [10.332817296500533]
We propose a novel framework, RealTraj, that enhances the real-world applicability of trajectory forecasting. We present Det2TrajFormer, a model that remains invariant to tracking noise by using past detections as inputs. Unlike previous trajectory forecasting methods, our approach fine-tunes the model using only ground-truth detections, reducing the need for costly person ID annotations.
arXiv Detail & Related papers (2024-11-26T12:35:26Z)
- StreamMOTP: Streaming and Unified Framework for Joint 3D Multi-Object Tracking and Trajectory Prediction [22.29257945966914]
We propose a streaming and unified framework for joint 3D Multi-Object Tracking and trajectory Prediction (StreamMOTP).
We construct the model in a streaming manner and exploit a memory bank to preserve and leverage the long-term latent features for tracked objects more effectively.
We also improve the quality and consistency of predicted trajectories with a dual-stream predictor.
arXiv Detail & Related papers (2024-06-28T11:35:35Z)
- A Multi-Stage Goal-Driven Network for Pedestrian Trajectory Prediction [6.137256382926171]
This paper proposes a novel method for pedestrian trajectory prediction, called multi-stage goal-driven network (MGNet).
The network comprises three main components: a conditional variational autoencoder (CVAE), an attention module, and a multi-stage goal evaluator.
The effectiveness of MGNet is demonstrated through comprehensive experiments on the JAAD and PIE datasets.
arXiv Detail & Related papers (2024-06-26T03:59:21Z)
- Streaming Motion Forecasting for Autonomous Driving [71.7468645504988]
We introduce a benchmark that queries future trajectories on streaming data, which we refer to as "streaming forecasting".
Our benchmark inherently captures the disappearance and re-appearance of agents, which is a safety-critical problem yet overlooked by snapshot-based benchmarks.
We propose a plug-and-play meta-algorithm called "Predictive Streamer" that can adapt any snapshot-based forecaster into a streaming forecaster.
arXiv Detail & Related papers (2023-10-02T17:13:16Z)
- Improving Trajectory Prediction in Dynamic Multi-Agent Environment by Dropping Waypoints [9.385936248154987]
Motion prediction systems must learn spatial and temporal information from the past to forecast the future trajectories of the agent.
We propose Temporal Waypoint Dropping (TWD) that explicitly incorporates temporal dependencies during the training of a trajectory prediction model.
We evaluate our proposed approach on three datasets: NBA Sports VU, ETH-UCY, and TrajNet++.
arXiv Detail & Related papers (2023-09-29T15:48:35Z)
- Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive? [93.10694819127608]
We propose a unified evaluation pipeline for forecasting methods with real-world perception inputs.
Our in-depth study uncovers a substantial performance gap when transitioning from curated to perception-based data.
arXiv Detail & Related papers (2023-06-15T17:03:14Z)
- Uncovering the Missing Pattern: Unified Framework Towards Trajectory Imputation and Prediction [60.60223171143206]
Trajectory prediction is a crucial undertaking in understanding entity movement or human behavior from observed sequences.
Current methods often assume that the observed sequences are complete while ignoring the potential for missing values.
This paper presents a unified framework, the Graph-based Conditional Variational Recurrent Neural Network (GC-VRNN), which can perform trajectory imputation and prediction simultaneously.
arXiv Detail & Related papers (2023-03-28T14:27:27Z)
- Trajectory Forecasting from Detection with Uncertainty-Aware Motion Encoding [121.66374635092097]
Trajectories obtained from object detection and tracking are inevitably noisy.
We propose a trajectory predictor directly based on detection results without relying on explicitly formed trajectories.
arXiv Detail & Related papers (2022-02-03T09:09:56Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting [106.3504366501894]
Self-driving vehicles and robotic manipulation systems often forecast future object poses by first detecting and tracking objects.
This detect-then-forecast pipeline is expensive to scale, as pose forecasting algorithms typically require labeled sequences of object poses.
We propose to first forecast 3D sensor data and then detect/track objects on the predicted point cloud sequences to obtain future poses.
This makes it less expensive to scale pose forecasting, as the sensor data forecasting task requires no labels.
arXiv Detail & Related papers (2020-03-18T17:54:28Z)