Layout Sequence Prediction From Noisy Mobile Modality
- URL: http://arxiv.org/abs/2310.06138v1
- Date: Mon, 9 Oct 2023 20:32:49 GMT
- Title: Layout Sequence Prediction From Noisy Mobile Modality
- Authors: Haichao Zhang, Yi Xu, Hongsheng Lu, Takayuki Shimizu, Yun Fu
- Abstract summary: Trajectory prediction plays a vital role in understanding pedestrian movement for applications such as autonomous driving and robotics.
Current trajectory prediction models depend on long, complete, and accurately observed sequences from visual modalities.
We propose LTrajDiff, a novel approach that treats objects obstructed or out of sight as equally important as those with fully visible trajectories.
- Score: 53.49649231056857
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Trajectory prediction plays a vital role in understanding pedestrian movement
for applications such as autonomous driving and robotics. Current trajectory
prediction models depend on long, complete, and accurately observed sequences
from visual modalities. Nevertheless, real-world situations often involve
obstructed cameras, missed objects, or objects out of sight due to
environmental factors, leading to incomplete or noisy trajectories. To overcome
these limitations, we propose LTrajDiff, a novel approach that treats objects
obstructed or out of sight as equally important as those with fully visible
trajectories. LTrajDiff utilizes sensor data from mobile phones to surmount
out-of-sight constraints, albeit introducing new challenges such as modality
fusion, noisy data, and the absence of spatial layout and object size
information. We employ a denoising diffusion model to predict precise layout
sequences from noisy mobile data using a coarse-to-fine diffusion strategy,
incorporating the RMS, Siamese Masked Encoding Module, and MFM. Our model
predicts layout sequences by implicitly inferring object size and projection
status from a single reference timestamp or significantly obstructed sequences.
Achieving SOTA results in randomly obstructed experiments and extremely short
input experiments, our model illustrates the effectiveness of leveraging noisy
mobile data. In summary, our approach offers a promising solution to the
challenges faced by layout sequence and trajectory prediction models in
real-world settings, paving the way for utilizing sensor data from mobile
phones to accurately predict pedestrian bounding box trajectories. To the best
of our knowledge, this is the first work that addresses severely obstructed and
extremely short layout sequences by combining vision with noisy mobile
modality, making it the pioneering work in the field of layout sequence
trajectory prediction.
Related papers
- OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - MSTF: Multiscale Transformer for Incomplete Trajectory Prediction [30.152217860860464]
We propose an end-to-end framework, termed Multiscale Transformer (MSTF), meticulously crafted for incomplete trajectory prediction.
MSTF integrates a Multiscale Attention Head (MAH) and an Information Increment-based Pattern Adaptive (IIPA) module.
We evaluate our proposed MSTF model using two large-scale real-world datasets.
arXiv Detail & Related papers (2024-07-08T07:10:17Z) - OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z) - Controllable Diverse Sampling for Diffusion Based Motion Behavior
Forecasting [11.106812447960186]
We introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT)
CDT integrates information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories.
To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left.
arXiv Detail & Related papers (2024-02-06T13:16:54Z) - Implicit Occupancy Flow Fields for Perception and Prediction in
Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z) - Uncovering the Missing Pattern: Unified Framework Towards Trajectory
Imputation and Prediction [60.60223171143206]
Trajectory prediction is a crucial undertaking in understanding entity movement or human behavior from observed sequences.
Current methods often assume that the observed sequences are complete while ignoring the potential for missing values.
This paper presents a unified framework, the Graph-based Conditional Variational Recurrent Neural Network (GC-VRNN), which can perform trajectory imputation and prediction simultaneously.
arXiv Detail & Related papers (2023-03-28T14:27:27Z) - PDFormer: Propagation Delay-Aware Dynamic Long-Range Transformer for
Traffic Flow Prediction [78.05103666987655]
spatial-temporal Graph Neural Network (GNN) models have emerged as one of the most promising methods to solve this problem.
We propose a novel propagation delay-aware dynamic long-range transFormer, namely PDFormer, for accurate traffic flow prediction.
Our method can not only achieve state-of-the-art performance but also exhibit competitive computational efficiency.
arXiv Detail & Related papers (2023-01-19T08:42:40Z) - MissFormer: (In-)attention-based handling of missing observations for
trajectory filtering and prediction [11.241614693184323]
This paper introduces a transformer-based approach for handling missing observations in variable input length trajectory data.
By providing missing tokens, binary-encoded missing events, the model learns to in-attend to missing data and infers a complete trajectory conditioned on the remaining inputs.
arXiv Detail & Related papers (2021-06-30T12:12:52Z) - Ellipse Loss for Scene-Compliant Motion Prediction [12.446392441065065]
We propose a novel ellipse loss that allows the models to better reason about scene compliance and predict more realistic trajectories.
Ellipse loss penalizes off-road predictions directly in a supervised manner, by projecting the output trajectories into the top-down map frame.
It takes into account actor dimensions and orientation, providing more direct training signals to the model.
arXiv Detail & Related papers (2020-11-05T23:33:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.