Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent
- URL: http://arxiv.org/abs/2405.17680v1
- Date: Mon, 27 May 2024 22:15:23 GMT
- Title: Deciphering Movement: Unified Trajectory Generation Model for Multi-Agent
- Authors: Yi Xu, Yun Fu
- Abstract summary: We propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs.
Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction.
We benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation.
- Score: 53.637837706712794
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Understanding multi-agent behavior is critical across various fields. The conventional approach involves analyzing agent movements through three primary tasks: trajectory prediction, imputation, and spatial-temporal recovery. Considering the unique input formulation and constraints of these tasks, most existing methods are tailored to address only one specific task. However, in real-world applications, these scenarios frequently occur simultaneously. Consequently, methods designed for one task often fail to adapt to others, resulting in performance drops. To overcome this limitation, we propose a Unified Trajectory Generation model, UniTraj, that processes arbitrary trajectories as masked inputs, adaptable to diverse scenarios. Specifically, we introduce a Ghost Spatial Masking (GSM) module embedded within a Transformer encoder for spatial feature extraction. We further extend recent successful State Space Models (SSMs), particularly the Mamba model, into a Bidirectional Temporal Mamba to effectively capture temporal dependencies. Additionally, we incorporate a Bidirectional Temporal Scaled (BTS) module to comprehensively scan trajectories while maintaining the temporal missing relationships within the sequence. We curate and benchmark three practical sports game datasets, Basketball-U, Football-U, and Soccer-U, for evaluation. Extensive experiments demonstrate the superior performance of our model. To the best of our knowledge, this is the first work that addresses this unified problem through a versatile generative framework, thereby enhancing our understanding of multi-agent movement. Our datasets, code, and model weights are available at https://github.com/colorfulfuture/UniTraj-pytorch.
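The unification described in the abstract hinges on one observation: prediction, imputation, and spatial-temporal recovery differ only in which trajectory entries are visible, so all three can be posed as reconstruction from a masked input. The sketch below is a hypothetical illustration of that formulation, not the authors' code; every name in it (make_task_mask, the tensor shapes, the mask conventions) is an assumption.

```python
# Minimal sketch of the unified masked-input formulation described in the
# abstract. NOT the authors' implementation; all names and conventions here
# are illustrative assumptions.
import torch

NUM_AGENTS, SEQ_LEN = 10, 50   # e.g. 10 basketball players over 50 frames

def make_task_mask(task: str, obs_len: int = 30, keep_prob: float = 0.5) -> torch.Tensor:
    """Visibility mask [NUM_AGENTS, SEQ_LEN]: 1 = observed, 0 = missing."""
    mask = torch.zeros(NUM_AGENTS, SEQ_LEN)
    if task == "prediction":        # past visible, future missing
        mask[:, :obs_len] = 1.0
    elif task == "imputation":      # random gaps scattered through the sequence
        mask = (torch.rand(NUM_AGENTS, SEQ_LEN) < keep_prob).float()
    elif task == "recovery":        # some agents missing entirely (spatial-temporal)
        mask[: NUM_AGENTS // 2] = 1.0
    else:
        raise ValueError(f"unknown task: {task}")
    return mask

traj = torch.randn(NUM_AGENTS, SEQ_LEN, 2)       # (x, y) positions per agent
mask = make_task_mask("imputation")              # any of the three tasks
model_input = torch.cat(
    [traj * mask.unsqueeze(-1),                  # zero out missing entries
     mask.unsqueeze(-1)],                        # append the mask as a channel
    dim=-1,
)                                                # [NUM_AGENTS, SEQ_LEN, 3]
```

Framed this way, the task identity is carried entirely by the mask, which is why a single generative model can adapt to whichever missing pattern appears at test time. A complementary sketch of the bidirectional scanning idea appears after the related-papers list below.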
Related papers
- TIMBA: Time series Imputation with Bi-directional Mamba Blocks and Diffusion models [0.0]
We propose replacing time-oriented Transformers with State-Space Models (SSMs).
We develop a model that integrates SSMs, Graph Neural Networks, and node-oriented Transformers to achieve enhanced representations.
arXiv Detail & Related papers (2024-10-08T11:10:06Z)
- DeTra: A Unified Model for Object Detection and Trajectory Forecasting [68.85128937305697]
Our approach formulates the union of the two tasks as a trajectory refinement problem.
To tackle this unified task, we design a refinement transformer that infers the presence, pose, and multi-modal future behaviors of objects.
In our experiments, we observe that our model outperforms the state-of-the-art on the Argoverse 2 Sensor and Waymo Open datasets.
arXiv Detail & Related papers (2024-06-06T18:12:04Z)
- Modeling Continuous Motion for 3D Point Cloud Object Tracking [54.48716096286417]
This paper presents a novel approach that views each tracklet as a continuous stream.
At each timestamp, only the current frame is fed into the network to interact with multi-frame historical features stored in a memory bank.
To enhance the utilization of multi-frame features for robust tracking, a contrastive sequence enhancement strategy is proposed.
arXiv Detail & Related papers (2023-03-14T02:58:27Z)
- Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation, Tracking, and Forecasting on a Video Snippet [24.852728097115744]
Multi-person pose understanding from RGB involves three complex tasks: pose estimation, tracking and motion forecasting.
Most existing works either focus on a single task or employ multi-stage approaches to solving multiple tasks separately.
We propose Snipper, a unified framework to perform multi-person 3D pose estimation, tracking, and motion forecasting simultaneously in a single stage.
arXiv Detail & Related papers (2022-07-09T18:42:14Z)
- Learning Behavior Representations Through Multi-Timescale Bootstrapping [8.543808476554695]
We introduce Bootstrap Across Multiple Scales (BAMS), a multi-scale representation learning model for behavior.
We first apply our method on a dataset of quadrupeds navigating in different terrain types, and show that our model captures the temporal complexity of behavior.
arXiv Detail & Related papers (2022-06-14T17:57:55Z)
- Joint Spatial-Temporal and Appearance Modeling with Transformer for Multiple Object Tracking [59.79252390626194]
We propose a novel solution named TransSTAM, which leverages Transformer to model both the appearance features of each object and the spatial-temporal relationships among objects.
The proposed method is evaluated on multiple public benchmarks including MOT16, MOT17, and MOT20, and it achieves a clear performance improvement in both IDF1 and HOTA.
arXiv Detail & Related papers (2022-05-31T01:19:18Z)
- P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation [78.83305967085413]
This paper introduces a novel Pre-trained Spatial Temporal Many-to-One (P-STMO) model for 2D-to-3D human pose estimation task.
Our method outperforms state-of-the-art methods with fewer parameters and less computational overhead.
arXiv Detail & Related papers (2022-03-15T04:00:59Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
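Both UniTraj's Bidirectional Temporal Mamba above and TIMBA's bi-directional Mamba blocks rest on the same idea: scan the sequence forward and backward so every timestep is conditioned on both past and future context, which imputation requires. Below is a minimal, library-free sketch of such a bidirectional scan; a real Mamba block uses input-dependent (selective) state-space parameters, whereas this toy uses a fixed linear recurrence purely to show the scanning pattern.

```python
# Toy bidirectional temporal scan, illustrating the idea shared by UniTraj's
# Bidirectional Temporal Mamba and TIMBA's bi-directional blocks. A fixed
# linear recurrence stands in for the selective SSM used by real Mamba.
import torch

def causal_scan(x: torch.Tensor, decay: float = 0.9) -> torch.Tensor:
    """Linear recurrence h_t = decay * h_{t-1} + x_t along dim 1 (time)."""
    h = torch.zeros_like(x[:, 0])
    states = []
    for t in range(x.size(1)):
        h = decay * h + x[:, t]
        states.append(h)
    return torch.stack(states, dim=1)

def bidirectional_scan(x: torch.Tensor) -> torch.Tensor:
    fwd = causal_scan(x)                    # each step summarizes the past
    bwd = causal_scan(x.flip(1)).flip(1)    # each step summarizes the future
    return torch.cat([fwd, bwd], dim=-1)    # [B, T, 2 * D]

feats = torch.randn(4, 50, 32)              # batch of 4 sequences, 50 steps
context = bidirectional_scan(feats)         # [4, 50, 64]
```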