Pedestrian Behavior Prediction via Multitask Learning and Categorical Interaction Modeling
- URL: http://arxiv.org/abs/2012.03298v1
- Date: Sun, 6 Dec 2020 15:57:11 GMT
- Title: Pedestrian Behavior Prediction via Multitask Learning and Categorical Interaction Modeling
- Authors: Amir Rasouli and Mohsen Rohani and Jun Luo
- Abstract summary: We propose a multitask learning framework that simultaneously predicts trajectories and actions of pedestrians by relying on multimodal data.
We show that our model achieves state-of-the-art performance and improves trajectory and action prediction by up to 22% and 6% respectively.
- Score: 13.936894582450734
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pedestrian behavior prediction is one of the major challenges for intelligent
driving systems. Pedestrians often exhibit complex behaviors influenced by
various contextual elements. To address this problem, we propose a multitask
learning framework that simultaneously predicts trajectories and actions of
pedestrians by relying on multimodal data. Our method benefits from 1) a hybrid
mechanism to encode different input modalities independently allowing them to
develop their own representations, and jointly to produce a representation for
all modalities using shared parameters; 2) a novel interaction modeling
technique that relies on categorical semantic parsing of the scenes to capture
interactions between target pedestrians and their surroundings; and 3) a dual
prediction mechanism that uses both independent and shared decoding of
multimodal representations. Using public pedestrian behavior benchmark datasets
for driving, PIE and JAAD, we highlight the benefits of multitask learning for
behavior prediction and show that our model achieves state-of-the-art
performance and improves trajectory and action prediction by up to 22% and 6%
respectively. We further investigate the contributions of the proposed
processing and interaction modeling techniques via extensive ablation studies.
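The abstract's three ingredients fit a simple pattern: per-modality encoders, a shared encoder, and heads that decode the combined features for both tasks. Below is a minimal sketch of that pattern, not the authors' code; module names, modalities (bounding boxes and pose), and all dimensions are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): independent per-modality
# encoders plus a shared encoder, feeding a dual decoder that predicts
# future trajectories and a crossing action jointly.
import torch
import torch.nn as nn

class DualPredictionModel(nn.Module):
    def __init__(self, traj_dim=4, pose_dim=36, hidden=128, horizon=45):
        super().__init__()
        # 1) Independent encoders let each modality develop its own representation.
        self.traj_enc = nn.GRU(traj_dim, hidden, batch_first=True)
        self.pose_enc = nn.GRU(pose_dim, hidden, batch_first=True)
        # Joint encoder produces a representation for all modalities with shared parameters.
        self.shared_enc = nn.GRU(traj_dim + pose_dim, hidden, batch_first=True)
        # 3) Dual prediction: independent and shared features feed both heads.
        self.traj_dec = nn.Linear(3 * hidden, horizon * traj_dim)
        self.action_head = nn.Linear(3 * hidden, 1)  # crossing / not crossing
        self.horizon, self.traj_dim = horizon, traj_dim

    def forward(self, traj, pose):
        _, h_t = self.traj_enc(traj)   # independent trajectory features
        _, h_p = self.pose_enc(pose)   # independent pose features
        _, h_s = self.shared_enc(torch.cat([traj, pose], dim=-1))  # joint features
        feat = torch.cat([h_t[-1], h_p[-1], h_s[-1]], dim=-1)
        future = self.traj_dec(feat).view(-1, self.horizon, self.traj_dim)
        action = torch.sigmoid(self.action_head(feat))
        return future, action

model = DualPredictionModel()
traj = torch.randn(8, 15, 4)   # 15 observed steps of bounding boxes (x, y, w, h)
pose = torch.randn(8, 15, 36)  # 18 2D pose keypoints per frame
future, action = model(traj, pose)
print(future.shape, action.shape)  # torch.Size([8, 45, 4]) torch.Size([8, 1])
```

In a multitask setup like this, the trajectory regression loss and the action classification loss would be summed (possibly with weights) so both tasks shape the shared parameters.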
Related papers
- DeepInteraction++: Multi-Modality Interaction for Autonomous Driving [80.8837864849534]
We introduce a novel modality interaction strategy that allows individual per-modality representations to be learned and maintained throughout.
DeepInteraction++ is a multi-modal interaction framework characterized by a multi-modal representational interaction encoder and a multi-modal predictive interaction decoder.
Experiments demonstrate the superior performance of the proposed framework on both 3D object detection and end-to-end autonomous driving tasks.
arXiv Detail & Related papers (2024-08-09T14:04:21Z)
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- PedFormer: Pedestrian Behavior Prediction via Cross-Modal Attention Modulation and Gated Multitask Learning [10.812772606528172]
We propose a novel framework that relies on different data modalities to predict future trajectories and crossing actions of pedestrians from an ego-centric perspective.
We show that our model improves state-of-the-art in trajectory and action prediction by up to 22% and 13% respectively on various metrics.
arXiv Detail & Related papers (2022-10-14T15:12:00Z)
- ProspectNet: Weighted Conditional Attention for Future Interaction Modeling in Behavior Prediction [5.520507323174275]
We formulate the end-to-end joint prediction problem as a sequential learning process of marginal learning and joint learning of vehicle behaviors.
We propose ProspectNet, a joint learning block that adopts the weighted attention score to model the mutual influence between interactive agent pairs.
We show that ProspectNet outperforms the Cartesian product of two marginal predictions, and achieves comparable performance on the Interactive Motion Prediction benchmarks.
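A plausible reading of "weighted attention between interactive agent pairs" is a cross-attention block with a learned weight on the mutual influence. The sketch below is an assumption-laden stand-in for ProspectNet's block, not its implementation; all names and shapes are illustrative.

```python
# Sketch: cross-attention between two agents' candidate-trajectory embeddings,
# with a learned gate weighting how much agent B influences agent A.
import torch
import torch.nn as nn

class PairwiseWeightedAttention(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.q, self.k, self.v = (nn.Linear(dim, dim) for _ in range(3))
        self.gate = nn.Parameter(torch.zeros(1))  # learned influence weight
        self.scale = dim ** -0.5

    def forward(self, agent_a, agent_b):
        # Attention scores express how strongly B's predicted modes affect A's.
        attn = torch.softmax(
            self.q(agent_a) @ self.k(agent_b).transpose(-1, -2) * self.scale, dim=-1)
        return agent_a + torch.sigmoid(self.gate) * (attn @ self.v(agent_b))

# Each agent carries, say, 6 candidate trajectory embeddings of width 64.
block = PairwiseWeightedAttention()
a, b = torch.randn(6, 64), torch.randn(6, 64)
print(block(a, b).shape)  # torch.Size([6, 64])
```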
arXiv Detail & Related papers (2022-08-29T19:29:49Z)
- Coarse-to-Fine Knowledge-Enhanced Multi-Interest Learning Framework for Multi-Behavior Recommendation [52.89816309759537]
Multiple types of behaviors (e.g., clicking, adding to cart, purchasing) widely exist in most real-world recommendation scenarios.
State-of-the-art multi-behavior models learn behavior dependencies indiscriminately, taking all historical interactions as input.
We propose a novel Coarse-to-fine Knowledge-enhanced Multi-interest Learning framework to learn shared and behavior-specific interests for different behaviors.
arXiv Detail & Related papers (2022-08-03T05:28:14Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
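The marginal/copula split is easy to show numerically. Here is a toy Gaussian-copula example of the general technique (not the paper's model): each agent's marginal is fit separately, and a single correlation parameter captures only the dependence between agents.

```python
# Toy sketch: separate marginals per agent, Gaussian copula for coordination.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic demonstrations: two agents' 1D actions with correlated noise.
shared = rng.normal(size=1000)
a1 = 0.8 * shared + 0.6 * rng.normal(size=1000)         # agent 1 actions
a2 = -0.5 * shared + 0.9 * rng.normal(size=1000) + 2.0  # agent 2 actions

# Marginals: modeled per agent (here, Gaussians fit independently).
m1 = stats.norm(*stats.norm.fit(a1))
m2 = stats.norm(*stats.norm.fit(a2))

# Copula: map actions to uniforms via each marginal's CDF, then to normal
# scores; the score correlation is the Gaussian copula parameter.
z1, z2 = stats.norm.ppf(m1.cdf(a1)), stats.norm.ppf(m2.cdf(a2))
rho = np.corrcoef(z1, z2)[0, 1]
print(f"copula correlation: {rho:.2f}")  # dependence structure alone

# Coordinated sampling: correlated normals pushed through inverse marginal CDFs.
cov = np.array([[1.0, rho], [rho, 1.0]])
u = stats.norm.cdf(rng.multivariate_normal([0, 0], cov, size=5))
print(np.column_stack([m1.ppf(u[:, 0]), m2.ppf(u[:, 1])]))
```

Because the marginals and the copula are learned separately, either piece can be swapped out, e.g., reusing the same dependence structure with new per-agent behavior models.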
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos [76.21297023629589]
We propose a novel method for learning pairwise modality interactions in order to better exploit complementary information for each pair of modalities in videos.
Our method turns out to achieve state-of-the-art performances on four standard benchmark datasets.
arXiv Detail & Related papers (2020-07-28T12:40:59Z)
- SMART: Simultaneous Multi-Agent Recurrent Trajectory Prediction [72.37440317774556]
We propose advances that address two key challenges in future trajectory prediction: multimodality in both training data and predictions, and constant-time inference regardless of the number of agents.
arXiv Detail & Related papers (2020-07-26T08:17:10Z)
- EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning [41.42230144157259]
We propose a generic trajectory forecasting framework with explicit relational structure recognition and prediction via latent interaction graphs.
Considering the uncertainty of future behaviors, the model is designed to provide multi-modal prediction hypotheses.
We introduce a double-stage training pipeline which not only improves training efficiency and accelerates convergence, but also enhances model performance.
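The core of latent interaction-graph recognition is predicting an edge type for every agent pair from their embeddings. The fragment below is a heavily simplified sketch of that step under assumed shapes, not the EvolveGraph implementation; in the paper's dynamic setting these edge beliefs would be re-estimated as agent states evolve.

```python
# Sketch: infer soft edge types between every pair of agents from embeddings.
import torch
import torch.nn as nn

n_agents, dim, n_edge_types = 4, 32, 3
edge_mlp = nn.Sequential(nn.Linear(2 * dim, 64), nn.ReLU(),
                         nn.Linear(64, n_edge_types))

states = torch.randn(n_agents, dim)  # per-agent embeddings at one time step
src, dst = torch.meshgrid(torch.arange(n_agents), torch.arange(n_agents),
                          indexing="ij")
pairs = torch.cat([states[src.reshape(-1)], states[dst.reshape(-1)]], dim=-1)
edge_probs = torch.softmax(edge_mlp(pairs), dim=-1)
print(edge_probs.shape)  # (16, 3): edge-type beliefs for all directed pairs
```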
arXiv Detail & Related papers (2020-03-31T02:49:23Z)
- Collaborative Motion Prediction via Neural Motion Message Passing [37.72454920355321]
We propose neural motion message passing (NMMP) to explicitly model the interaction and learn representations for directed interactions between actors.
Based on the proposed NMMP, we design the motion prediction systems for two settings: the pedestrian setting and the joint pedestrian and vehicle setting.
Both systems outperform the previous state-of-the-art methods on several existing benchmarks.
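One round of message passing over directed actor interactions can be hand-rolled in a few lines. This is a sketch in the spirit of NMMP, not its implementation; the edge list, MLPs, and sizes are assumptions.

```python
# Sketch: directed messages between actors, aggregated per destination.
import torch
import torch.nn as nn

class MotionMessagePassing(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.edge_fn = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.node_fn = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())

    def forward(self, nodes, edges):
        # edges: (E, 2) tensor of (src, dst) actor indices.
        src, dst = edges[:, 0], edges[:, 1]
        # Directed message from each src actor to its dst actor.
        msg = self.edge_fn(torch.cat([nodes[src], nodes[dst]], dim=-1))
        # Sum incoming messages per destination actor.
        agg = torch.zeros_like(nodes).index_add_(0, dst, msg)
        return self.node_fn(torch.cat([nodes, agg], dim=-1))

layer = MotionMessagePassing()
nodes = torch.randn(4, 64)                      # 4 actors (pedestrians/vehicles)
edges = torch.tensor([[0, 1], [1, 0], [2, 3]])  # directed interaction pairs
print(layer(nodes, edges).shape)                # torch.Size([4, 64])
```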
arXiv Detail & Related papers (2020-03-14T10:12:54Z)