Pishgu: Universal Path Prediction Architecture through Graph Isomorphism
and Attentive Convolution
- URL: http://arxiv.org/abs/2210.08057v1
- Date: Fri, 14 Oct 2022 18:48:48 GMT
- Title: Pishgu: Universal Path Prediction Architecture through Graph Isomorphism
and Attentive Convolution
- Authors: Ghazal Alinezhad Noghre, Vinit Katariya, Armin Danesh Pazho,
Christopher Neff, Hamed Tabkhi
- Abstract summary: This article proposes Pishgu, a universal graph isomorphism approach for attentive path prediction.
Pishgu captures the inter-dependencies within the subjects in each frame by taking advantage of Graph Isomorphism Networks.
We evaluate the adaptability of our approach to multiple publicly available vehicle (bird's-eye view) and pedestrian (bird's-eye and high-angle view) path prediction datasets.
- Score: 2.6774008509840996
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Path prediction is an essential task for several real-world real-time
applications, from autonomous driving and video surveillance to environmental
monitoring. Most existing approaches are computation-intensive and only target
a narrow domain (e.g., a specific point of view for a particular subject).
However, many real-time applications demand a universal path predictor that can
work across different subjects (vehicles, pedestrians), perspectives
(bird's-eye, high-angle), and scenes (sidewalk, highway). This article proposes
Pishgu, a universal graph isomorphism approach for attentive path prediction
that accounts for environmental challenges. Pishgu captures the
inter-dependencies within the subjects in each frame by taking advantage of
Graph Isomorphism Networks. In addition, an attention module is adopted to
represent the intrinsic relations of the subjects of interest with their
surroundings. We evaluate the adaptability of our approach to multiple publicly
available vehicle (bird's-eye view) and pedestrian (bird's-eye and high-angle
view) path prediction datasets. Pishgu's universal solution outperforms
existing domain-focused methods, improving on the previous state of the art
by 42% and 61% for vehicle bird's-eye view and by 23% and 22% for pedestrian
high-angle view in terms of ADE and FDE, respectively. Moreover, we analyze the
domain-specific details for various datasets to understand their effect on path
prediction and model interpretation. Although our model is a single solution
for path prediction problems and defines a new standard in multiple domains, it
still has a comparable complexity to state-of-the-art models, which makes it
suitable for real-world application. We also report the latency and throughput
for all three domains on multiple embedded processors.
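The abstract reports its gains in terms of ADE and FDE without defining them. As commonly used in trajectory prediction, ADE (average displacement error) is the mean Euclidean distance between predicted and ground-truth positions over all predicted time steps, and FDE (final displacement error) is that distance at the last predicted step. The following is a minimal sketch for a single 2D trajectory; the function name and the sample coordinates are illustrative assumptions and are not taken from the Pishgu paper or its code.

```python
import math


def displacement_errors(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given as equal-length lists of (x, y)."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each time step.
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    ade = sum(distances) / len(distances)  # mean over all predicted steps
    fde = distances[-1]                    # error at the final predicted step
    return ade, fde


# Hypothetical three-step trajectory, for illustration only.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(predicted, ground_truth)
print(f"ADE={ade:.2f} FDE={fde:.2f}")
```

For a whole dataset, these per-trajectory values are typically averaged over all subjects; the exact averaging protocol used by the paper is not stated in the abstract.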
Related papers
- Probing Fine-Grained Action Understanding and Cross-View Generalization of Foundation Models [13.972809192907931]
Foundation models (FMs) are large neural networks trained on broad datasets.
Human activity recognition in video has advanced with FMs, driven by competition among different architectures.
This paper empirically evaluates how perspective changes affect different FMs in fine-grained human activity recognition.
arXiv Detail & Related papers (2024-07-22T12:59:57Z)
- XVTP3D: Cross-view Trajectory Prediction Using Shared 3D Queries for Autonomous Driving [7.616422495497465]
Trajectory prediction with uncertainty is a critical and challenging task for autonomous driving.
We present a cross-view trajectory prediction method using shared 3D queries (XVTP3D).
The results of experiments on two publicly available datasets show that XVTP3D achieved state-of-the-art performance with consistent cross-view predictions.
arXiv Detail & Related papers (2023-08-17T03:35:13Z)
- MultiPath++: Efficient Information Fusion and Trajectory Aggregation for Behavior Prediction [42.563865078323204]
We present MultiPath++, a future prediction model that achieves state-of-the-art performance on popular benchmarks.
We show that our proposed model achieves state-of-the-art performance on the Argoverse Motion Forecasting Competition and Open Motion Prediction Challenge.
arXiv Detail & Related papers (2021-11-29T21:36:53Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work, including state-of-the-art methods designed specifically for either trajectory or pose forecasting.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z)
- Multi-Modal Hybrid Architecture for Pedestrian Action Prediction [14.032334569498968]
We propose a novel multi-modal prediction algorithm that incorporates different sources of information captured from the environment to predict future crossing actions of pedestrians.
Using the existing 2D pedestrian behavior benchmarks and a newly annotated 3D driving dataset, we show that our proposed model achieves state-of-the-art performance in pedestrian crossing prediction.
arXiv Detail & Related papers (2020-11-16T15:17:58Z)
- Multi-path Neural Networks for On-device Multi-domain Visual Classification [55.281139434736254]
This paper proposes a novel approach to automatically learn a multi-path network for multi-domain visual classification on mobile devices.
The proposed multi-path network is learned from neural architecture search by applying one reinforcement learning controller for each domain to select the best path in the super-network created from a MobileNetV3-like search space.
The determined multi-path model selectively shares parameters across domains in shared nodes while keeping domain-specific parameters within non-shared nodes in individual domain paths.
arXiv Detail & Related papers (2020-10-10T05:13:49Z)
- Cross-Domain Facial Expression Recognition: A Unified Evaluation Benchmark and Adversarial Graph Learning [85.6386289476598]
We develop a novel adversarial graph representation adaptation (AGRA) framework for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair evaluations on several popular benchmarks and show that the proposed AGRA framework outperforms previous state-of-the-art methods.
arXiv Detail & Related papers (2020-08-03T15:00:31Z)
- Adversarial Bipartite Graph Learning for Video Domain Adaptation [50.68420708387015]
Domain adaptation techniques, which focus on adapting models between distributionally different domains, are rarely explored in the video recognition area.
Recent work on visual domain adaptation that leverages adversarial learning to unify the source and target video representations is not highly effective on videos.
This paper proposes an Adversarial Bipartite Graph (ABG) learning framework which directly models the source-target interactions.
arXiv Detail & Related papers (2020-07-31T03:48:41Z)
- STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction [24.855059537779294]
We present a novel end-to-end two-stage network: Spatio-Temporal-Interactive Network (STINet).
In addition to 3D geometry of pedestrians, we model temporal information for each of the pedestrians.
Our method predicts both current and past locations in the first stage, so that each pedestrian can be linked across frames.
arXiv Detail & Related papers (2020-05-08T18:43:01Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework that uncovers relationships among the objects in a scene in order to reason about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not crossing the street, is a crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.