MulCPred: Learning Multi-modal Concepts for Explainable Pedestrian Action Prediction
- URL: http://arxiv.org/abs/2409.09446v1
- Date: Sat, 14 Sep 2024 14:15:28 GMT
- Title: MulCPred: Learning Multi-modal Concepts for Explainable Pedestrian Action Prediction
- Authors: Yan Feng, Alexander Carballo, Keisuke Fujii, Robin Karlsson, Ming Ding, Kazuya Takeda
- Abstract summary: A framework called MulCPred is proposed that explains its predictions based on multi-modal concepts represented by training samples.
MulCPred is evaluated on multiple datasets and tasks.
- Score: 57.483718822429346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pedestrian action prediction is of great significance for many applications such as autonomous driving. However, state-of-the-art methods lack the explainability needed to make trustworthy predictions. In this paper, a novel framework called MulCPred is proposed that explains its predictions based on multi-modal concepts represented by training samples. Previous concept-based methods have limitations including: 1) they cannot directly apply to multi-modal cases; 2) they lack the locality to attend to details in the inputs; 3) they suffer from mode collapse. These limitations are tackled through the following corresponding approaches: 1) a linear aggregator that integrates the activation results of the concepts into predictions, which associates concepts of different modalities and provides ante-hoc explanations of the relevance between the concepts and the predictions; 2) a channel-wise recalibration module that attends to local spatiotemporal regions, which endows the concepts with locality; 3) a feature regularization loss that encourages the concepts to learn diverse patterns. MulCPred is evaluated on multiple datasets and tasks. Both qualitative and quantitative results demonstrate that MulCPred improves the explainability of pedestrian action prediction without obvious performance degradation. Furthermore, removing unrecognizable concepts from MulCPred improves its cross-dataset prediction performance, indicating the potential for further improving its generalizability.
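For intuition, a minimal PyTorch-style sketch of the three mechanisms described above is given below. All module names, tensor shapes, and hyperparameters are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelRecalibration(nn.Module):
    """Channel-wise recalibration (squeeze-and-excitation style, assumed form).
    Re-weighting channels lets a concept focus on the local spatiotemporal
    regions those channels respond to."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):  # x: (B, C, T, H, W) spatiotemporal features
        w = self.fc(x.mean(dim=(2, 3, 4)))    # squeeze over time and space
        return x * w[:, :, None, None, None]  # recalibrate channels

class MultiModalConceptHead(nn.Module):
    """Per-modality concept banks plus a linear aggregator whose weights
    expose, ante hoc, how each concept contributes to each prediction."""
    def __init__(self, feat_dims: dict, n_concepts: int, n_classes: int):
        super().__init__()
        self.proj = nn.ModuleDict(
            {m: nn.Linear(d, n_concepts) for m, d in feat_dims.items()})
        self.aggregator = nn.Linear(n_concepts * len(feat_dims), n_classes)

    def forward(self, feats: dict):
        acts = torch.cat([self.proj[m](f) for m, f in feats.items()], dim=-1)
        return self.aggregator(acts), acts  # logits + concept activations

def diversity_loss(concept_weights: torch.Tensor) -> torch.Tensor:
    """Feature-regularization term discouraging mode collapse: penalize
    pairwise cosine similarity between concept directions (assumed form)."""
    w = F.normalize(concept_weights, dim=-1)        # (n_concepts, feat_dim)
    off_diag = w @ w.t() - torch.eye(len(w), device=w.device)
    return off_diag.abs().mean()
```

In this reading, ChannelRecalibration would be applied to backbone features before each modality's projection, and diversity_loss to the projection weights (e.g. head.proj['pose'].weight, a hypothetical modality name) alongside the task loss.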
Related papers
- Sparse Prototype Network for Explainable Pedestrian Behavior Prediction [60.80524827122901]
We present Sparse Prototype Network (SPN), an explainable method designed to simultaneously predict a pedestrian's future action, trajectory, and pose.
Regularized by mono-semanticity and clustering constraints, the prototypes learn consistent and human-understandable features.
arXiv Detail & Related papers (2024-10-16T03:33:40Z)
- Interpretable Prognostics with Concept Bottleneck Models [5.939858158928473]
Concept Bottleneck Models (CBMs) are inherently interpretable neural network architectures based on concept explanations.
CBMs enable domain experts to intervene on the concept activations at test-time.
Our case studies demonstrate that the performance of CBMs can be on par with or superior to that of black-box models.
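As a reference point, a generic concept bottleneck with test-time intervention can be sketched as follows; this is a textbook-style CBM, not the cited paper's architecture, and all dimensions are assumptions.

```python
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    def __init__(self, in_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        self.concept_head = nn.Linear(in_dim, n_concepts)   # input -> concepts
        self.label_head = nn.Linear(n_concepts, n_classes)  # concepts -> label

    def forward(self, x, intervene=None):
        c = torch.sigmoid(self.concept_head(x))
        if intervene:  # a domain expert overrides selected concept activations
            c = c.clone()
            for idx, value in intervene.items():
                c[:, idx] = value
        return self.label_head(c), c

# Usage: an expert pins (hypothetical) concept 3 to "present" at test time.
model = ConceptBottleneck(in_dim=64, n_concepts=8, n_classes=2)
logits, concepts = model(torch.randn(4, 64), intervene={3: 1.0})
```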
arXiv Detail & Related papers (2024-05-27T18:15:40Z)
- Knowledge-Aware Neuron Interpretation for Scene Classification [32.32713349524347]
We propose a knowledge-aware neuron interpretation framework to explain model predictions for image scene classification.
For concept completeness, we derive the core concepts of a scene from the knowledge graph ConceptNet to gauge the completeness of concepts.
For concept fusion, we introduce a knowledge graph-based method known as Concept Filtering, which yields a gain of over 23 percentage points on neuron behaviour for neuron interpretation.
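To make the filtering step concrete, here is a stubbed sketch: keep only candidate concepts sufficiently related to a scene's core concepts. The relatedness scores and threshold are placeholders; the paper uses ConceptNet, whose interface is not reproduced here.

```python
def concept_filter(candidates, core_concepts, relatedness, tau=0.5):
    """relatedness: dict mapping (candidate, core) pairs to scores in [0, 1]
    (a stand-in for knowledge-graph relatedness lookups)."""
    return [c for c in candidates
            if max(relatedness.get((c, k), 0.0) for k in core_concepts) >= tau]

# Usage with made-up scores: 'sand' survives, 'keyboard' is filtered out.
scores = {("sand", "beach"): 0.9, ("keyboard", "beach"): 0.05}
print(concept_filter(["sand", "keyboard"], ["beach"], scores))  # ['sand']
```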
arXiv Detail & Related papers (2024-01-29T01:00:17Z)
- Beyond Concept Bottleneck Models: How to Make Black Boxes Intervenable? [8.391254800873599]
We introduce a method to perform concept-based interventions on pretrained neural networks, which are not interpretable by design.
We formalise the notion of intervenability as a measure of the effectiveness of concept-based interventions and leverage this definition to fine-tune black boxes.
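One plausible reading of such a measure, sketched under assumed interfaces (a concept probe and a label head), is the loss improvement obtained by replacing predicted concepts with ground-truth ones:

```python
import torch
import torch.nn.functional as F

def intervenability(probe, head, x, c_true, y_true):
    """probe: inputs -> concept logits; head: concepts -> class logits.
    Both interfaces are assumptions for illustration."""
    c_pred = torch.sigmoid(probe(x))
    loss_plain = F.cross_entropy(head(c_pred), y_true)
    loss_intervened = F.cross_entropy(head(c_true), y_true)
    return (loss_plain - loss_intervened).item()  # larger = more intervenable
```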
arXiv Detail & Related papers (2024-01-24T16:02:14Z)
- Do Concept Bottleneck Models Respect Localities? [14.77558378567965]
Concept-based methods explain model predictions using human-understandable concepts.
"Localities" involve using only relevant features when predicting a concept's value.
CBMs may not capture localities, even when independent concepts are localised to non-overlapping feature subsets.
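A simple way to probe this property, sketched with assumed names and a random-perturbation scheme: if a concept truly depends only on its designated features, perturbing the rest should not change its prediction.

```python
import torch

def locality_gap(concept_fn, x, relevant_idx, n_trials: int = 16):
    """concept_fn: (B, D) -> (B,) concept activations; relevant_idx: the
    feature indices the concept is supposed to depend on (assumed setup)."""
    base = concept_fn(x)
    mask = torch.ones(x.shape[1], dtype=torch.bool)
    mask[relevant_idx] = False  # features the concept should ignore
    gaps = []
    for _ in range(n_trials):
        x_pert = x.clone()
        x_pert[:, mask] = torch.randn_like(x_pert[:, mask])  # perturb the rest
        gaps.append((concept_fn(x_pert) - base).abs().mean())
    return torch.stack(gaps).mean()  # near zero for a locality-respecting concept
```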
arXiv Detail & Related papers (2024-01-02T16:05:23Z)
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
The non-human-like behaviour of contemporary pre-trained language models (PLMs) is a major factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produce contradictory results.
We propose a practical approach that alleviates this inconsistent behaviour by improving PLMs' awareness of conceptual roles.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- Promises and Pitfalls of Black-Box Concept Learning Models [26.787383014558802]
We show that machine learning models that incorporate concept learning encode information beyond the pre-defined concepts.
Natural mitigation strategies do not fully work, rendering the interpretation of the downstream prediction misleading.
arXiv Detail & Related papers (2021-06-24T21:00:28Z)
- Interpretable Social Anchors for Human Trajectory Forecasting in Crowds [84.20437268671733]
We propose a neural network-based system to predict human trajectory in crowds.
We learn interpretable rule-based intents, and then utilise the expressibility of neural networks to model the scene-specific residual.
Our architecture is tested on the interaction-centric benchmark TrajNet++.
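The intent-plus-residual decomposition might look roughly like this sketch; the discrete intent set, feature layout, and networks are invented placeholders, not the paper's architecture.

```python
import torch
import torch.nn as nn

# Hypothetical discrete intents (unit displacements): stop, straight, left, right.
INTENTS = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.7, 0.7], [0.7, -0.7]])

class AnchoredForecaster(nn.Module):
    def __init__(self, obs_dim: int, hidden: int = 32):
        super().__init__()
        self.intent_scorer = nn.Linear(obs_dim, len(INTENTS))  # interpretable part
        self.residual = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 2))    # scene-specific part

    def forward(self, obs):  # obs: (B, obs_dim) encoded observation
        # Hard intent choice shown for clarity; training would need a soft
        # or supervised selection.
        intent = INTENTS[self.intent_scorer(obs).argmax(dim=-1)]
        return intent + self.residual(obs)  # next displacement = rule + residual
```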
arXiv Detail & Related papers (2021-05-07T09:22:34Z)
- Three Steps to Multimodal Trajectory Prediction: Modality Clustering, Classification and Synthesis [54.249502356251085]
We present a novel insight along with a new prediction framework.
Our proposed method surpasses state-of-the-art works even without introducing social and map information.
arXiv Detail & Related papers (2021-03-14T06:21:03Z)
- Dynamic Inference: A New Approach Toward Efficient Video Action Recognition [69.9658249941149]
Action recognition in videos has achieved great success recently, but it remains a challenging task due to the massive computational cost.
We propose a general dynamic inference idea to improve inference efficiency by leveraging the variation in the distinguishability of different videos.
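A generic early-exit sketch of this idea, with the stage layout and confidence threshold as assumptions: easy clips stop after a cheap stage, hard clips continue through the full network.

```python
import torch
import torch.nn as nn

class EarlyExitVideoNet(nn.Module):
    def __init__(self, stages, heads, threshold: float = 0.9):
        super().__init__()
        self.stages, self.heads = nn.ModuleList(stages), nn.ModuleList(heads)
        self.threshold = threshold  # confidence needed to stop early

    @torch.no_grad()
    def forward(self, x):  # per-clip inference (batch size 1 assumed)
        for stage, head in zip(self.stages, self.heads):
            x = stage(x)
            probs = head(x).softmax(dim=-1)
            if probs.max() >= self.threshold:  # distinguishable enough: exit
                return probs
        return probs  # fall through to the deepest classifier
```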
arXiv Detail & Related papers (2020-02-09T11:09:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences arising from its use.