Explainable Object-induced Action Decision for Autonomous Vehicles
- URL: http://arxiv.org/abs/2003.09405v1
- Date: Fri, 20 Mar 2020 17:33:44 GMT
- Title: Explainable Object-induced Action Decision for Autonomous Vehicles
- Authors: Yiran Xu, Xiaoyin Yang, Lihang Gong, Hsuan-Chu Lin, Tz-Ying Wu,
Yunsheng Li, Nuno Vasconcelos
- Abstract summary: A new paradigm is proposed for autonomous driving,
inspired by how humans solve the problem, together with a CNN architecture
that solves it.
- Score: 53.59781838748779
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A new paradigm is proposed for autonomous driving. The new paradigm lies
between the end-to-end and pipelined approaches, and is inspired by how humans
solve the problem. While it relies on scene understanding, only objects that
could give rise to a hazard are considered. These are denoted as
action-inducing, since changes in their state should trigger vehicle actions.
They also define a set of explanations for these actions, which should be
produced jointly with the actions. An extension of the BDD100K dataset,
annotated for a set of 4 actions and 21 explanations, is proposed. A new
multi-task formulation of the problem, which optimizes the accuracy of both
action commands and explanations, is then introduced. A CNN architecture is
finally proposed to solve this problem by combining reasoning about
action-inducing objects and global scene context. Experimental results show that the
requirement of explanations improves the recognition of action-inducing
objects, which in turn leads to better action predictions.
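To make the multi-task formulation concrete, here is a minimal sketch in PyTorch. It is not the authors' released code: the class name ActionExplanationNet, the ResNet-18 scene encoder, the loss weight lam, and the use of binary cross-entropy for both heads are illustrative assumptions, and the paper's full architecture additionally reasons over detected action-inducing objects, which this sketch omits. It only shows the core idea of predicting 4 action commands and 21 explanations jointly from a shared representation, trained with a single combined loss.

# Minimal sketch (illustrative, not the paper's code): jointly predict
# 4 driving actions and 21 explanations from a shared scene feature.
import torch
import torch.nn as nn
import torchvision.models as models

class ActionExplanationNet(nn.Module):  # hypothetical name
    def __init__(self, num_actions=4, num_explanations=21):
        super().__init__()
        backbone = models.resnet18(weights=None)          # stand-in global scene encoder
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])
        feat_dim = backbone.fc.in_features                # 512 for ResNet-18
        # Both heads share the same features, so explanation supervision
        # can also shape the representation used for action prediction.
        self.action_head = nn.Linear(feat_dim, num_actions)
        self.explanation_head = nn.Linear(feat_dim, num_explanations)

    def forward(self, images):
        feats = self.encoder(images).flatten(1)           # (B, feat_dim)
        return self.action_head(feats), self.explanation_head(feats)

def multi_task_loss(action_logits, expl_logits, action_targets, expl_targets, lam=1.0):
    # Combined objective: action loss plus lam-weighted explanation loss.
    bce = nn.BCEWithLogitsLoss()                          # both outputs treated as multi-label
    return bce(action_logits, action_targets) + lam * bce(expl_logits, expl_targets)

# Example usage with a dummy batch of two scene images.
model = ActionExplanationNet()
images = torch.randn(2, 3, 224, 224)
action_logits, expl_logits = model(images)
loss = multi_task_loss(action_logits, expl_logits,
                       torch.randint(0, 2, (2, 4)).float(),
                       torch.randint(0, 2, (2, 21)).float())
print(action_logits.shape, expl_logits.shape, loss.item())

Binary cross-entropy is chosen here because several actions (and several explanations) can be valid for the same scene; if the action set were treated as mutually exclusive, a softmax cross-entropy head would be the natural substitute.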
Related papers
- ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions [66.20773952864802]
We develop a dataset consisting of 8.5k images and 59.3k inferences about actions grounded in those images.
We propose ActionCOMET, a framework to discern knowledge present in language models specific to the provided visual input.
arXiv Detail & Related papers (2024-10-17T15:22:57Z)
- Towards Explainable Motion Prediction using Heterogeneous Graph Representations [3.675875935838632]
Motion prediction systems aim to capture the future behavior of traffic scenarios, enabling autonomous vehicles to perform safe and efficient planning.
GNN-based approaches have recently gained attention as they are well suited to naturally model these interactions.
In this work, we aim to improve the explainability of motion prediction systems by using different approaches.
arXiv Detail & Related papers (2022-12-07T17:43:42Z)
- OCTET: Object-aware Counterfactual Explanations [29.532969342297086]
We propose an object-centric framework for counterfactual explanation generation.
Our method, inspired by recent generative modeling works, encodes the query image into a latent space that is structured to ease object-level manipulations.
We conduct a set of experiments on counterfactual explanation benchmarks for driving scenes, and we show that our method can be adapted beyond classification.
arXiv Detail & Related papers (2022-11-22T16:23:12Z)
- H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions [62.510951695174604]
"Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR) is a probabilistic generative framework that generates hypotheses about how objects articulate given input observations.
We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework.
We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
arXiv Detail & Related papers (2022-10-22T18:39:33Z)
- Video Action Detection: Analysing Limitations and Challenges [70.01260415234127]
We analyze existing datasets on video action detection and discuss their limitations.
We perform a bias study which analyzes a key property differentiating videos from static images: the temporal aspect.
Such extreme experiments show the existence of biases that have crept into existing methods in spite of careful modeling.
arXiv Detail & Related papers (2022-04-17T00:42:14Z)
- Suspected Object Matters: Rethinking Model's Prediction for One-stage Visual Grounding [93.82542533426766]
We propose a Suspected Object Transformation mechanism (SOT) to encourage the target object selection among the suspected ones.
SOT can be seamlessly integrated into existing CNN and Transformer-based one-stage visual grounders.
Extensive experiments demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-10T06:41:07Z)
- Look Wide and Interpret Twice: Improving Performance on Interactive Instruction-following Tasks [29.671268927569063]
Recent studies have tackled the problem using ALFRED, a well-designed dataset for the task.
This paper proposes a new method, which outperforms the previous methods by a large margin.
arXiv Detail & Related papers (2021-06-01T16:06:09Z)
- Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z)
- Object and Relation Centric Representations for Push Effect Prediction [18.990827725752496]
Pushing is an essential non-prehensile manipulation skill used for tasks ranging from pre-grasp manipulation to scene rearrangement.
We propose a graph neural network based framework for effect prediction and parameter estimation of pushing actions.
Our framework is validated both in real and simulated environments containing different shaped multi-part objects connected via different types of joints and objects with different masses.
arXiv Detail & Related papers (2021-02-03T15:09:12Z)
- Algorithmic Recourse: from Counterfactual Explanations to Interventions [16.9979815165902]
We argue that counterfactual explanations inform an individual where they need to get to, but not how to get there.
Instead, we propose a shift of paradigm from recourse via nearest counterfactual explanations to recourse through minimal interventions.
arXiv Detail & Related papers (2020-02-14T22:49:42Z)