Interpreting Neural Policies with Disentangled Tree Representations
- URL: http://arxiv.org/abs/2210.06650v2
- Date: Sun, 12 Nov 2023 19:39:27 GMT
- Title: Interpreting Neural Policies with Disentangled Tree Representations
- Authors: Tsun-Hsuan Wang, Wei Xiao, Tim Seyde, Ramin Hasani, Daniela Rus
- Abstract summary: We study interpretability of compact neural policies through the lens of disentangled representation.
We leverage decision trees to obtain factors of variation for disentanglement in robot learning.
We introduce interpretability metrics that measure disentanglement of learned neural dynamics.
- Score: 58.769048492254555
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The advancement of robots, particularly those functioning in complex
human-centric environments, relies on control solutions that are driven by
machine learning. Understanding how learning-based controllers make decisions
is crucial since robots are often safety-critical systems. This urges a formal
and quantitative understanding of the explanatory factors in the
interpretability of robot learning. In this paper, we aim to study
interpretability of compact neural policies through the lens of disentangled
representation. We leverage decision trees to obtain factors of variation [1]
for disentanglement in robot learning; these encapsulate skills, behaviors, or
strategies toward solving tasks. To assess how well networks uncover the
underlying task dynamics, we introduce interpretability metrics that measure
disentanglement of learned neural dynamics from the perspectives of decision
concentration, mutual information, and modularity. Extensive experimental
analysis consistently demonstrates the connection between interpretability and
disentanglement.
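The abstract names the ingredients of the proposed metrics (decision-tree factors of variation, mutual information, modularity) but the listing carries no implementation detail. Below is a minimal sketch of how such a pipeline could be assembled, assuming rollouts with recorded observations, discretised actions, and hidden-neuron activations; the function names and estimator choices (a shallow scikit-learn tree, one-vs-rest mutual information) are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch only: names and estimators are hypothetical, not the
# paper's released implementation.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import mutual_info_classif


def tree_factors(observations, actions, max_depth=3):
    """Fit a shallow decision tree from observations to (discretised) actions;
    the leaf reached by each observation serves as a behaviour-factor label."""
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=0)
    tree.fit(observations, actions)
    return tree.apply(observations)  # (T,) discrete factor label per time step


def neuron_factor_mi(activations, factor_labels):
    """Estimate mutual information between every hidden neuron and every factor,
    treating each factor as a one-vs-rest binary target; returns (n_neurons, n_factors)."""
    return np.stack(
        [mutual_info_classif(activations, factor_labels == f, discrete_features=False)
         for f in np.unique(factor_labels)],
        axis=1,
    )


def modularity_score(mi):
    """Toy modularity-style summary: fraction of each neuron's total MI that is
    concentrated on its single best-matching factor (1.0 = perfectly modular)."""
    mi = np.asarray(mi)
    return mi.max(axis=1) / (mi.sum(axis=1) + 1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs = rng.normal(size=(500, 4))      # stand-in for rollout observations
    acts = (obs[:, 0] > 0).astype(int)   # stand-in for discretised policy actions
    hidden = rng.normal(size=(500, 8))   # stand-in for recorded neuron activations
    factors = tree_factors(obs, acts)
    print(modularity_score(neuron_factor_mi(hidden, factors)))
```

In this toy setting, a neuron whose activation is informative about exactly one tree leaf scores close to 1, echoing the modularity perspective described in the abstract.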
Related papers
- Trustworthy Conceptual Explanations for Neural Networks in Robot Decision-Making [9.002659157558645]
We introduce a trustworthy explainable robotics technique based on human-interpretable, high-level concepts.
Our proposed technique provides explanations with associated uncertainty scores by matching the neural network's activations with human-interpretable visualizations.
arXiv Detail & Related papers (2024-09-16T21:11:12Z) - Mechanistic Interpretability for AI Safety -- A Review [28.427951836334188]
This review explores mechanistic interpretability.
Mechanistic interpretability could help prevent catastrophic outcomes as AI systems become more powerful and inscrutable.
arXiv Detail & Related papers (2024-04-22T11:01:51Z) - Brain-Inspired Machine Intelligence: A Survey of
Neurobiologically-Plausible Credit Assignment [65.268245109828]
We examine algorithms for conducting credit assignment in artificial neural networks that are inspired or motivated by neurobiology.
We organize the ever-growing set of brain-inspired learning schemes into six general families and consider these in the context of backpropagation of errors.
The results of this review are meant to encourage future developments in neuro-mimetic systems and their constituent learning processes.
arXiv Detail & Related papers (2023-12-01T05:20:57Z) - Incremental procedural and sensorimotor learning in cognitive humanoid
robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z) - Synergistic information supports modality integration and flexible
learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks.
Results show that synergy increases as neural networks learn multiple diverse tasks.
Randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z) - Towards Benchmarking Explainable Artificial Intelligence Methods [0.0]
We use philosophy of science theories as an analytical lens to reveal what can, and more importantly what cannot, be expected from methods that aim to explain decisions made by a neural network.
In a case study, we investigate the performance of a selection of explainability methods over two mundane domains, animals and headgear.
We lay bare that the usefulness of these methods relies on human domain knowledge and our ability to understand, generalise and reason.
arXiv Detail & Related papers (2022-08-25T14:28:30Z) - An Interactive Explanatory AI System for Industrial Quality Control [0.8889304968879161]
We aim to extend the defect detection task towards an interactive human-in-the-loop approach.
We propose an approach for an interactive support system for classifications in an industrial quality control setting.
arXiv Detail & Related papers (2022-03-17T09:04:46Z) - Backprop-Free Reinforcement Learning with Active Neural Generative
Coding [84.11376568625353]
We propose a computational framework for learning action-driven generative models without backpropagation of errors (backprop) in dynamic environments.
We develop an intelligent agent that operates even with sparse rewards, drawing inspiration from the cognitive theory of planning as inference.
The robust performance of our agent offers promising evidence that a backprop-free approach for neural inference and learning can drive goal-directed behavior.
arXiv Detail & Related papers (2021-07-10T19:02:27Z) - Axiom Learning and Belief Tracing for Transparent Decision Making in
Robotics [8.566457170664926]
A robot's ability to provide descriptions of its decisions and beliefs promotes effective collaboration with humans.
Our architecture couples the complementary strengths of non-monotonic logical reasoning, deep learning, and decision-tree induction.
During reasoning and learning, the architecture enables a robot to provide on-demand relational descriptions of its decisions, beliefs, and the outcomes of hypothetical actions.
arXiv Detail & Related papers (2020-10-20T22:09:17Z) - Neuro-symbolic Architectures for Context Understanding [59.899606495602406]
We propose the use of hybrid AI methodology as a framework for combining the strengths of data-driven and knowledge-driven approaches.
Specifically, we inherit the concept of neuro-symbolism as a way of using knowledge bases to guide the learning progress of deep neural networks.
arXiv Detail & Related papers (2020-03-09T15:04:07Z)