Tell me why! -- Explanations support learning of relational and causal
structure
- URL: http://arxiv.org/abs/2112.03753v2
- Date: Wed, 8 Dec 2021 12:48:22 GMT
- Title: Tell me why! -- Explanations support learning of relational and causal
structure
- Authors: Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y.
Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C.
Rabinowitz, Jane X. Wang, Felix Hill
- Abstract summary: Explanations play a considerable role in human learning, especially in areas that remain major challenges for AI.
We show that reinforcement learning agents might likewise benefit from explanations.
Our results suggest that learning from explanations is a powerful principle that could offer a promising path towards training more robust and general machine learning systems.
- Score: 24.434551113103105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explanations play a considerable role in human learning, especially in areas
that remain major challenges for AI -- forming abstractions, and learning about
the relational and causal structure of the world. Here, we explore whether
reinforcement learning agents might likewise benefit from explanations. We
outline a family of relational tasks that involve selecting an object that is
the odd one out in a set (i.e., unique along one of many possible feature
dimensions). Odd-one-out tasks require agents to reason over multi-dimensional
relationships among a set of objects. We show that agents do not learn these
tasks well from reward alone, but achieve >90% performance when they are also
trained to generate language explaining object properties or why a choice is
correct or incorrect. In further experiments, we show how predicting
explanations enables agents to generalize appropriately from ambiguous,
causally-confounded training, and even to meta-learn to perform experimental
interventions to identify causal structure. We show that explanations help
overcome the tendency of agents to fixate on simple features, and explore which
aspects of explanations make them most beneficial. Our results suggest that
learning from explanations is a powerful principle that could offer a promising
path towards training more robust and general machine learning systems.
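The training signal described above couples the usual RL objective with an auxiliary target: generating language that explains object properties or why a choice is correct. As a rough illustration only (not the authors' implementation), the sketch below adds an explanation-prediction head to a simple policy-gradient agent; the architecture, the single-token explanation targets, and the loss weights are assumptions made for the example.

```python
# Rough sketch (not the paper's code): an agent trained with a policy-gradient
# loss plus an auxiliary cross-entropy loss for predicting ground-truth
# explanation tokens. Sizes, targets, and weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExplainingAgent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)    # action logits
        self.value_head = nn.Linear(hidden, 1)             # baseline value
        self.explain_head = nn.Linear(hidden, vocab_size)  # explanation-token logits

    def forward(self, obs):
        h = self.encoder(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1), self.explain_head(h)


def loss_fn(agent, obs, actions, returns, expl_targets, expl_weight=0.5):
    """REINFORCE-with-baseline loss plus auxiliary explanation prediction."""
    logits, values, expl_logits = agent(obs)
    logp = F.log_softmax(logits, dim=-1).gather(1, actions.unsqueeze(1)).squeeze(1)
    advantage = returns - values.detach()
    policy_loss = -(advantage * logp).mean()
    value_loss = F.mse_loss(values, returns)
    # The environment supplies the target explanation (here reduced to a single
    # token, e.g. the feature that makes the chosen object the odd one out).
    explanation_loss = F.cross_entropy(expl_logits, expl_targets)
    return policy_loss + 0.5 * value_loss + expl_weight * explanation_loss
```

In the paper's setting the explanation targets are full sentences provided during training; the point of the sketch is only that explanation prediction enters as an auxiliary loss that shapes the agent's representations, rather than as an input the agent relies on at test time.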
Related papers
- Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning [52.83539473110143]
We introduce a novel structure-oriented analysis method to help Large Language Models (LLMs) better understand a question.
To further improve reliability in complex question-answering tasks, we propose a multi-agent reasoning system, Structure-oriented Autonomous Reasoning Agents (SARA).
Extensive experiments verify the effectiveness of the proposed reasoning system. Surprisingly, in some cases, the system even surpasses few-shot methods.
arXiv Detail & Related papers (2024-10-18T05:30:33Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness of saliency-based explanations and their potential for misunderstanding.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Learning by Self-Explaining [23.420673675343266]
We introduce a novel workflow in the context of image classification, termed Learning by Self-Explaining (LSX).
LSX utilizes aspects of self-refining AI and human-guided explanatory machine learning.
Our results indicate improvements via Learning by Self-Explaining on several levels.
arXiv Detail & Related papers (2023-09-15T13:41:57Z)
- A Closer Look at Reward Decomposition for High-Level Robotic Explanations [18.019811754800767]
We propose an explainable Q-Map learning framework that combines reward decomposition with abstracted action spaces.
We demonstrate the effectiveness of our framework through quantitative and qualitative analysis of two robotic scenarios.
arXiv Detail & Related papers (2023-04-25T16:01:42Z)
- Complementary Explanations for Effective In-Context Learning [77.83124315634386]
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts.
This work aims to better understand the mechanisms by which explanations are used for in-context learning.
arXiv Detail & Related papers (2022-11-25T04:40:47Z)
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small [68.879023473838]
We present an explanation for how GPT-2 small performs a natural language task called indirect object identification (IOI).
To our knowledge, this investigation is the largest end-to-end attempt at reverse-engineering a natural behavior "in the wild" in a language model.
arXiv Detail & Related papers (2022-11-01T17:08:44Z)
- Rethinking Explainability as a Dialogue: A Practitioner's Perspective [57.87089539718344]
We ask doctors, healthcare professionals, and policymakers about their needs and desires for explanations.
Our study indicates that decision-makers would strongly prefer interactive explanations in the form of natural language dialogues.
Considering these needs, we outline a set of five principles researchers should follow when designing interactive explanations.
arXiv Detail & Related papers (2022-02-03T22:17:21Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often mis-interpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Inherently Explainable Reinforcement Learning in Natural Language [14.117921448623342]
We focus on the task of creating a reinforcement learning agent that is inherently explainable.
This Hierarchically Explainable Reinforcement Learning agent operates in Interactive Fictions, text-based game environments.
Our agent is designed to treat explainability as a first-class citizen.
arXiv Detail & Related papers (2021-12-16T14:24:35Z)
- Are We On The Same Page? Hierarchical Explanation Generation for Planning Tasks in Human-Robot Teaming using Reinforcement Learning [0.0]
We argue that agent-generated explanations should be abstracted to match the level of detail the human teammate desires, so as to keep the recipient's cognitive load manageable.
We show that hierarchical explanations achieved better task performance and behavior interpretability while reducing cognitive load.
arXiv Detail & Related papers (2020-12-22T02:14:52Z)
- What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes [30.056732656973637]
We present a novel form of explanation for Reinforcement Learning, based around the notion of intended outcome.
These explanations describe the outcome an agent is trying to achieve by its actions.
We provide a simple proof that general methods for post-hoc explanations of this nature are impossible in traditional reinforcement learning.
arXiv Detail & Related papers (2020-11-10T12:05:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.