Tell me why! -- Explanations support learning of relational and causal
structure
- URL: http://arxiv.org/abs/2112.03753v2
- Date: Wed, 8 Dec 2021 12:48:22 GMT
- Title: Tell me why! -- Explanations support learning of relational and causal
structure
- Authors: Andrew K. Lampinen, Nicholas A. Roy, Ishita Dasgupta, Stephanie C. Y.
Chan, Allison C. Tam, James L. McClelland, Chen Yan, Adam Santoro, Neil C.
Rabinowitz, Jane X. Wang, Felix Hill
- Abstract summary: Explanations play a considerable role in human learning, especially in areas that remain major challenges for AI.
We show that reinforcement learning agents might likewise benefit from explanations.
Our results suggest that learning from explanations is a powerful principle that could offer a promising path towards training more robust and general machine learning systems.
- Score: 24.434551113103105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explanations play a considerable role in human learning, especially in areas
that remain major challenges for AI -- forming abstractions, and learning about
the relational and causal structure of the world. Here, we explore whether
reinforcement learning agents might likewise benefit from explanations. We
outline a family of relational tasks that involve selecting an object that is
the odd one out in a set (i.e., unique along one of many possible feature
dimensions). Odd-one-out tasks require agents to reason over multi-dimensional
relationships among a set of objects. We show that agents do not learn these
tasks well from reward alone, but achieve >90% performance when they are also
trained to generate language explaining object properties or why a choice is
correct or incorrect. In further experiments, we show how predicting
explanations enables agents to generalize appropriately from ambiguous,
causally-confounded training, and even to meta-learn to perform experimental
interventions to identify causal structure. We show that explanations help
overcome the tendency of agents to fixate on simple features, and explore which
aspects of explanations make them most beneficial. Our results suggest that
learning from explanations is a powerful principle that could offer a promising
path towards training more robust and general machine learning systems.
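The training signal described above couples the usual RL objective with an auxiliary target: generating language that explains object properties or why a choice is correct. As a rough illustration only (not the authors' implementation), the sketch below adds an explanation-prediction head to a simple policy-gradient agent; the architecture, the single-token explanation targets, and the loss weights are assumptions made for the example.

```python
# Rough sketch (not the paper's code): an agent trained with a policy-gradient
# loss plus an auxiliary cross-entropy loss for predicting ground-truth
# explanation tokens. Sizes, targets, and weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ExplainingAgent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)    # action logits
        self.value_head = nn.Linear(hidden, 1)             # baseline value
        self.explain_head = nn.Linear(hidden, vocab_size)  # explanation-token logits

    def forward(self, obs):
        h = self.encoder(obs)
        return self.policy_head(h), self.value_head(h).squeeze(-1), self.explain_head(h)


def loss_fn(agent, obs, actions, returns, expl_targets, expl_weight=0.5):
    """REINFORCE-with-baseline loss plus auxiliary explanation prediction."""
    logits, values, expl_logits = agent(obs)
    logp = F.log_softmax(logits, dim=-1).gather(1, actions.unsqueeze(1)).squeeze(1)
    advantage = returns - values.detach()
    policy_loss = -(advantage * logp).mean()
    value_loss = F.mse_loss(values, returns)
    # The environment supplies the target explanation (here reduced to a single
    # token, e.g. the feature that makes the chosen object the odd one out).
    explanation_loss = F.cross_entropy(expl_logits, expl_targets)
    return policy_loss + 0.5 * value_loss + expl_weight * explanation_loss
```

In the paper's setting the explanation targets are full sentences provided during training; the point of the sketch is only that explanation prediction enters as an auxiliary loss that shapes the agent's representations, rather than as an input the agent relies on at test time.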
Related papers
- Make LLMs better zero-shot reasoners: Structure-orientated autonomous reasoning [52.83539473110143]
We introduce a novel structure-oriented analysis method to help Large Language Models (LLMs) better understand a question.
To further improve reliability in complex question-answering tasks, we propose a multi-agent reasoning system, Structure-oriented Autonomous Reasoning Agents (SARA).
Extensive experiments verify the effectiveness of the proposed reasoning system. Surprisingly, in some cases, the system even surpasses few-shot methods.
arXiv Detail & Related papers (2024-10-18T05:30:33Z)
- Evaluating the Utility of Model Explanations for Model Development [54.23538543168767]
We evaluate whether explanations can improve human decision-making in practical scenarios of machine learning model development.
To our surprise, we did not find evidence of significant improvement on tasks when users were provided with any of the saliency maps.
These findings suggest caution regarding the usefulness of saliency-based explanations and their potential for misunderstanding.
arXiv Detail & Related papers (2023-12-10T23:13:23Z)
- Learning by Self-Explaining [23.420673675343266]
We introduce a novel workflow in the context of image classification, termed Learning by Self-Explaining (LSX).
LSX utilizes aspects of self-refining AI and human-guided explanatory machine learning.
Our results indicate improvements via Learning by Self-Explaining on several levels.
arXiv Detail & Related papers (2023-09-15T13:41:57Z)
- A Closer Look at Reward Decomposition for High-Level Robotic Explanations [18.019811754800767]
We propose an explainable Q-Map learning framework that combines reward decomposition with abstracted action spaces.
We demonstrate the effectiveness of our framework through quantitative and qualitative analysis of two robotic scenarios.
arXiv Detail & Related papers (2023-04-25T16:01:42Z)
- Complementary Explanations for Effective In-Context Learning [77.83124315634386]
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts.
This work aims to better understand the mechanisms by which explanations are used for in-context learning.
arXiv Detail & Related papers (2022-11-25T04:40:47Z)
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small [68.879023473838]
We present an explanation for how GPT-2 small performs a natural language task called indirect object identification (IOI).
To our knowledge, this investigation is the largest end-to-end attempt at reverse-engineering a natural behavior "in the wild" in a language model.
arXiv Detail & Related papers (2022-11-01T17:08:44Z)
- Rethinking Explainability as a Dialogue: A Practitioner's Perspective [57.87089539718344]
We ask doctors, healthcare professionals, and policymakers about their needs and desires for explanations.
Our study indicates that decision-makers would strongly prefer interactive explanations in the form of natural language dialogues.
Considering these needs, we outline a set of five principles researchers should follow when designing interactive explanations.
arXiv Detail & Related papers (2022-02-03T22:17:21Z)
- Human Interpretation of Saliency-based Explanation Over Text [65.29015910991261]
We study saliency-based explanations over textual data.
We find that people often mis-interpret the explanations.
We propose a method to adjust saliencies based on model estimates of over- and under-perception.
arXiv Detail & Related papers (2022-01-27T15:20:32Z)
- Inherently Explainable Reinforcement Learning in Natural Language [14.117921448623342]
We focus on the task of creating a reinforcement learning agent that is inherently explainable.
This Hierarchically Explainable Reinforcement Learning agent operates in Interactive Fictions, text-based game environments.
Our agent is designed to treat explainability as a first-class citizen.
arXiv Detail & Related papers (2021-12-16T14:24:35Z)
- Are We On The Same Page? Hierarchical Explanation Generation for Planning Tasks in Human-Robot Teaming using Reinforcement Learning [0.0]
We argue that agent-generated explanations should be abstracted to match the level of detail the human teammate desires, so as to keep the recipient's cognitive load manageable.
We show that hierarchical explanations achieved better task performance and behavior interpretability while reducing cognitive load.
arXiv Detail & Related papers (2020-12-22T02:14:52Z)
- What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes [30.056732656973637]
We present a novel form of explanation for Reinforcement Learning, based around the notion of intended outcome.
These explanations describe the outcome an agent is trying to achieve by its actions.
We provide a simple proof that general methods for post-hoc explanations of this nature are impossible in traditional reinforcement learning.
arXiv Detail & Related papers (2020-11-10T12:05:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.