Explainable Deep Reinforcement Learning: State of the Art and Challenges
- URL: http://arxiv.org/abs/2301.09937v1
- Date: Tue, 24 Jan 2023 11:41:25 GMT
- Title: Explainable Deep Reinforcement Learning: State of the Art and Challenges
- Authors: George A. Vouros
- Abstract summary: Interpretability, explainability and transparency are key issues in introducing Artificial Intelligence methods in many critical domains.
This article provides a review of state-of-the-art methods for explainable deep reinforcement learning.
- Score: 1.005130974691351
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Interpretability, explainability and transparency are key issues in
introducing Artificial Intelligence methods in many critical domains. This is
important due to ethical concerns and trust issues strongly connected to
reliability, robustness, auditability and fairness, and it has important
consequences for keeping the human in the loop at high levels of automation,
especially in critical decision-making cases, where both the human and the
machine play important roles. While the research community has given much
attention to the explainability of closed (or black) prediction boxes, there is
a tremendous need for explainability of closed-box methods that support agents
acting autonomously in the real world. Reinforcement learning methods, and
especially their deep versions, are such closed-box methods. In this article we
aim to provide a review of state-of-the-art methods for explainable deep
reinforcement learning, taking also into account the needs of human operators,
i.e., of those who take the actual and critical decisions in solving real-world
problems. We provide a formal specification of the deep reinforcement learning
explainability problems, and we identify the necessary components of a general
explainable reinforcement learning framework. Based on these, we provide a
comprehensive review of state-of-the-art methods, categorizing them in classes
according to the paradigm they follow, the interpretable models they use, and
the surface representation of the explanations provided. The article concludes
by identifying open questions and important challenges.
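The survey's three categorization axes (paradigm, interpretable model, surface representation of explanations) can be pictured as a small data structure. The following is a minimal Python sketch of that taxonomy, not the paper's formal specification; every class and member name is an illustrative assumption:

```python
# A minimal sketch (not the paper's formalization) of the three axes along
# which the survey categorizes explainable deep RL methods. All names here
# are illustrative assumptions, not identifiers from the paper.
from dataclasses import dataclass
from enum import Enum, auto


class Paradigm(Enum):
    INTERPRETABLE_BOX_DESIGN = auto()  # train an inherently interpretable policy
    POST_HOC = auto()                  # explain an already-trained closed-box policy


class InterpretableModel(Enum):
    DECISION_TREE = auto()
    RULE_SET = auto()
    SALIENCY_MAP = auto()


class SurfaceRepresentation(Enum):
    VISUAL = auto()      # e.g., highlighted regions of the observation
    TEXTUAL = auto()     # natural-language rationales
    RULE_BASED = auto()  # human-readable condition/action rules


@dataclass
class XRLMethod:
    """One reviewed method, placed along the survey's three axes."""
    name: str
    paradigm: Paradigm
    model: InterpretableModel
    surface: SurfaceRepresentation


# Example: a hypothetical post-hoc method that distills a deep policy into
# a decision tree and shows the operator the resulting rules.
example = XRLMethod(
    name="policy-distillation-to-tree",
    paradigm=Paradigm.POST_HOC,
    model=InterpretableModel.DECISION_TREE,
    surface=SurfaceRepresentation.RULE_BASED,
)
```

Keeping the axes independent mirrors the abstract's framing: the paradigm a method follows, the interpretable model it uses, and the surface form of its explanations can vary separately.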
Related papers
- A Comprehensive Review on Financial Explainable AI [29.229196780505532]
We provide a comparative survey of methods that aim to improve the explainability of deep learning models within the context of finance.
We categorize the collection of explainable AI methods according to their corresponding characteristics.
We review the concerns and challenges of adopting explainable AI methods, together with future directions we deemed appropriate and important.
arXiv Detail & Related papers (2023-09-21T10:30:49Z)
- Causal Reinforcement Learning: A Survey [57.368108154871]
Reinforcement learning is an essential paradigm for solving sequential decision problems under uncertainty.
One of the main obstacles is that reinforcement learning agents lack a fundamental understanding of the world.
Causality offers a notable advantage as it can formalize knowledge in a systematic manner.
arXiv Detail & Related papers (2023-07-04T03:00:43Z)
- Reinforcement Learning with Knowledge Representation and Reasoning: A Brief Survey [24.81327556378729]
Reinforcement learning has achieved tremendous development in recent years, yet it still faces significant obstacles in addressing complex real-life problems.
Recently, there has been a rapidly growing interest in the use of Knowledge Representation and Reasoning.
arXiv Detail & Related papers (2023-04-24T13:35:11Z)
- SoK: Modeling Explainability in Security Analytics for Interpretability, Trustworthiness, and Usability [2.656910687062026]
Interpretability, trustworthiness, and usability are key considerations in high-stake security applications.
Deep learning models behave as black boxes in which identifying important features and factors that led to a classification or a prediction is difficult.
Most explanation methods provide inconsistent explanations, have low fidelity, and are susceptible to adversarial manipulation.
arXiv Detail & Related papers (2022-10-31T15:01:49Z)
- A.I. Robustness: a Human-Centered Perspective on Technological Challenges and Opportunities [8.17368686298331]
Robustness of Artificial Intelligence (AI) systems remains elusive and constitutes a key issue that impedes large-scale adoption.
We introduce three concepts to organize and describe the literature both from a fundamental and applied point of view.
We highlight the central role of humans in evaluating and enhancing AI robustness, considering the necessary knowledge humans can provide.
arXiv Detail & Related papers (2022-10-17T10:00:51Z)
- Individual Explanations in Machine Learning Models: A Case Study on Poverty Estimation [63.18666008322476]
Machine learning methods are being increasingly applied in sensitive societal contexts.
The present case study has two main objectives. First, to expose these challenges and how they affect the use of relevant and novel explanation methods.
And second, to present a set of strategies that mitigate such challenges, as faced when implementing explanation methods in a relevant application domain.
arXiv Detail & Related papers (2021-04-09T01:54:58Z)
- Individual Explanations in Machine Learning Models: A Survey for Practitioners [69.02688684221265]
The use of sophisticated statistical models that influence decisions in domains of high societal relevance is on the rise.
Many governments, institutions, and companies are reluctant to adopt them, as their output is often difficult to explain in human-interpretable ways.
Recently, the academic literature has proposed a substantial amount of methods for providing interpretable explanations to machine learning models.
arXiv Detail & Related papers (2021-04-09T01:46:34Z)
- Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions (a minimal sketch of one such estimate follows this list).
We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems.
This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works directed toward attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
- Fanoos: Multi-Resolution, Multi-Strength, Interactive Explanations for Learned Systems [0.0]
Fanoos is a framework for combining formal verification techniques, search, and user interaction to explore explanations at the desired level of granularity and fidelity.
We demonstrate the ability of Fanoos to produce and adjust the abstractness of explanations in response to user requests on a learned controller for an inverted double pendulum and on a learned CPU usage model.
arXiv Detail & Related papers (2020-06-22T17:35:53Z)
- Neuro-symbolic Architectures for Context Understanding [59.899606495602406]
We propose the use of hybrid AI methodology as a framework for combining the strengths of data-driven and knowledge-driven approaches.
Specifically, we inherit the concept of neuro-symbolism as a way of using knowledge-bases to guide the learning progress of deep neural networks.
arXiv Detail & Related papers (2020-03-09T15:04:07Z)
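As promised in the "Uncertainty as a Form of Transparency" entry above, here is a minimal sketch of one standard way to estimate and communicate predictive uncertainty: the spread across a small model ensemble. This is an illustrative assumption (plain NumPy, hypothetical function names), not code from any of the papers listed:

```python
import numpy as np

def ensemble_predict(models, x):
    """Return a point prediction together with an uncertainty estimate.

    A common recipe: query every member of a model ensemble and report
    the spread of their outputs alongside the mean, so an operator can
    see how much the models disagree before acting on the prediction.
    """
    preds = np.array([m(x) for m in models])  # one prediction per member
    return preds.mean(axis=0), preds.std(axis=0)

# Toy ensemble: three "models" that disagree slightly on the same input.
models = [lambda x, w=w: w * x for w in (0.9, 1.0, 1.1)]
mean, std = ensemble_predict(models, np.array([2.0]))
print(f"prediction = {mean[0]:.2f} +/- {std[0]:.2f}")  # prediction = 2.00 +/- 0.16
```

Communicating the standard deviation alongside the mean is one concrete way of treating uncertainty as transparency: a wide spread signals the operator to trust the prediction less.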