Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop
- URL: http://arxiv.org/abs/2404.10906v1
- Date: Tue, 16 Apr 2024 20:53:17 GMT
- Title: Towards a Research Community in Interpretable Reinforcement Learning: the InterpPol Workshop
- Authors: Hector Kohler, Quentin Delfosse, Paul Festor, Philippe Preux
- Abstract summary: Embracing the pursuit of intrinsically explainable reinforcement learning raises crucial questions.
Should explainable and interpretable agents be developed outside of domains where transparency is imperative?
How can we rigorously define and measure interpretability in policies, without user studies?
- Score: 7.630967411418269
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Embracing the pursuit of intrinsically explainable reinforcement learning raises crucial questions: what distinguishes explainability from interpretability? Should explainable and interpretable agents be developed outside of domains where transparency is imperative? What advantages do interpretable policies offer over neural networks? How can we rigorously define and measure interpretability in policies, without user studies? What reinforcement learning paradigms are the most suited to develop interpretable agents? Can Markov Decision Processes integrate interpretable state representations? In addition to motivating an Interpretable RL community centered around the aforementioned questions, we propose the first venue dedicated to Interpretable RL: the InterpPol Workshop.
Related papers
- Understanding Understanding: A Pragmatic Framework Motivated by Large Language Models [13.279760256875127]
In Turing-test fashion, the framework is based solely on the agent's performance, and specifically on how well it answers questions.
We show how high confidence can be achieved via random sampling and the application of probabilistic confidence bounds.
arXiv Detail & Related papers (2024-06-16T13:37:08Z)
- Crafting Interpretable Embeddings by Asking LLMs Questions [89.49960984640363]
Large language models (LLMs) have rapidly improved text embeddings for a growing array of natural-language processing tasks.
We introduce question-answering embeddings (QA-Emb), embeddings where each feature represents an answer to a yes/no question asked to an LLM.
We use QA-Emb to flexibly generate interpretable models for predicting fMRI voxel responses to language stimuli.
arXiv Detail & Related papers (2024-05-26T22:30:29Z)
- Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs [58.620269228776294]
We propose a task-agnostic framework for resolving ambiguity by asking users clarifying questions.
We evaluate systems across three NLP applications: question answering, machine translation and natural language inference.
We find that intent-sim is robust, demonstrating improvements across a wide range of NLP tasks and LMs.
arXiv Detail & Related papers (2023-11-16T00:18:50Z)
- Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction [0.45060992929802207]
Self-supervised learning (SSL) has emerged as a promising paradigm for learning flexible speech representations from unlabeled data.
This study provides an empirical analysis of Barlow Twins (BT), an SSL technique inspired by theories of redundancy reduction in human perception.
arXiv Detail & Related papers (2023-09-07T10:23:59Z)
- Abstracting Concept-Changing Rules for Solving Raven's Progressive Matrix Problems [54.26307134687171]
Raven's Progressive Matrix (RPM) is a classic test to realize such ability in machine intelligence by selecting from candidates.
Recent studies suggest that solving RPM in an answer-generation way boosts a more in-depth understanding of rules.
We propose a deep latent variable model for Concept-changing Rule ABstraction (CRAB) by learning interpretable concepts and parsing concept-changing rules in the latent space.
arXiv Detail & Related papers (2023-07-15T07:16:38Z)
- DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z)
- Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction [23.552659248243806]
We introduce Neurally gUided Differentiable loGic policiEs (NUDGE).
NUDGE exploits trained neural network-based agents to guide the search of candidate-weighted logic rules, then uses differentiable logic to train the logic agents.
Our experimental evaluation demonstrates that NUDGE agents can induce interpretable and explainable policies while outperforming purely neural ones and showing good flexibility to environments of different initial states and problem sizes.
arXiv Detail & Related papers (2023-06-02T10:59:44Z)
- A Survey on Interpretable Reinforcement Learning [28.869513255570077]
This survey provides an overview of various approaches to achieve higher interpretability in reinforcement learning (RL).
We distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy).
We argue that interpretable RL may embrace different facets: interpretable inputs, interpretable (transition/reward) models, and interpretable decision-making.
arXiv Detail & Related papers (2021-12-24T17:26:57Z)
- i-Algebra: Towards Interactive Interpretability of Deep Neural Networks [41.13047686374529]
We present i-Algebra, a first-of-its-kind interactive framework for interpreting deep neural networks (DNNs).
At its core is a library of atomic, composable operators, which explain model behaviors at varying input granularity, during different inference stages, and from distinct interpretation perspectives.
We conduct user studies in a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all demonstrating its promising usability.
arXiv Detail & Related papers (2021-01-22T19:22:57Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent work aimed at attaining Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight on the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
- Emergence of Pragmatics from Referential Game between Theory of Mind Agents [64.25696237463397]
We propose an algorithm with which agents can spontaneously learn the ability to "read between the lines" without any explicit hand-designed rules.
We integrate the theory of mind (ToM) in a cooperative multi-agent pedagogical situation and propose an adaptive reinforcement learning (RL) algorithm to develop a communication protocol.
arXiv Detail & Related papers (2020-01-21T19:37:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.