Information based explanation methods for deep learning agents -- with
applications on large open-source chess models
- URL: http://arxiv.org/abs/2309.09702v1
- Date: Mon, 18 Sep 2023 12:08:14 GMT
- Title: Information based explanation methods for deep learning agents -- with
applications on large open-source chess models
- Authors: Patrik Hammersborg and Inga Strümke
- Abstract summary: This work presents the re-implementation of the concept detection methodology applied to AlphaZero.
We obtain results similar to those achieved on AlphaZero, while relying solely on open-source resources.
We also present a novel explainable AI (XAI) method, which is guaranteed to highlight exhaustively and exclusively the information used by the explained model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With large chess-playing neural network models like AlphaZero contesting the
state of the art within the world of computerised chess, two challenges present
themselves: The question of how to explain the domain knowledge internalised by
such models, and the problem that such models are not made openly available.
This work presents the re-implementation of the concept detection methodology
applied to AlphaZero in McGrath et al. (2022), by using large, open-source
chess models with comparable performance. We obtain results similar to those
achieved on AlphaZero, while relying solely on open-source resources. We also
present a novel explainable AI (XAI) method, which is guaranteed to highlight
exhaustively and exclusively the information used by the explained model. This
method generates visual explanations tailored to domains characterised by
discrete input spaces, as is the case for chess. Our presented method has the
desirable property of controlling the information flow between any input vector
and the given model, which in turn provides strict guarantees regarding what
information is used by the trained model during inference. We demonstrate the
viability of our method by applying it to standard 8x8 chess, using large
open-source chess models.
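The abstract describes the proposed XAI method only at a high level. As a rough, hypothetical sketch of the general idea of controlling which parts of a discrete input a model is allowed to see, the snippet below masks squares of an 8x8x12 board encoding and keeps only those squares whose removal changes a toy model's output. The encoding, the toy model, and all function names are illustrative assumptions, not the authors' implementation, and the sketch does not reproduce the formal guarantees claimed in the paper.

```python
# Illustrative sketch only: restrict the information flowing from a discrete
# chess position to a model by zeroing out board squares, then report which
# squares are still needed to reproduce the model's output.
import numpy as np

def sufficient_squares(model, board, tol=1e-3):
    """Greedily drop squares whose removal leaves the model output unchanged.

    Returns the set of (rank, file) squares the output still depends on; a
    masked board revealing only these squares reproduces the output within tol.
    """
    full_output = model(board)
    kept = {(r, f) for r in range(8) for f in range(8)}
    for square in sorted(kept):
        trial = kept - {square}
        view = np.zeros_like(board)
        for r, f in trial:
            view[r, f, :] = board[r, f, :]
        if abs(model(view) - full_output) < tol:
            kept = trial  # output unchanged: this square carried no needed information
    return kept

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    board = rng.integers(0, 2, size=(8, 8, 12)).astype(float)  # toy piece planes
    weights = rng.normal(size=(8, 8, 12))
    toy_model = lambda b: float((weights * b).sum())  # stands in for a chess network
    kept = sufficient_squares(toy_model, board)
    print(len(kept), "of 64 squares needed to reproduce the evaluation")
```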
Related papers
- Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization [67.92796510359595]
Open-domain Question Answering (OpenQA) aims at answering factual questions with an external large-scale knowledge corpus.
It is still unclear how well an OpenQA model can transfer to completely new knowledge domains.
We introduce Corpus-Invariant Tuning (CIT), a simple but effective training strategy, to mitigate the knowledge over-memorization.
arXiv Detail & Related papers (2024-04-02T05:44:50Z)
- Continual Zero-Shot Learning through Semantically Guided Generative Random Walks [56.65465792750822]
We address the challenge of continual zero-shot learning, where unseen information is not provided during training, by leveraging generative modeling.
We propose our learning algorithm that employs a novel semantically guided Generative Random Walk (GRW) loss.
Our algorithm achieves state-of-the-art performance on AWA1, AWA2, CUB, and SUN datasets, surpassing existing CZSL methods by 3-7%.
arXiv Detail & Related papers (2023-08-23T18:10:12Z)
- Recognizing Unseen Objects via Multimodal Intensive Knowledge Graph Propagation [68.13453771001522]
We propose a multimodal intensive ZSL framework that matches regions of images with corresponding semantic embeddings.
We conduct extensive experiments and evaluate our model on large-scale real-world data.
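A minimal sketch of the general region-to-embedding matching idea, not the paper's actual framework: score each unseen class by the best cosine similarity any image region achieves against that class's semantic embedding. The dimensions, scoring rule, and names below are assumptions.

```python
# Hypothetical illustration of matching image regions to class semantic
# embeddings for zero-shot recognition.
import numpy as np

def l2_normalize(x, axis=-1):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-8)

def zero_shot_scores(region_features, class_embeddings):
    """region_features: (num_regions, d); class_embeddings: (num_classes, d).

    Score each class by the best cosine similarity any region achieves
    against that class's semantic embedding."""
    r = l2_normalize(region_features)
    c = l2_normalize(class_embeddings)
    sims = r @ c.T                      # (num_regions, num_classes)
    return sims.max(axis=0)             # best-matching region per class

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    regions = rng.normal(size=(5, 300))            # e.g. 5 detected regions
    unseen_classes = rng.normal(size=(10, 300))    # e.g. word embeddings
    scores = zero_shot_scores(regions, unseen_classes)
    print("predicted unseen class:", int(scores.argmax()))
```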
arXiv Detail & Related papers (2023-06-14T13:07:48Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
We further refine the robustness metric: a model is judged robust only if its performance is consistently accurate across the examples within each clique.
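A minimal sketch of one plausible reading of such a clique-level metric (the exact definition is given in the paper, not here): a clique counts as robust only if the model is correct on every member, and the final score is the fraction of robust cliques. The data layout is an assumption.

```python
# Hypothetical clique-style robustness check.
def clique_robustness(predictions, labels, cliques):
    """predictions, labels: dicts mapping example id -> value.
    cliques: list of lists of example ids sharing the same knowledge."""
    robust = 0
    for clique in cliques:
        if all(predictions[i] == labels[i] for i in clique):
            robust += 1
    return robust / len(cliques)

if __name__ == "__main__":
    preds = {"a": 1, "a2": 1, "b": 0, "b2": 1}
    gold = {"a": 1, "a2": 1, "b": 1, "b2": 1}
    print(clique_robustness(preds, gold, [["a", "a2"], ["b", "b2"]]))  # 0.5
```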
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Foiling Explanations in Deep Neural Networks [0.0]
This paper uncovers a troubling property of explanation methods for image-based DNNs.
We demonstrate how explanations may be arbitrarily manipulated through the use of evolution strategies.
Our novel algorithm successfully manipulates an image in a manner imperceptible to the human eye.
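A minimal sketch of the attack idea under stated assumptions: a simple evolution strategy searches for a tiny perturbation that maximally changes an explanation map while staying within a small per-pixel budget. The explanation function, ES variant, and budget below are illustrative, not the paper's algorithm.

```python
# Hypothetical evolution-strategy search for a perturbation that shifts an
# explanation map while remaining visually negligible.
import numpy as np

def es_foil_explanation(image, explain_fn, eps=0.01, pop=20, steps=50, seed=0):
    rng = np.random.default_rng(seed)
    baseline = explain_fn(image)
    best_delta = np.zeros_like(image)
    best_score = 0.0
    sigma = eps / 4
    for _ in range(steps):
        candidates = best_delta + sigma * rng.normal(size=(pop,) + image.shape)
        candidates = np.clip(candidates, -eps, eps)   # keep the perturbation tiny
        scores = [np.abs(explain_fn(image + d) - baseline).mean()
                  for d in candidates]
        i = int(np.argmax(scores))
        if scores[i] > best_score:
            best_score, best_delta = scores[i], candidates[i]
    return image + best_delta, best_score

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    img = rng.random((8, 8))
    # Toy "explanation": a gradient-magnitude map standing in for a real XAI method.
    toy_explain = lambda x: np.abs(np.gradient(x)[0])
    adv, score = es_foil_explanation(img, toy_explain)
    print("explanation drift:", round(float(score), 4))
```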
arXiv Detail & Related papers (2022-11-27T15:29:39Z)
- Greybox XAI: a Neural-Symbolic learning framework to produce interpretable predictions for image classification [6.940242990198]
Greybox XAI is a framework that composes a DNN and a transparent model through the use of a symbolic Knowledge Base (KB).
We address the problem of the lack of universal criteria for XAI by formalizing what an explanation is.
We show that this new architecture is both accurate and explainable on several datasets.
arXiv Detail & Related papers (2022-09-26T08:55:31Z)
- Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss.
Our model improves the success rate of producing high-quality, valuable explanations compared to previous state-of-the-art methods.
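A minimal sketch of how such an objective might be structured, under the assumption of a latent-space decoder and classifier: each candidate latent perturbation is pushed to flip the prediction toward a target class, stay small, and differ from the other candidates. The weights, interfaces, and toy functions are hypothetical, not the paper's formulation.

```python
# Hypothetical loss combining counterfactual flipping, minimality, and
# diversity over several latent perturbations.
import numpy as np

def counterfactual_loss(perturbations, z, decode, classify, target,
                        w_flip=1.0, w_small=0.1, w_div=0.5):
    """perturbations: (k, latent_dim) candidate latent offsets for code z."""
    flip = size = div = 0.0
    k = len(perturbations)
    for i, d in enumerate(perturbations):
        prob = classify(decode(z + d), target)       # probability of the target class
        flip += -np.log(prob + 1e-8)                 # push the prediction toward target
        size += np.sum(d ** 2)                       # keep each edit minimal
        for j in range(i + 1, k):
            div += -np.sum((d - perturbations[j]) ** 2)  # penalize near-duplicate edits
    return w_flip * flip + w_small * size + w_div * div

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=8)                               # latent code of the input
    deltas = rng.normal(scale=0.1, size=(3, 8))          # three candidate edits
    decode = lambda code: code                           # toy identity decoder
    classify = lambda x, t: 1.0 / (1.0 + np.exp(-x[t]))  # toy per-class score
    print(float(counterfactual_loss(deltas, z, decode, classify, target=2)))
```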
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
- Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation [22.688772441351308]
Methods based on class activation mapping and randomized input sampling have gained great popularity.
However, these attribution methods produce low-resolution, blurry explanation maps that limit their explanatory power.
In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique.
We also propose a layer selection strategy that applies to the whole family of CNN-based models.
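A minimal sketch of the multi-layer aggregation idea: attribution maps taken at several resolutions are upsampled to the input size, normalized, and combined into one map. The upsampling, normalization, and weighting choices below are assumptions, not the paper's exact procedure.

```python
# Hypothetical aggregation of attribution maps gathered from multiple layers.
import numpy as np

def upsample(feature_map, out_size):
    """Nearest-neighbour upsampling of a square 2D map to out_size x out_size."""
    factor = out_size // feature_map.shape[0]
    return np.kron(feature_map, np.ones((factor, factor)))

def aggregate_attribution_maps(layer_maps, input_size, weights=None):
    """layer_maps: list of 2D attribution maps from different layers."""
    if weights is None:
        weights = [1.0] * len(layer_maps)
    combined = np.zeros((input_size, input_size))
    for w, m in zip(weights, layer_maps):
        m = upsample(m, input_size)
        m = (m - m.min()) / (m.max() - m.min() + 1e-8)   # normalize each layer's map
        combined += w * m
    return combined / sum(weights)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    maps = [rng.random((7, 7)), rng.random((14, 14)), rng.random((28, 28))]
    saliency = aggregate_attribution_maps(maps, input_size=28)
    print(saliency.shape)  # (28, 28)
```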
arXiv Detail & Related papers (2020-10-01T20:27:30Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works that aim to attain Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
- LIMEADE: From AI Explanations to Advice Taking [34.581205516506614]
We introduce LIMEADE, the first framework that translates both positive and negative advice into an update to an arbitrary, underlying opaque model.
We show that our method improves accuracy compared to a rigorous baseline on image classification domains.
For the text modality, we apply our framework to a neural recommender system for scientific papers on a public website.
arXiv Detail & Related papers (2020-03-09T18:00:00Z)
- Learning Discrete State Abstractions With Deep Variational Inference [7.273663549650618]
We propose a method for learning approximate bisimulations, a type of state abstraction.
We use a deep neural encoder to map states onto continuous embeddings.
We map these embeddings onto a discrete representation using an action-conditioned hidden Markov model.
arXiv Detail & Related papers (2020-03-09T17:58:27Z)
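A minimal sketch of the overall pipeline shape, with a deliberate simplification: states are encoded to continuous embeddings and then mapped to a small discrete set of abstract states. The paper uses an action-conditioned hidden Markov model for the discretisation step; plain k-means stands in here purely for illustration, and the encoder and dimensions are assumptions.

```python
# Hypothetical sketch: encode states, then discretise the embeddings.
import numpy as np

def encode(states, weight):
    """Toy 'deep' encoder: a linear map with a tanh nonlinearity."""
    return np.tanh(states @ weight)

def kmeans_discretize(embeddings, k=4, iters=20, seed=0):
    """Assign each embedding to one of k abstract states (k-means clustering)."""
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(embeddings[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = embeddings[labels == j].mean(axis=0)
    return labels

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    states = rng.normal(size=(100, 6))    # raw environment states
    weight = rng.normal(size=(6, 3))      # toy encoder parameters
    abstract_states = kmeans_discretize(encode(states, weight))
    print("abstract state of first observation:", int(abstract_states[0]))
```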
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.