Probably Approximately Correct Explanations of Machine Learning Models
via Syntax-Guided Synthesis
- URL: http://arxiv.org/abs/2009.08770v1
- Date: Fri, 18 Sep 2020 12:10:49 GMT
- Title: Probably Approximately Correct Explanations of Machine Learning Models
via Syntax-Guided Synthesis
- Authors: Daniel Neider and Bishwamittra Ghosh
- Abstract summary: We propose a novel approach to understanding the decision making of complex machine learning models (e.g., deep neural networks) using a combination of probably approximately correct learning (PAC) and a logic inference methodology called syntax-guided synthesis (SyGuS).
We prove that our framework produces explanations that, with high probability, make only a few errors, and show empirically that it is effective in generating small, human-interpretable explanations.
- Score: 6.624726878647541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel approach to understanding the decision making of complex
machine learning models (e.g., deep neural networks) using a combination of
probably approximately correct learning (PAC) and a logic inference methodology
called syntax-guided synthesis (SyGuS). We prove that our framework produces
explanations that, with high probability, make only a few errors, and show
empirically that it is effective in generating small, human-interpretable
explanations.
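The abstract describes the recipe only at a high level (PAC-style sampling combined with syntax-guided search over a grammar of candidate explanations), so the Python sketch below is purely illustrative and assumes details not stated in the abstract: a stand-in black_box model, a toy grammar of literal conjunctions/disjunctions, and a smallest-first enumeration. It labels random samples with the black-box model and returns the smallest candidate formula consistent with them; the classical PAC argument for consistent learners over a finite hypothesis class is what bounds, with high probability, the error of such a formula on unseen inputs.

```python
# Illustrative sketch only (assumptions, not the paper's implementation):
# sample-and-synthesize a small Boolean explanation of a black-box classifier.
import itertools
import random

NUM_FEATURES = 3

def black_box(x):
    """Hypothetical stand-in for a complex model (e.g., a DNN) over Boolean features."""
    return x[0] and not x[2]

def candidate_formulas(max_size):
    """Enumerate candidate explanations smallest-first, SyGuS-style:
    conjunctions or disjunctions of up to `max_size` literals."""
    literals = [(i, pol) for i in range(NUM_FEATURES) for pol in (True, False)]
    for size in range(1, max_size + 1):
        for lits in itertools.combinations(literals, size):
            for op in (all, any):
                yield lits, op

def evaluate(formula, x):
    lits, op = formula
    return op(x[i] if pol else not x[i] for i, pol in lits)

def synthesize_explanation(num_samples=200, max_size=3, seed=0):
    rng = random.Random(seed)
    samples = [tuple(rng.choice((True, False)) for _ in range(NUM_FEATURES))
               for _ in range(num_samples)]
    labels = [black_box(x) for x in samples]
    # Smallest formula consistent with all drawn samples; a PAC-style argument
    # bounds its error on unseen inputs with high probability over the sample.
    for lits, op in candidate_formulas(max_size):
        if all(evaluate((lits, op), x) == y for x, y in zip(samples, labels)):
            name = "AND" if op is all else "OR"
            return name, [f"x{i}" if pol else f"not x{i}" for i, pol in lits]
    return None

print(synthesize_explanation())  # e.g. ('AND', ['x0', 'not x2'])
```

In the paper's setting, the grammar, the sample size, and the error/confidence parameters would come from the SyGuS specification and the PAC analysis rather than being fixed ad hoc as in this toy example.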
Related papers
- Reasoning with trees: interpreting CNNs using hierarchies [3.6763102409647526]
We introduce a framework that uses hierarchical segmentation techniques for faithful and interpretable explanations of Convolutional Neural Networks (CNNs).
Our method constructs model-based hierarchical segmentations that remain faithful to the model's reasoning.
Experiments show that our framework, xAiTrees, delivers highly interpretable and faithful model explanations.
arXiv Detail & Related papers (2024-06-19T06:45:19Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate the limited interpretability of Transformer-based similarity models by leveraging improved explanation methods.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Learning with Explanation Constraints [91.23736536228485]
We provide a learning theoretic framework to analyze how explanations can improve the learning of our models.
We demonstrate the benefits of our approach over a large array of synthetic and real-world experiments.
arXiv Detail & Related papers (2023-03-25T15:06:47Z)
- Understanding Post-hoc Explainers: The Case of Anchors [6.681943980068051]
We present a theoretical analysis of Anchors, a rule-based interpretability method that highlights a small set of words to explain a text classifier's decision.
After formalizing its algorithm and providing useful insights, we demonstrate mathematically that Anchors produces meaningful results; a toy sketch of the anchoring idea appears after this list.
arXiv Detail & Related papers (2023-03-15T17:56:34Z)
- Explanations from Large Language Models Make Small Reasoners Better [61.991772773700006]
We show that our method can consistently and significantly outperform finetuning baselines across different settings.
As a side benefit, human evaluation shows that our method can generate high-quality explanations to justify its predictions.
arXiv Detail & Related papers (2022-10-13T04:50:02Z)
- Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
- Towards Analogy-Based Explanations in Machine Learning [3.1410342959104725]
We argue that analogical reasoning is no less interesting from an interpretability and explainability point of view than existing approaches.
An analogy-based approach is a viable alternative to existing approaches in the realm of explainable AI and interpretable machine learning.
arXiv Detail & Related papers (2020-05-23T06:41:35Z)
- Ontology-based Interpretable Machine Learning for Textual Data [35.01650633374998]
We introduce a novel interpretation framework that learns an interpretable model based on a sampling technique to explain prediction models.
To narrow down the search space for explanations, we design a learnable anchor algorithm.
A set of rules is further introduced for combining the learned interpretable representations with anchors to generate comprehensible explanations.
arXiv Detail & Related papers (2020-04-01T02:51:57Z)
- A general framework for scientifically inspired explanations in AI [76.48625630211943]
We instantiate the concept of the structure of scientific explanation as the theoretical underpinning for a general framework in which explanations for AI systems can be implemented.
This framework aims to provide the tools to build a "mental model" of any AI system, so that interaction with the user can provide information on demand and come closer to the nature of human-made explanations.
arXiv Detail & Related papers (2020-03-02T10:32:21Z)
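The Anchors entry above is summarized in a single line, so the following toy sketch (hypothetical stand-ins, not the original Anchors implementation and not the analysis from that paper) illustrates the underlying idea: a set of words is a good anchor if keeping those words and randomly dropping the others rarely changes the classifier's prediction.

```python
# Toy sketch (hypothetical stand-ins, not the Anchors implementation):
# estimate how strongly a word set "anchors" a text classifier's prediction.
import random

def classify(tokens):
    """Made-up keyword classifier standing in for a real text model."""
    return "positive" if "great" in tokens else "negative"

def anchor_precision(tokens, anchor, num_perturbations=500, seed=0):
    """Fraction of perturbations (anchor words kept, other words dropped with
    probability 0.5) on which the prediction matches the original one."""
    rng = random.Random(seed)
    original = classify(tokens)
    hits = 0
    for _ in range(num_perturbations):
        perturbed = [t for t in tokens if t in anchor or rng.random() < 0.5]
        hits += classify(perturbed) == original
    return hits / num_perturbations

text = "the plot was great and the acting was solid".split()
print(anchor_precision(text, anchor={"great"}))   # ~1.0: a strong anchor
print(anchor_precision(text, anchor={"acting"}))  # ~0.5: a weak anchor
```

An anchor-style explainer would then search for a small word set whose estimated precision exceeds a chosen threshold.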
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.