Interpretability with full complexity by constraining feature information
- URL: http://arxiv.org/abs/2211.17264v1
- Date: Wed, 30 Nov 2022 18:59:01 GMT
- Title: Interpretability with full complexity by constraining feature information
- Authors: Kieran A. Murphy, Dani S. Bassett
- Abstract summary: Interpretability is a pressing issue for machine learning.
We approach interpretability from a new angle: constrain the information about the features without restricting the complexity of the model.
We develop a framework for extracting insight from the spectrum of approximate models.
- Score: 1.52292571922932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Interpretability is a pressing issue for machine learning. Common approaches
to interpretable machine learning constrain interactions between features of
the input, rendering the effects of those features on a model's output
comprehensible but at the expense of model complexity. We approach
interpretability from a new angle: constrain the information about the features
without restricting the complexity of the model. Borrowing from information
theory, we use the Distributed Information Bottleneck to find optimal
compressions of each feature that maximally preserve information about the
output. The learned information allocation, by feature and by feature value,
provides rich opportunities for interpretation, particularly in problems with
many features and complex feature interactions. The central object of analysis
is not a single trained model, but rather a spectrum of models serving as
approximations that leverage variable amounts of information about the inputs.
Information is allocated to features by their relevance to the output, thereby
solving the problem of feature selection by constructing a learned continuum of
feature inclusion-to-exclusion. The optimal compression of each feature -- at
every stage of approximation -- allows fine-grained inspection of the
distinctions among feature values that are most impactful for prediction. We
develop a framework for extracting insight from the spectrum of approximate
models and demonstrate its utility on a range of tabular datasets.
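To make the objective concrete, here is a minimal PyTorch sketch of a distributed variational bottleneck in the spirit of the abstract: one stochastic encoder per feature, a shared decoder over the concatenated compressions, and a per-feature KL term that tracks how much information each feature is allocated. All names (FeatureEncoder, DistributedIB, beta, latent_dim) and architectural choices are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureEncoder(nn.Module):
    """Stochastic encoder for a single feature: x_i -> q(u_i | x_i)."""
    def __init__(self, latent_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 2 * latent_dim))

    def forward(self, x_i):                                    # x_i: (batch, 1)
        mu, log_var = self.net(x_i).chunk(2, dim=-1)
        u = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterization trick
        # KL(q(u|x) || N(0, I)) upper-bounds I(U_i; X_i), the information kept about feature i
        kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum(-1)
        return u, kl

class DistributedIB(nn.Module):
    def __init__(self, n_features, n_classes, latent_dim=2):
        super().__init__()
        self.encoders = nn.ModuleList([FeatureEncoder(latent_dim) for _ in range(n_features)])
        self.decoder = nn.Sequential(
            nn.Linear(n_features * latent_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))

    def forward(self, x):                                      # x: (batch, n_features)
        us, kls = zip(*[enc(x[:, i:i + 1]) for i, enc in enumerate(self.encoders)])
        logits = self.decoder(torch.cat(us, dim=-1))
        return logits, torch.stack(kls, dim=-1)                # per-feature information cost

def dib_loss(model, x, y, beta):
    logits, kls = model(x)
    # beta sets the information constraint: prediction quality (a bound on I(U; Y))
    # is traded against the total information drawn from the features
    return F.cross_entropy(logits, y) + beta * kls.sum(dim=-1).mean()
```

Sweeping beta from large to small would trace the learned continuum of feature inclusion-to-exclusion described above, and the per-feature KL values at each beta form the information allocation used for interpretation.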
Related papers
- A Critical Assessment of Interpretable and Explainable Machine Learning for Intrusion Detection [0.0]
We study the use of overly complex and opaque ML models, unaccounted-for data imbalances and correlated features, inconsistent influential features across different explanation methods, and the implausible utility of explanations.
Specifically, we advise avoiding complex opaque models such as Deep Neural Networks and instead using interpretable ML models such as Decision Trees.
We find that feature-based model explanations are most often inconsistent across different settings.
arXiv Detail & Related papers (2024-07-04T15:35:42Z)
- Explaining Predictive Uncertainty with Information Theoretic Shapley Values [6.49838460559032]
We adapt the popular Shapley value framework to explain various types of predictive uncertainty.
We implement efficient algorithms that perform well in a range of experiments on real and simulated data.
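As a rough, generic illustration of the idea (not necessarily the authors' algorithm), a sampled-permutation Shapley estimator can attribute a model's predictive entropy, rather than its point prediction, to individual features; model_proba, the baseline replacement scheme, and all names below are assumptions.

```python
import numpy as np

def predictive_entropy(p):
    """Entropy of a probability vector, a common measure of predictive uncertainty."""
    return -(p * np.log(p + 1e-12)).sum()

def uncertainty_shapley(model_proba, x, baseline, n_perm=200, rng=None):
    """Monte Carlo Shapley contributions of each feature to predictive entropy.

    Features absent from a coalition are replaced by `baseline` values
    (a cheap approximation; exact conditional expectations are costlier).
    """
    rng = rng or np.random.default_rng(0)
    d = len(x)
    phi = np.zeros(d)
    for _ in range(n_perm):
        order = rng.permutation(d)
        z = baseline.copy()
        prev = predictive_entropy(model_proba(z))
        for i in order:
            z[i] = x[i]                              # add feature i to the coalition
            cur = predictive_entropy(model_proba(z))
            phi[i] += cur - prev                     # marginal contribution
            prev = cur
    return phi / n_perm
```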
arXiv Detail & Related papers (2023-06-09T07:43:46Z)
- On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
- Balancing Explainability-Accuracy of Complex Models [8.402048778245165]
We introduce a new approach for complex models based on correlation impact.
We propose approaches for both scenarios of independent features and dependent features.
We provide an upper bound of the complexity of our proposed approach for the dependent features.
arXiv Detail & Related papers (2023-05-23T14:20:38Z)
- Relational Local Explanations [11.679389861042]
We develop a novel model-agnostic and permutation-based feature attribution algorithm based on relational analysis between input variables.
We are able to gain a broader insight into machine learning model decisions and data.
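The relational algorithm itself is not reproduced here; as a point of reference, the following is the classic permutation-importance baseline that permutation-based attribution methods build on, with predict and metric as assumed interfaces.

```python
import numpy as np

def permutation_importance(predict, metric, X, y, n_repeats=10, rng=None):
    """Score each feature by the average metric drop when its column is shuffled."""
    rng = rng or np.random.default_rng(0)
    base = metric(y, predict(X))
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
            drops[j] += base - metric(y, predict(Xp))
    return drops / n_repeats
```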
arXiv Detail & Related papers (2022-12-23T14:46:23Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- A Functional Information Perspective on Model Interpretation [30.101107406343665]
This work suggests a theoretical framework for model interpretability.
We rely on the log-Sobolev inequality that bounds the functional entropy by the functional Fisher information.
We show that our method surpasses existing interpretability sampling-based methods on various data signals.
arXiv Detail & Related papers (2022-06-12T09:24:45Z)
- An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
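A toy sketch of the bipartite message passing this describes, assuming a binary feature-occurrence matrix and a mean-pooling update (a deliberate simplification of a trained GNN): feature embeddings are refined from the instances they occur in, so a feature unseen at training time can still receive an embedding from observed data.

```python
import numpy as np

def extrapolate_feature_embeddings(X_obs, dim=16, rounds=2, rng=None):
    """X_obs: (n_instances, n_features) binary incidence of feature occurrence."""
    rng = rng or np.random.default_rng(0)
    n, d = X_obs.shape
    feat = rng.normal(size=(d, dim))              # feature-node embeddings
    deg_i = X_obs.sum(1, keepdims=True) + 1e-9    # instance degrees
    deg_f = X_obs.sum(0, keepdims=True).T + 1e-9  # feature degrees
    for _ in range(rounds):
        inst = (X_obs @ feat) / deg_i             # instances aggregate their features
        feat = (X_obs.T @ inst) / deg_f           # features aggregate their instances
    return feat                                   # embeddings consumable by the lower model
```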
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more valuable to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
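A minimal numpy sketch of that mechanism, with sizes, the cost function, and log-domain stabilization all simplified away as assumptions: embed the set, compute an entropic optimal transport plan to a reference, and pool elements according to the plan.

```python
import numpy as np

def sinkhorn(C, eps=0.1, n_iters=50):
    """Entropic OT plan between uniform marginals for cost matrix C (n x p)."""
    K = np.exp(-C / eps)                 # Gibbs kernel (unstabilized, sketch only)
    a = np.full(C.shape[0], 1.0 / C.shape[0])
    b = np.full(C.shape[1], 1.0 / C.shape[1])
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # transport plan T

def ot_embed(Z, R):
    """Pool set elements Z (n x d) against a trainable reference R (p x d)."""
    C = -Z @ R.T                         # negative dot-product cost
    T = sinkhorn(C)
    Tn = T / T.sum(0, keepdims=True)     # normalize columns to convex weights
    return Tn.T @ Z                      # one pooled vector per reference element
```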
arXiv Detail & Related papers (2020-06-22T08:35:58Z)