Posthoc Interpretation via Quantization
- URL: http://arxiv.org/abs/2303.12659v2
- Date: Sat, 27 May 2023 12:26:23 GMT
- Title: Posthoc Interpretation via Quantization
- Authors: Francesco Paissan, Cem Subakan, Mirco Ravanelli
- Abstract summary: We introduce a new approach, called Posthoc Interpretation via Quantization (PIQ), for interpreting decisions made by trained classifiers.
Our method utilizes vector quantization to transform the representations of a classifier into a discrete, class-specific latent space.
Our model formulation also enables learning concepts by incorporating the supervision of pretrained annotation models.
- Score: 9.510336895838703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a new approach, called Posthoc Interpretation via
Quantization (PIQ), for interpreting decisions made by trained classifiers. Our
method utilizes vector quantization to transform the representations of a
classifier into a discrete, class-specific latent space. The class-specific
codebooks act as a bottleneck that forces the interpreter to focus on the parts
of the input data deemed relevant by the classifier for making a prediction.
Our model formulation also enables learning concepts by incorporating the
supervision of pretrained annotation models such as state-of-the-art image
segmentation models. We evaluated our method through quantitative and
qualitative studies involving black-and-white images, color images, and audio.
As a result of these studies, we found that PIQ generates interpretations that
participants in our user studies understood more easily than those produced by
several other interpretation methods in the literature.
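The listing contains no code, so here is a minimal, hedged sketch of the mechanism the abstract describes: snapping a classifier latent onto the nearest entry of a class-specific codebook with a straight-through estimator. The module name, shapes, and commitment loss follow standard VQ practice and are our assumptions, not the authors' implementation.
```python
import torch
import torch.nn as nn

class ClassSpecificVQ(nn.Module):
    """Sketch: one learned codebook per class; each latent vector is
    snapped to the nearest code of its predicted class's codebook."""

    def __init__(self, num_classes: int, codebook_size: int, dim: int):
        super().__init__()
        # one (codebook_size x dim) codebook per class
        self.codebooks = nn.Parameter(torch.randn(num_classes, codebook_size, dim))

    def forward(self, z: torch.Tensor, class_idx: torch.Tensor):
        # z: (batch, dim) latents from the frozen classifier
        # class_idx: (batch,) class predicted by the classifier
        books = self.codebooks[class_idx]                      # (batch, K, dim)
        dists = torch.cdist(z.unsqueeze(1), books).squeeze(1)  # (batch, K)
        codes = books[torch.arange(z.size(0)), dists.argmin(dim=1)]
        # straight-through estimator: quantize forward, copy gradients back
        z_q = z + (codes - z).detach()
        commit_loss = (codes.detach() - z).pow(2).mean()
        return z_q, commit_loss
```
In a PIQ-style interpreter, z_q would then be decoded into the interpretation, so only information that survives the class-specific bottleneck can appear in it.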
Related papers
- Interpretable Network Visualizations: A Human-in-the-Loop Approach for Post-hoc Explainability of CNN-based Image Classification [5.087579454836169]
State-of-the-art explainability methods generate saliency maps to show where a specific class is identified.
We introduce a post-hoc method that explains the entire feature extraction process of a Convolutional Neural Network.
We also show an approach to generate global explanations by aggregating labels across multiple images.
arXiv Detail & Related papers (2024-05-06T09:21:35Z)
- Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective [68.20531518525273]
We take a closer look into existing self-supervised methods of speech from an information-theoretic perspective.
We use linear probes to estimate the mutual information between the target information and learned representations.
We explore the potential of evaluating representations in a self-supervised fashion, where we estimate the mutual information between different parts of the data without using any labels.
arXiv Detail & Related papers (2024-01-16T21:13:22Z)
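A minimal sketch of the linear-probe recipe above: since I(Z;Y) >= H(Y) - CE(probe), the cross-entropy of a trained linear probe yields a lower bound on the mutual information. The function assumes integer labels and frozen representation arrays; it is an illustration, not the paper's code.
```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

def probe_mi_lower_bound(z_train, y_train, z_test, y_test):
    """I(Z;Y) >= H(Y) - CE(probe), in nats; a tighter probe raises the bound."""
    probe = LogisticRegression(max_iter=1000).fit(z_train, y_train)
    ce = log_loss(y_test, probe.predict_proba(z_test), labels=probe.classes_)
    # empirical label entropy H(Y); y_test assumed to be non-negative ints
    p = np.bincount(y_test) / len(y_test)
    h_y = -np.sum(p[p > 0] * np.log(p[p > 0]))
    return h_y - ce
```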
- TExplain: Explaining Learned Visual Features via Pre-trained (Frozen) Language Models [14.019349267520541]
We propose a novel method that leverages the capabilities of language models to interpret the learned features of pre-trained image classifiers.
Our approach generates a vast number of sentences to explain the features learned by the classifier for a given image.
Our method is the first to use the words that occur frequently in these sentences to provide insights into the decision-making process behind a visual representation.
arXiv Detail & Related papers (2023-09-01T20:59:46Z)
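A hedged sketch of the frequent-word summarization step described above; generate_sentence stands in for a hypothetical feature-to-text sampler built on a frozen language model and is not a real API.
```python
from collections import Counter
import re

def summarize_feature(feature, generate_sentence, n_samples=100, top_k=10):
    # generate_sentence: hypothetical callable mapping a visual feature to
    # one sampled natural-language sentence (e.g. via a frozen LM).
    sentences = [generate_sentence(feature) for _ in range(n_samples)]
    counts = Counter(w for s in sentences
                     for w in re.findall(r"[a-z]+", s.lower()))
    return counts.most_common(top_k)  # frequent words characterize the feature
```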
- A Test Statistic Estimation-based Approach for Establishing Self-interpretable CNN-based Binary Classifiers [7.424003880270276]
Post-hoc interpretability methods have the limitation that they can produce plausible but different interpretations.
Unlike traditional post-hoc interpretability methods, the proposed method is self-interpretable and quantitative.
arXiv Detail & Related papers (2023-03-13T05:51:35Z)
- Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing [97.70862116338554]
We investigate the problem of measuring interpretability of self-supervised representations.
We formulate this as estimating the mutual information between the representation and a space of manually labelled concepts.
We use our method to evaluate a large number of self-supervised representations, ranking them by interpretability.
arXiv Detail & Related papers (2022-09-07T16:18:50Z)
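The measurement above can be approximated with off-the-shelf tools: quantize the representation and compute the discrete mutual information with the concept labels. K-means here is an assumed stand-in for the paper's quantizer, not its actual pipeline.
```python
from sklearn.cluster import KMeans
from sklearn.metrics import mutual_info_score

def quantized_probe_mi(representations, concept_labels, n_codes=128, seed=0):
    """Quantize continuous representations into discrete codes, then score
    interpretability as MI (nats) between code and concept label."""
    codes = KMeans(n_clusters=n_codes, n_init=10,
                   random_state=seed).fit_predict(representations)
    return mutual_info_score(concept_labels, codes)
```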
- A Unified Understanding of Deep NLP Models for Text Classification [88.35418976241057]
We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification.
The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample.
A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples.
arXiv Detail & Related papers (2022-06-19T08:55:07Z)
- Autoregressive Co-Training for Learning Discrete Speech Representations [19.400428010647573]
We consider a generative model with discrete latent variables that learns a discrete representation for speech.
We find that the proposed approach learns a discrete representation that is highly correlated with phonetic units.
arXiv Detail & Related papers (2022-03-29T18:17:18Z)
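A generic sketch of the discrete-latent idea, using a Gumbel-softmax quantizer over speech frames; this illustrates learning discrete representations but is not the paper's autoregressive co-training objective, and all names and shapes are assumed.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelQuantizer(nn.Module):
    """Frame features -> logits over a code inventory -> differentiable
    one-hot sample via Gumbel-softmax."""

    def __init__(self, feat_dim=80, n_codes=256, tau=1.0):
        super().__init__()
        self.to_logits = nn.Linear(feat_dim, n_codes)
        self.tau = tau

    def forward(self, frames):           # frames: (batch, time, feat_dim)
        logits = self.to_logits(frames)  # (batch, time, n_codes)
        one_hot = F.gumbel_softmax(logits, tau=self.tau, hard=True)
        return one_hot.argmax(dim=-1), one_hot  # code indices, relaxed sample
```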
- Resolving label uncertainty with implicit posterior models [71.62113762278963]
We propose a method for jointly inferring labels across a collection of data samples.
By implicitly assuming the existence of a generative model for which a differentiable predictor is the posterior, we derive a training objective that allows learning under weak beliefs.
arXiv Detail & Related papers (2022-02-28T18:09:44Z)
- Fair Interpretable Representation Learning with Correction Vectors [60.0806628713968]
We propose a new framework for fair representation learning that is centered around the learning of "correction vectors".
We show experimentally that several fair representation learning models constrained in such a way do not exhibit losses in ranking or classification performance.
arXiv Detail & Related papers (2022-02-07T11:19:23Z)
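A hedged sketch of the correction-vector idea above: the fair representation is the original one plus a learned, inspectable correction, z_fair = z + c(z). The linear form of c and the module name are our assumptions.
```python
import torch
import torch.nn as nn

class CorrectionVectorModel(nn.Module):
    """Fairness is imposed by an additive, inspectable correction vector,
    so the change made for fairness stays interpretable."""

    def __init__(self, dim: int):
        super().__init__()
        self.correction = nn.Linear(dim, dim)  # c(z); trained with a fairness loss

    def forward(self, z: torch.Tensor):
        c = self.correction(z)
        return z + c, c  # corrected representation and the correction itself
```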
- Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition [48.06319154279427]
We present a method of instance-based learning that learns similarities between spans.
Our method enables building models that are highly interpretable without sacrificing performance.
arXiv Detail & Related papers (2020-04-29T23:32:42Z)
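A minimal sketch of instance-based span classification as described above: embed spans, retrieve the nearest labelled training spans by cosine similarity, and vote; the retrieved neighbors double as the interpretable evidence. Function and parameter names are illustrative, not the paper's code.
```python
import numpy as np

def classify_span(span_vec, train_vecs, train_labels, k=5):
    """Label a span by its k nearest training spans (cosine similarity)."""
    a = train_vecs / np.linalg.norm(train_vecs, axis=1, keepdims=True)
    b = span_vec / np.linalg.norm(span_vec)
    top = np.argsort(-(a @ b))[:k]  # indices of the k most similar spans
    labels, votes = np.unique(np.asarray(train_labels)[top], return_counts=True)
    return labels[votes.argmax()], top  # prediction and supporting instances
```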
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.