Self-Interpretable Model with Transformation-Equivariant Interpretation
- URL: http://arxiv.org/abs/2111.04927v1
- Date: Tue, 9 Nov 2021 03:21:25 GMT
- Title: Self-Interpretable Model with Transformation-Equivariant Interpretation
- Authors: Yipei Wang, Xiaoqian Wang
- Abstract summary: We propose a self-interpretable model SITE with transformation-equivariant interpretations.
We focus on the robustness and self-consistency of the interpretations of geometric transformations.
- Score: 11.561544418482585
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a self-interpretable model SITE with
transformation-equivariant interpretations. We focus on the robustness and
self-consistency of the interpretations of geometric transformations. Apart
from the transformation equivariance, as a self-interpretable model, SITE has
expressive power comparable to the benchmark black-box classifiers, while being
able to present faithful and robust interpretations of high quality. It is
worth noting that the bilinear upsampling approximation, although applied in
most CNN visualization methods, is a rough approximation that can only provide
interpretations in the form of heatmaps (rather than pixel-wise attributions).
It remains an open question whether such interpretations can be made directly
in the input space (as shown in the MNIST experiments). In addition, we consider the
translation and rotation transformations in our model. In future work, we will
explore the robust interpretations under more complex transformations such as
scaling and distortion. Moreover, we clarify that SITE is not limited to the
geometric transformations used in the computer vision domain, and we will
explore SITE in other domains in future work.
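The transformation equivariance at the heart of the abstract — the interpretation of a transformed input should coincide with the transformed interpretation of the original input — can be checked numerically. The sketch below is a minimal illustration, not the SITE architecture; the circular-difference `interpret` function is a hypothetical stand-in for a model's pixel-wise interpretation:

```python
import numpy as np

def translate(img, dy, dx):
    # Cyclic shift: a simple, exactly invertible geometric transformation.
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def interpret(img):
    # Hypothetical pixel-wise interpreter: circular finite differences stand in
    # for a model's saliency map. Circular differencing commutes with np.roll.
    gy = img - np.roll(img, 1, axis=0)
    gx = img - np.roll(img, 1, axis=1)
    return np.hypot(gy, gx)

def equivariance_gap(img, dy=3, dx=5):
    # Self-consistency check: interpret(T(x)) should equal T(interpret(x)).
    lhs = interpret(translate(img, dy, dx))
    rhs = translate(interpret(img), dy, dx)
    return np.abs(lhs - rhs).max()

rng = np.random.default_rng(0)
x = rng.random((28, 28))
print(equivariance_gap(x))  # 0.0: this interpreter is exactly translation-equivariant
```

A model whose gap stays near zero across the transformations of interest gives self-consistent interpretations; a learned interpreter would typically show a small but nonzero gap.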
Related papers
- Enforcing Interpretability in Time Series Transformers: A Concept Bottleneck Framework [2.8470354623829577]
We develop a framework based on Concept Bottleneck Models to enforce interpretability of time series Transformers.
We modify the training objective to encourage a model to develop representations similar to predefined interpretable concepts.
We find that model performance remains largely unaffected, while interpretability improves substantially.
arXiv Detail & Related papers (2024-10-08T14:22:40Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively with transformers on small to medium-sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Latent Space Translation via Semantic Alignment [29.2401314068038]
We show how representations learned from different neural modules can be translated between different pre-trained networks.
Our method directly estimates a transformation between two given latent spaces, thereby enabling effective stitching of encoders and decoders without additional training.
Notably, we show how it is possible to zero-shot stitch text encoders and vision decoders, or vice-versa, yielding surprisingly good classification performance in this multimodal setting.
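A common closed-form way to estimate a transformation between two latent spaces — one plausible instantiation of the estimation step described above, though the paper's exact estimator may differ — is orthogonal Procrustes alignment on a set of shared anchor embeddings:

```python
import numpy as np

def fit_orthogonal_map(A, B):
    # Orthogonal Procrustes: the R minimizing ||A @ R - B||_F over orthogonal R
    # is U @ Vt, where U, Vt come from the SVD of the cross-covariance A.T @ B.
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

# Synthetic check: the target space is a rotated copy of the source space.
rng = np.random.default_rng(0)
R_true, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # ground-truth rotation
Z_src = rng.normal(size=(100, 16))                   # anchor embeddings, space 1
Z_tgt = Z_src @ R_true                               # same anchors, space 2

R_est = fit_orthogonal_map(Z_src, Z_tgt)
err = np.abs(Z_src @ R_est - Z_tgt).max()            # ~0: rotation recovered
```

Once such a map is estimated, an encoder from one network can be stitched to a decoder from another by passing embeddings through it, with no further training.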
arXiv Detail & Related papers (2023-11-01T17:12:00Z)
- Flow Factorized Representation Learning [109.51947536586677]
We introduce a generative model which specifies a distinct set of latent probability paths that define different input transformations.
We show that our model achieves higher likelihoods on standard representation learning benchmarks while simultaneously being closer to approximately equivariant models.
arXiv Detail & Related papers (2023-09-22T20:15:37Z)
- B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers [97.75725574963197]
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training.
We show that a sequence of such transformations induces a single linear transformation that faithfully summarises the full model computations.
We show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
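The weight-input alignment idea can be illustrated with a single B-cos unit, which down-weights the linear response when input and weight point in different directions. This is a one-unit sketch of the standard B-cos formulation; the paper's layer and network details go beyond it:

```python
import numpy as np

def bcos_unit(x, w, B=2):
    # B-cos unit: scale the linear response w_hat . x by |cos(angle(x, w))|^(B-1).
    # For B > 1, misaligned inputs are suppressed; B = 1 recovers a linear unit.
    w_hat = w / np.linalg.norm(w)
    cos = (x @ w_hat) / np.linalg.norm(x)
    return np.abs(cos) ** (B - 1) * (x @ w_hat)

w = np.array([1.0, 0.0])
aligned = bcos_unit(np.array([3.0, 0.0]), w)     # 3.0: full response when aligned
orthogonal = bcos_unit(np.array([0.0, 3.0]), w)  # 0.0: orthogonal input suppressed
```

Because each unit computes an input-dependent scaling of a linear map, a stack of such units collapses, for any fixed input, into a single linear transformation that can be read out as an explanation.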
arXiv Detail & Related papers (2023-06-19T12:54:28Z)
- Learning Semantic Textual Similarity via Topic-informed Discrete Latent Variables [17.57873577962635]
We develop a topic-informed discrete latent variable model for semantic textual similarity.
Our model learns a shared latent space for sentence-pair representation via vector quantization.
We show that our model is able to surpass several strong neural baselines in semantic textual similarity tasks.
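The vector-quantization step — mapping each continuous representation to its nearest entry in a learned codebook — can be sketched as follows (toy, hand-built codebook; the paper's training procedure is not shown):

```python
import numpy as np

def quantize(z, codebook):
    # Vector quantization: assign each vector to its nearest codebook entry
    # (Euclidean distance), yielding discrete indices and quantized vectors.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    idx = d.argmin(axis=1)
    return idx, codebook[idx]

codebook = np.arange(32.0).reshape(8, 4)  # toy codebook: 8 well-separated codes
z = codebook[[2, 5]] + 0.01               # slightly perturbed copies of codes 2, 5
idx, z_q = quantize(z, codebook)
print(idx.tolist())  # [2, 5]: each vector snaps back to its source code
```

The discrete indices give sentence pairs a shared, interpretable latent vocabulary, while the quantized vectors feed the downstream similarity model.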
arXiv Detail & Related papers (2022-11-07T15:09:58Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
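A crude numerical probe in the spirit of these measurements — not the paper's Lie-derivative estimator — compares a layer applied to a shifted input against the shifted layer output. For integer shifts a pointwise non-linearity commutes exactly (the aliasing the paper identifies arises under continuous, sub-pixel transformations), while strided subsampling breaks even single-step shifts:

```python
import numpy as np

def shift(x, s=1):
    return np.roll(x, s)

def equiv_error(layer, x, s=1):
    # 0 means the layer commutes with translation at shift s; larger values
    # mean the layer breaks equivariance (e.g. via aliasing).
    a = layer(shift(x, s))
    b = shift(layer(x), s)
    n = min(len(a), len(b))
    return np.abs(a[:n] - b[:n]).max()

x = np.sin(np.linspace(0, 4 * np.pi, 64))
relu = lambda v: np.maximum(v, 0.0)  # pointwise non-linearity
stride2 = lambda v: v[::2]           # stride-2 subsampling

e_relu = equiv_error(relu, x)       # 0.0: pointwise ops commute with integer shifts
e_stride = equiv_error(stride2, x)  # > 0: subsampling aliases single-step shifts
```

Sweeping such an error over many layers and shift sizes gives a layer-wise picture of where a trained network loses equivariance.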
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Quantised Transforming Auto-Encoders: Achieving Equivariance to Arbitrary Transformations in Deep Networks [23.673155102696338]
Convolutional Neural Networks (CNNs) are equivariant to image translation.
We propose an auto-encoder architecture whose embedding obeys an arbitrary set of equivariance relations simultaneously.
We demonstrate results of successful re-rendering of transformed versions of input images on several datasets.
arXiv Detail & Related papers (2021-11-25T02:26:38Z)
- Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.