ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model
- URL: http://arxiv.org/abs/2210.08151v1
- Date: Sat, 15 Oct 2022 00:42:13 GMT
- Title: ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model
- Authors: Srishti Gautam, Ahcene Boubekki, Stine Hansen, Suaiba Amina Salahuddin, Robert Jenssen, Marina MC Höhne, Michael Kampffmeyer
- Abstract summary: ProtoVAE is a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner.
It enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint.
- Score: 18.537838366377915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The need for interpretable models has fostered the development of
self-explainable classifiers. Prior approaches are either based on multi-stage
optimization schemes, impacting the predictive performance of the model, or
produce explanations that are not transparent or trustworthy, or that fail to
capture the diversity of the data. To address these shortcomings, we propose
ProtoVAE,
a variational autoencoder-based framework that learns class-specific prototypes
in an end-to-end manner and enforces trustworthiness and diversity by
regularizing the representation space and introducing an orthonormality
constraint. Finally, the model is designed to be transparent by directly
incorporating the prototypes into the decision process. Extensive comparisons
with previous self-explainable approaches demonstrate the superiority of
ProtoVAE, highlighting its ability to generate trustworthy and diverse
explanations, while not degrading predictive performance.
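To make the decision process concrete, here is a minimal PyTorch sketch of a prototype-based classification head with a per-class orthonormality regularizer, in the spirit of the abstract; the shapes, the RBF-style similarity, and the mean-pooling over prototypes are illustrative assumptions, not the authors' implementation.

```python
import torch

class PrototypeHead(torch.nn.Module):
    """Illustrative prototype classifier: K classes, M prototypes per class,
    D-dimensional latents from a VAE encoder (not the authors' code)."""
    def __init__(self, num_classes=10, protos_per_class=5, dim=64):
        super().__init__()
        self.prototypes = torch.nn.Parameter(
            torch.randn(num_classes, protos_per_class, dim))

    def forward(self, z):
        # Similarity of each latent to every prototype (RBF of squared distance).
        sim = torch.exp(-torch.cdist(z, self.prototypes.flatten(0, 1)) ** 2)
        K, M = self.prototypes.shape[:2]
        # Class score = mean similarity to that class's prototypes, so the
        # prototypes enter the decision directly.
        return sim.view(-1, K, M).mean(dim=2)

    def orthonormality_loss(self):
        # Push each class's centered prototypes toward an orthonormal set,
        # encouraging intra-class diversity.
        loss = 0.0
        for P in self.prototypes:                        # P: (M, D)
            P = P - P.mean(dim=0, keepdim=True)
            G = P @ P.T                                  # (M, M) Gram matrix
            loss = loss + ((G - torch.eye(P.shape[0])) ** 2).sum()
        return loss / self.prototypes.shape[0]

head = PrototypeHead()
z = torch.randn(8, 64)              # stand-in for VAE-encoded inputs
logits = head(z)                    # (8, 10) class scores
reg = head.orthonormality_loss()    # added to the training objective
```

Decoding the prototypes through the VAE decoder is what would turn them into human-inspectable explanations.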
Related papers
- Interpret the Internal States of Recommendation Model with Sparse Autoencoder [26.021277330699963]
RecSAE is an automatic, generalizable probing method for interpreting the internal states of Recommendation models.
We train an autoencoder with sparsity constraints to reconstruct internal activations of recommendation models.
We automate the construction of concept dictionaries based on the relationship between latent activations and input item sequences.
arXiv Detail & Related papers (2024-11-09T08:22:31Z)
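A generic sketch of the sparsity-constrained autoencoder described above; `d_model`, `d_latent`, and the stand-in activations are placeholders, not RecSAE's actual dimensions or API.

```python
import torch

class SparseAutoencoder(torch.nn.Module):
    """Generic sparse autoencoder probe: reconstruct a model's internal
    activations through an overcomplete latent with an L1 sparsity penalty."""
    def __init__(self, d_model=256, d_latent=2048):
        super().__init__()
        self.enc = torch.nn.Linear(d_model, d_latent)
        self.dec = torch.nn.Linear(d_latent, d_model)

    def forward(self, acts):
        latent = torch.relu(self.enc(acts))   # nonnegative, mostly-zero codes
        recon = self.dec(latent)
        return recon, latent

sae = SparseAutoencoder()
acts = torch.randn(32, 256)          # stand-in for recommender internal states
recon, latent = sae(acts)
loss = ((recon - acts) ** 2).mean() + 1e-3 * latent.abs().mean()
# Concepts are then read off by inspecting which input item sequences
# most strongly activate each latent unit (the dictionary step above).
```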
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- Uncertainty-Aware Explanations Through Probabilistic Self-Explainable Neural Networks [17.238290206236027]
Prototype-Based Self-Explainable Neural Networks (PSENNs) offer deep yet transparent-by-design architectures.
We introduce a probabilistic reformulation of PSENNs, called Prob-PSENN, which replaces point estimates for the prototypes with probability distributions over their values.
Our experiments demonstrate that Prob-PSENNs provide more meaningful and robust explanations than their non-probabilistic counterparts.
arXiv Detail & Related papers (2024-03-20T16:47:28Z)
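The core move of Prob-PSENN, prototypes as distributions rather than point estimates, can be sketched as follows; the diagonal-Gaussian parameterization and Monte Carlo scoring are plausible assumptions, not the paper's exact formulation.

```python
import torch

class ProbPrototypes(torch.nn.Module):
    """Sketch: prototypes as diagonal Gaussians instead of point estimates,
    so sampled explanations carry uncertainty."""
    def __init__(self, num_protos=10, dim=64):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.randn(num_protos, dim))
        self.log_var = torch.nn.Parameter(torch.zeros(num_protos, dim))

    def sample(self):
        # Reparameterization trick: p = mu + sigma * eps.
        eps = torch.randn_like(self.mu)
        return self.mu + torch.exp(0.5 * self.log_var) * eps

protos = ProbPrototypes()
z = torch.randn(8, 64)
# Monte Carlo over prototype draws -> a distribution over class scores,
# whose spread can flag unreliable predictions and explanations.
scores = torch.stack([-torch.cdist(z, protos.sample()) for _ in range(16)])
mean_score, score_std = scores.mean(0), scores.std(0)
```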
- Bridging Generative and Discriminative Models for Unified Visual Perception with Diffusion Priors [56.82596340418697]
We propose a simple yet effective framework comprising a pre-trained Stable Diffusion (SD) model containing rich generative priors, a unified head (U-head) capable of integrating hierarchical representations, and an adapted expert providing discriminative priors.
Comprehensive investigations unveil characteristics of Vermouth, such as the varying granularity of perception concealed in latent variables at distinct time steps and various U-Net stages.
The promising results demonstrate the potential of diffusion models as formidable learners, establishing their significance in furnishing informative and robust visual representations.
arXiv Detail & Related papers (2024-01-29T10:36:57Z)
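As a rough illustration of a "unified head" fusing hierarchical representations, the sketch below projects and upsamples multi-scale feature maps (such as activations from several U-Net stages) to a common resolution; the channel counts and the additive fusion are placeholders, not the paper's architecture.

```python
import torch

class UnifiedHead(torch.nn.Module):
    """Generic fusion head over hierarchical feature maps, e.g. activations
    from several stages of a frozen diffusion U-Net (placeholder shapes)."""
    def __init__(self, channels=(320, 640, 1280), d_out=256):
        super().__init__()
        self.proj = torch.nn.ModuleList(
            [torch.nn.Conv2d(c, d_out, kernel_size=1) for c in channels])

    def forward(self, feats):
        target = feats[0].shape[-2:]          # upsample all to the finest scale
        fused = sum(
            torch.nn.functional.interpolate(
                p(f), size=target, mode="bilinear", align_corners=False)
            for p, f in zip(self.proj, feats))
        return fused                          # (B, d_out, H, W)

feats = [torch.randn(2, 320, 64, 64), torch.randn(2, 640, 32, 32),
         torch.randn(2, 1280, 16, 16)]
out = UnifiedHead()(feats)                    # (2, 256, 64, 64)
```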
- Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation [79.22678026708134]
In this paper, we propose an inherently interpretable method, named Transferable Conceptual Prototype Learning (TCPL).
To achieve this goal, we design a hierarchically prototypical module that transfers categorical basic concepts from the source domain to the target domain and learns domain-shared prototypes for explaining the underlying reasoning process.
Comprehensive experiments show that the proposed method can not only provide effective and intuitive explanations but also outperform previous state-of-the-art methods.
arXiv Detail & Related papers (2023-10-12T06:36:41Z)
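A generic sketch of learning domain-shared prototypes: features from both domains are pulled toward the same class prototypes, with target labels replaced by nearest-prototype pseudo-labels. This is an assumption-laden simplification, not TCPL's actual objective.

```python
import torch
import torch.nn.functional as F

def shared_prototype_loss(prototypes, feats, labels):
    """Generic cross-domain prototype objective: features from either domain
    are pulled toward the shared prototype of their (pseudo-)label.
    prototypes: (K, D); feats: (N, D); labels: (N,)."""
    logits = -torch.cdist(feats, prototypes)  # nearer prototype = higher score
    return F.cross_entropy(logits, labels)

K, D = 10, 64
prototypes = torch.nn.Parameter(torch.randn(K, D))
src_feats, src_y = torch.randn(16, D), torch.randint(0, K, (16,))
tgt_feats = torch.randn(16, D)
tgt_pseudo = (-torch.cdist(tgt_feats, prototypes)).argmax(1)  # pseudo-labels
loss = shared_prototype_loss(prototypes, src_feats, src_y) \
     + shared_prototype_loss(prototypes, tgt_feats, tgt_pseudo)
```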
- Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval [139.21955930418815]
Cross-modal Retrieval methods build similarity relations between vision and language modalities by jointly learning a common representation space.
However, the predictions are often unreliable due to aleatoric uncertainty, which is induced by low-quality data, e.g., corrupt images, fast-paced videos, and non-detailed texts.
We propose a novel Prototype-based Aleatoric Uncertainty Quantification (PAU) framework to provide trustworthy predictions by quantifying the uncertainty arising from the inherent data ambiguity.
arXiv Detail & Related papers (2023-09-29T09:41:19Z)
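One plausible reading of prototype-based uncertainty in retrieval, sketched below: standard embedding similarity provides the scores, while the entropy of an embedding's soft assignment over learned prototypes serves as a proxy for data-induced ambiguity. The assignment-entropy proxy is an illustrative assumption, not PAU's exact formulation.

```python
import torch
import torch.nn.functional as F

def retrieval_with_uncertainty(img_emb, txt_emb, prototypes):
    """Sketch: score cross-modal pairs, and flag ambiguous inputs by how
    uniformly an embedding spreads its similarity over a prototype set."""
    sim = img_emb @ txt_emb.T                    # (N_img, N_txt) retrieval scores
    assign = torch.softmax(img_emb @ prototypes.T, dim=1)
    entropy = -(assign * assign.clamp_min(1e-8).log()).sum(1)  # high = ambiguous
    return sim, entropy

img = F.normalize(torch.randn(8, 64), dim=1)
txt = F.normalize(torch.randn(8, 64), dim=1)
protos = F.normalize(torch.randn(16, 64), dim=1)
scores, uncertainty = retrieval_with_uncertainty(img, txt, protos)
```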
- Explaining Language Models' Predictions with High-Impact Concepts [11.47612457613113]
We propose a complete framework for extending concept-based interpretability methods to NLP.
We optimize for features whose existence causes the output predictions to change substantially.
Our method achieves superior results on predictive impact, usability, and faithfulness compared to the baselines.
arXiv Detail & Related papers (2023-05-03T14:48:27Z)
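The idea of optimizing for features whose presence substantially changes predictions can be illustrated with a simple ablation test: project a concept direction out of the representation and measure the prediction shift. All names below are hypothetical stand-ins, not the paper's method.

```python
import torch

def concept_impact(model, reps, concept):
    """Sketch: impact of a concept = how much predictions move when its
    direction is projected out of the hidden representation."""
    concept = concept / concept.norm()
    ablated = reps - (reps @ concept).unsqueeze(1) * concept  # remove concept
    base = torch.softmax(model(reps), dim=1)
    after = torch.softmax(model(ablated), dim=1)
    return (base - after).abs().sum(1).mean()     # mean prediction shift

model = torch.nn.Linear(64, 2)                    # stand-in classifier head
reps = torch.randn(32, 64)
concept = torch.randn(64)
impact = concept_impact(model, reps, concept)     # higher = higher impact
```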
- Learning to Select Prototypical Parts for Interpretable Sequential Data Modeling [7.376829794171344]
We propose a Self-Explaining Selective Model (SESM) that uses a linear combination of prototypical concepts to explain its own predictions.
For better interpretability, we design multiple constraints including diversity, stability, and locality as training objectives.
arXiv Detail & Related papers (2022-12-07T01:42:47Z)
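A minimal sketch of a model that explains itself through a linear combination of prototypical-concept activations, with a simple pairwise-similarity penalty standing in for the diversity constraint; the stability and locality objectives are omitted.

```python
import torch
import torch.nn.functional as F

class SESMLikeModel(torch.nn.Module):
    """Sketch: predictions as a linear combination of prototypical-concept
    activations, so each weight * activation term is directly readable."""
    def __init__(self, dim=64, num_concepts=8, num_classes=2):
        super().__init__()
        self.concepts = torch.nn.Parameter(torch.randn(num_concepts, dim))
        self.linear = torch.nn.Linear(num_concepts, num_classes)

    def forward(self, z):
        acts = torch.exp(-torch.cdist(z, self.concepts))  # concept activations
        return self.linear(acts), acts

    def diversity_loss(self):
        # Penalize similar concepts (off-diagonal cosine similarity),
        # a simple stand-in for the diversity objective.
        C = F.normalize(self.concepts, dim=1)
        sim = C @ C.T
        return (sim - torch.eye(sim.shape[0])).abs().mean()

model = SESMLikeModel()
logits, acts = model(torch.randn(4, 64))  # acts explain each prediction
reg = model.diversity_loss()
```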
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
The Variational Autoencoder (VAE) approximates the posterior of latent variables via amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
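For context, a compact sketch of the amortized variational inference that DU-VAE builds on: the encoder predicts a per-example Gaussian posterior, regularized toward the prior by a KL term. DU-VAE's specific diversity and uncertainty regularizers are not reproduced here.

```python
import torch

def vae_step(x, encoder, decoder):
    """Amortized inference: the encoder outputs q(z|x) = N(mu, diag(sigma^2)),
    regularized toward the prior N(0, I) by the KL term."""
    mu, log_var = encoder(x).chunk(2, dim=1)
    z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterize
    recon = decoder(z)
    recon_loss = ((recon - x) ** 2).sum(1).mean()
    kl = 0.5 * (mu ** 2 + log_var.exp() - 1 - log_var).sum(1).mean()
    return recon_loss + kl

encoder = torch.nn.Linear(32, 2 * 8)     # stand-in: 8-dim latent space
decoder = torch.nn.Linear(8, 32)
loss = vae_step(torch.randn(16, 32), encoder, decoder)
```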
- Attentional Prototype Inference for Few-Shot Segmentation [128.45753577331422]
We propose attentional prototype inference (API), a probabilistic latent variable framework for few-shot segmentation.
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
We conduct extensive experiments on four benchmarks, where our proposal obtains at least competitive and often better performance than state-of-the-art prototype-based methods.
arXiv Detail & Related papers (2021-05-14T06:58:44Z)
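A rough sketch of the probabilistic-prototype idea for segmentation: infer a Gaussian over the foreground prototype from masked support features, then score query pixels by similarity to prototype samples. The moment-matching inference and cosine scoring are illustrative assumptions, not the paper's attention-based mechanism.

```python
import torch
import torch.nn.functional as F

def probabilistic_prototype(support_feats, support_mask, n_samples=8):
    """Sketch: fit a diagonal Gaussian over the class prototype from masked
    support features. support_feats: (C, H, W); support_mask: (H, W)."""
    fg = support_feats.flatten(1)[:, support_mask.flatten().bool()]  # (C, Nfg)
    mu, std = fg.mean(1), fg.std(1) + 1e-6
    return mu + std * torch.randn(n_samples, mu.shape[0])            # (S, C)

support, query = torch.randn(64, 16, 16), torch.randn(64, 16, 16)
mask = (torch.rand(16, 16) > 0.5).float()
protos = probabilistic_prototype(support, mask)
# Expected cosine similarity between each query pixel and prototype samples.
q = F.normalize(query.flatten(1), dim=0)          # (C, HW), unit per pixel
p = F.normalize(protos, dim=1)                    # (S, C), unit per sample
score_map = (p @ q).mean(0).view(16, 16)          # foreground score per pixel
```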
- An Identifiable Double VAE For Disentangled Representations [24.963285614606665]
We propose a novel VAE-based generative model with theoretical guarantees on identifiability.
We obtain our conditional prior over the latents by learning an optimal representation.
Experimental results indicate superior performance with respect to state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-19T09:59:31Z)
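The conditional-prior ingredient can be sketched as a KL term between the amortized posterior and a prior whose parameters are predicted from an auxiliary representation u; the networks and shapes below are hypothetical.

```python
import torch

def kl_to_conditional_prior(mu_q, log_var_q, mu_p, log_var_p):
    """Sketch: KL( q(z|x) || p(z|u) ) between diagonal Gaussians, where the
    prior's parameters are predicted from an auxiliary representation u --
    the conditional-prior ingredient behind iVAE-style identifiability."""
    return 0.5 * (
        log_var_p - log_var_q
        + (log_var_q.exp() + (mu_q - mu_p) ** 2) / log_var_p.exp()
        - 1
    ).sum(1).mean()

prior_net = torch.nn.Linear(10, 2 * 8)   # stand-in: u -> prior parameters
u = torch.randn(16, 10)
mu_p, log_var_p = prior_net(u).chunk(2, dim=1)
mu_q, log_var_q = torch.randn(16, 8), torch.zeros(16, 8)
kl = kl_to_conditional_prior(mu_q, log_var_q, mu_p, log_var_p)
```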
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.