ProtoS-ViT: Visual foundation models for sparse self-explainable classifications
- URL: http://arxiv.org/abs/2406.10025v1
- Date: Fri, 14 Jun 2024 13:36:30 GMT
- Title: ProtoS-ViT: Visual foundation models for sparse self-explainable classifications
- Authors: Hugues Turbé, Mina Bjelogrlic, Gianmarco Mengaldo, Christian Lovis,
- Abstract summary: This work demonstrates how frozen pre-trained ViT backbones can be effectively turned into prototypical models.
ProtoS-ViT surpasses existing prototypical models showing strong performance in terms of accuracy, compactness, and explainability.
- Score: 0.6249768559720122
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Prototypical networks aim to build intrinsically explainable models based on the linear summation of concepts. However, important challenges remain in the transparency, compactness, and meaningfulness of the explanations provided by these models. This work demonstrates how frozen pre-trained ViT backbones can be effectively turned into prototypical models for both general and domain-specific tasks, in our case biomedical image classifiers. By leveraging strong spatial features combined with a novel prototypical head, ProtoS-ViT surpasses existing prototypical models showing strong performance in terms of accuracy, compactness, and explainability. Model explainability is evaluated through an extensive set of quantitative and qualitative metrics which serve as a general benchmark for the development of prototypical models. Code is available at https://github.com/hturbe/protosvit.
Related papers
- Few-Shot Medical Image Segmentation with High-Fidelity Prototypes [38.073371773707514]
We propose a novel Detail Self-refined Prototype Network (DSPNet) to construct high-fidelity prototypes representing the object foreground and the background more comprehensively.
To construct global semantics while maintaining the captured detail semantics, we learn the foreground prototypes by modelling the multi-modal structures with clustering and then fusing each in a channel-wise manner.
arXiv Detail & Related papers (2024-06-26T05:06:14Z) - This Looks Better than That: Better Interpretable Models with ProtoPNeXt [14.28283868577614]
Prototypical-part models are a popular interpretable alternative to black-box deep learning models for computer vision.
We create a new framework for integrating components of prototypical-part models -- ProtoPNeXt.
arXiv Detail & Related papers (2024-06-20T18:54:27Z) - DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple and effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z) - Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - Prototype Learning for Explainable Brain Age Prediction [1.104960878651584]
We present ExPeRT, an explainable prototype-based model specifically designed for regression tasks.
Our proposed model makes a sample prediction from the distances to a set of learned prototypes in latent space, using a weighted mean of prototype labels.
Our approach achieved state-of-the-art prediction performance while providing insight into the model's reasoning process.
arXiv Detail & Related papers (2023-06-16T14:13:21Z) - Representer Point Selection for Explaining Regularized High-dimensional
Models [105.75758452952357]
We introduce a class of sample-based explanations we term high-dimensional representers.
Our workhorse is a novel representer theorem for general regularized high-dimensional models.
We study the empirical performance of our proposed methods on three real-world binary classification datasets and two recommender system datasets.
arXiv Detail & Related papers (2023-05-31T16:23:58Z) - Investigating Ensemble Methods for Model Robustness Improvement of Text
Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z) - ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model [18.537838366377915]
ProtoVAE is a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner.
It enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint.
arXiv Detail & Related papers (2022-10-15T00:42:13Z) - IterMiUnet: A lightweight architecture for automatic blood vessel
segmentation [10.538564380139483]
This paper proposes IterMiUnet, a new lightweight convolution-based segmentation model.
It overcomes its heavily parametrized nature by incorporating the encoder-decoder structure of MiUnet model within it.
The proposed model has a lot of potential to be utilized as a tool for the early diagnosis of many diseases.
arXiv Detail & Related papers (2022-08-02T14:33:14Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP)
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [55.28436972267793]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.