ProtoS-ViT: Visual foundation models for sparse self-explainable classifications
- URL: http://arxiv.org/abs/2406.10025v1
- Date: Fri, 14 Jun 2024 13:36:30 GMT
- Title: ProtoS-ViT: Visual foundation models for sparse self-explainable classifications
- Authors: Hugues Turbé, Mina Bjelogrlic, Gianmarco Mengaldo, Christian Lovis,
- Abstract summary: This work demonstrates how frozen pre-trained ViT backbones can be effectively turned into prototypical models.
ProtoS-ViT surpasses existing prototypical models showing strong performance in terms of accuracy, compactness, and explainability.
- Score: 0.6249768559720122
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Prototypical networks aim to build intrinsically explainable models based on the linear summation of concepts. However, important challenges remain in the transparency, compactness, and meaningfulness of the explanations provided by these models. This work demonstrates how frozen pre-trained ViT backbones can be effectively turned into prototypical models for both general and domain-specific tasks, in our case biomedical image classifiers. By leveraging strong spatial features combined with a novel prototypical head, ProtoS-ViT surpasses existing prototypical models showing strong performance in terms of accuracy, compactness, and explainability. Model explainability is evaluated through an extensive set of quantitative and qualitative metrics which serve as a general benchmark for the development of prototypical models. Code is available at https://github.com/hturbe/protosvit.
Related papers
- Free Lunch in Pathology Foundation Model: Task-specific Model Adaptation with Concept-Guided Feature Enhancement [18.839406725114042]
We present Concept Anchor-guided Task-specific Feature Enhancement (CATE)
CATE can boost the expressivity and discriminativeness of pathology foundation models for specific downstream tasks.
Experiments on public WSI datasets demonstrate that CATE significantly enhances the performance and generalizability of MIL models.
arXiv Detail & Related papers (2024-11-15T02:38:00Z) - Interpretable Image Classification with Adaptive Prototype-based Vision Transformers [37.62530032165594]
We present ProtoViT, a method for interpretable image classification combining deep learning and case-based reasoning.
Our model integrates Vision Transformer (ViT) backbones into prototype based models, while offering spatially deformed prototypes.
Our experiments show that our model can generally achieve higher performance than the existing prototype based models.
arXiv Detail & Related papers (2024-10-28T04:33:28Z) - Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - Prototype Learning for Explainable Brain Age Prediction [1.104960878651584]
We present ExPeRT, an explainable prototype-based model specifically designed for regression tasks.
Our proposed model makes a sample prediction from the distances to a set of learned prototypes in latent space, using a weighted mean of prototype labels.
Our approach achieved state-of-the-art prediction performance while providing insight into the model's reasoning process.
arXiv Detail & Related papers (2023-06-16T14:13:21Z) - Representer Point Selection for Explaining Regularized High-dimensional
Models [105.75758452952357]
We introduce a class of sample-based explanations we term high-dimensional representers.
Our workhorse is a novel representer theorem for general regularized high-dimensional models.
We study the empirical performance of our proposed methods on three real-world binary classification datasets and two recommender system datasets.
arXiv Detail & Related papers (2023-05-31T16:23:58Z) - Investigating Ensemble Methods for Model Robustness Improvement of Text
Classifiers [66.36045164286854]
We analyze a set of existing bias features and demonstrate there is no single model that works best for all the cases.
By choosing an appropriate bias model, we can obtain a better robustness result than baselines with a more sophisticated model design.
arXiv Detail & Related papers (2022-10-28T17:52:10Z) - ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model [18.537838366377915]
ProtoVAE is a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner.
It enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint.
arXiv Detail & Related papers (2022-10-15T00:42:13Z) - IterMiUnet: A lightweight architecture for automatic blood vessel
segmentation [10.538564380139483]
This paper proposes IterMiUnet, a new lightweight convolution-based segmentation model.
It overcomes its heavily parametrized nature by incorporating the encoder-decoder structure of MiUnet model within it.
The proposed model has a lot of potential to be utilized as a tool for the early diagnosis of many diseases.
arXiv Detail & Related papers (2022-08-02T14:33:14Z) - Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP)
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.