Pixel-Grounded Prototypical Part Networks
- URL: http://arxiv.org/abs/2309.14531v1
- Date: Mon, 25 Sep 2023 21:09:49 GMT
- Title: Pixel-Grounded Prototypical Part Networks
- Authors: Zachariah Carmichael, Suhas Lohit, Anoop Cherian, Michael Jones,
Walter Scheirer
- Abstract summary: Prototypical part neural networks (ProtoPartNNs) are an intrinsically interpretable approach to machine learning.
We argue that detraction from these underlying issues is due to the alluring nature of visualizations and an over-reliance on intuition.
We propose new receptive field-based architectural constraints for meaningful localization and a principled pixel space mapping for ProtoPartNNs.
- Score: 33.408034817820834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prototypical part neural networks (ProtoPartNNs), namely PROTOPNET and its
derivatives, are an intrinsically interpretable approach to machine learning.
Their prototype learning scheme enables intuitive explanations of the form,
this (prototype) looks like that (testing image patch). But, does this actually
look like that? In this work, we delve into why object part localization and
associated heat maps in past work are misleading. Rather than localizing to
object parts, existing ProtoPartNNs localize to the entire image, contrary to
generated explanatory visualizations. We argue that detraction from these
underlying issues is due to the alluring nature of visualizations and an
over-reliance on intuition. To alleviate these issues, we devise new receptive
field-based architectural constraints for meaningful localization and a
principled pixel space mapping for ProtoPartNNs. To improve interpretability,
we propose additional architectural improvements, including a simplified
classification head. We also make additional corrections to PROTOPNET and its
derivatives, such as the use of a validation set, rather than a test set, to
evaluate generalization during training. Our approach, PIXPNET (Pixel-grounded
Prototypical part Network), is the only ProtoPartNN that truly learns and
localizes to prototypical object parts. We demonstrate that PIXPNET achieves
quantifiably improved interpretability without sacrificing accuracy.
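As a rough illustration of the "this (prototype) looks like that (image patch)" comparison that ProtoPartNNs perform, the sketch below compares learned prototype vectors against every spatial cell of a backbone feature map and keeps the best match per prototype; the class name, the log-similarity transform, and the simplified linear head are illustrative assumptions in the spirit of PROTOPNET-style models, not the PIXPNET implementation.

```python
import torch
import torch.nn as nn


class PrototypeLayerSketch(nn.Module):
    """Toy ProtoPartNN-style prototype layer (illustrative only)."""

    def __init__(self, num_prototypes: int, feat_dim: int, num_classes: int):
        super().__init__()
        # Each prototype is a point in the backbone's feature space (1x1 spatial extent).
        self.prototypes = nn.Parameter(torch.rand(num_prototypes, feat_dim))
        # Simplified linear head from prototype similarities to class logits.
        self.head = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, feats: torch.Tensor):
        # feats: (B, D, H, W) convolutional feature map from the backbone.
        B, D, H, W = feats.shape
        patches = feats.flatten(2).transpose(1, 2)                # (B, H*W, D)
        protos = self.prototypes.unsqueeze(0).expand(B, -1, -1)   # (B, P, D)
        dists = torch.cdist(patches, protos) ** 2                 # (B, H*W, P)
        # Distance-to-similarity transform commonly used in ProtoPartNNs.
        sims = torch.log((dists + 1.0) / (dists + 1e-4))
        # Keep the best-matching patch per prototype; its grid location is what
        # a principled pixel-space mapping must ground back to input pixels,
        # e.g. via the receptive field of that feature-map cell.
        best_sim, best_loc = sims.max(dim=1)                      # (B, P)
        return self.head(best_sim), best_loc
```

The key point the paper makes is that the explanation is only faithful if the winning grid cell in `best_loc` is mapped back to the pixels that actually influence it, which is why receptive-field-based constraints matter.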
Related papers
- Enhanced Prototypical Part Network (EPPNet) For Explainable Image Classification Via Prototypes [16.528373143163275]
We introduce the Enhanced Prototypical Part Network (EPPNet) for image classification.
EPPNet achieves strong performance while discovering relevant prototypes that can be used to explain the classification results.
Our evaluations on the CUB-200-2011 dataset show that the EPPNet outperforms state-of-the-art xAI-based methods.
arXiv Detail & Related papers (2024-08-08T17:26:56Z)
- This actually looks like that: Proto-BagNets for local and global interpretability-by-design [5.037593461859481]
Interpretability is a key requirement for the use of machine learning models in high-stakes applications.
We introduce Proto-BagNets, an interpretable-by-design prototype-based model.
Proto-BagNet provides faithful, accurate, and clinically meaningful local and global explanations.
arXiv Detail & Related papers (2024-06-21T14:12:15Z)
- PDiscoNet: Semantically consistent part discovery for fine-grained recognition [62.12602920807109]
We propose PDiscoNet to discover object parts using only image-level class labels, together with priors that encourage desirable properties in the discovered parts.
Our results on CUB, CelebA, and PartImageNet show that the proposed method provides substantially better part discovery performance than previous methods.
arXiv Detail & Related papers (2023-09-06T17:19:29Z)
- Improving Prototypical Visual Explanations with Reward Reweighing, Reselection, and Retraining [4.9572582709699144]
ProtoPNet attempts to classify images based on meaningful parts of the input.
While this architecture is able to produce visually interpretable classifications, it often learns to classify based on parts of the image that are not semantically meaningful.
We propose the Reward Reweighing, Reselecting, and Retraining (R3) post-processing framework, which performs three additional corrective updates to a pretrained ProtoPNet.
arXiv Detail & Related papers (2023-07-08T03:42:54Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
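The surface-reconstruction pretext task described above can be caricatured as an occupancy-prediction loss; the sketch below assumes a point-cloud encoder that produces a global scene code and a small head that scores 3-D query locations. These names and shapes are assumptions for illustration, not the ALSO architecture.

```python
import torch
import torch.nn as nn


def occupancy_pretext_loss(encoder: nn.Module,
                           occupancy_head: nn.Module,
                           points: torch.Tensor,       # (B, N, 3) sparse lidar points
                           queries: torch.Tensor,      # (B, Q, 3) query locations
                           occ_labels: torch.Tensor):  # (B, Q) 1 = occupied, 0 = free
    """Self-supervised loss: reconstruct the surface the points were sampled from."""
    scene_feat = encoder(points)                        # (B, F) global scene code (assumed)
    # Condition each query on the scene code and predict an occupancy logit.
    q_in = torch.cat(
        [queries, scene_feat.unsqueeze(1).expand(-1, queries.size(1), -1)], dim=-1
    )                                                   # (B, Q, 3 + F)
    logits = occupancy_head(q_in).squeeze(-1)           # (B, Q)
    return nn.functional.binary_cross_entropy_with_logits(logits, occ_labels.float())
```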
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image Recognition [32.34322644235324]
Prototypical part network (ProtoPNet) has drawn wide attention and boosted many follow-up studies due to its self-explanatory property for explainable artificial intelligence (XAI).
When directly applying ProtoPNet on vision transformer (ViT) backbones, learned prototypes have a relatively high probability of being activated by the background and pay less attention to the foreground.
This paper proposes prototypical part transformer (ProtoPFormer) for appropriately and effectively applying the prototype-based method with ViTs for interpretable image recognition.
arXiv Detail & Related papers (2022-08-22T16:36:32Z)
- Fair Interpretable Learning via Correction Vectors [68.29997072804537]
We propose a new framework for fair representation learning centered around the learning of "correction vectors".
The corrections are simply added to the original features, and can therefore be analyzed as an explicit penalty or bonus for each feature.
We show experimentally that a fair representation learning problem constrained in such a way does not impact performance.
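A minimal sketch of the correction-vector idea as described: a learned per-feature correction is added to the original features before classification, so the correction itself can be inspected as an explicit penalty or bonus on each feature. The module sizes and names are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn


class CorrectionVectorModel(nn.Module):
    def __init__(self, num_features: int, num_classes: int, hidden: int = 64):
        super().__init__()
        # Predicts one additive correction per input feature.
        self.corrector = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(),
            nn.Linear(hidden, num_features),
        )
        self.classifier = nn.Linear(num_features, num_classes)

    def forward(self, x: torch.Tensor):
        correction = self.corrector(x)   # interpretable per-feature offset
        x_fair = x + correction          # corrections are simply added to the features
        return self.classifier(x_fair), correction
```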
arXiv Detail & Related papers (2022-01-17T10:59:33Z)
- Unsupervised Part Discovery from Contrastive Reconstruction [90.88501867321573]
The goal of self-supervised visual representation learning is to learn strong, transferable image representations.
We propose an unsupervised approach to object part discovery and segmentation.
Our method yields semantic parts consistent across fine-grained but visually distinct categories.
arXiv Detail & Related papers (2021-11-11T17:59:42Z)
- Towards Efficient Scene Understanding via Squeeze Reasoning [71.1139549949694]
We propose a novel framework called Squeeze Reasoning.
Instead of propagating information on the spatial map, we first learn to squeeze the input feature into a channel-wise global vector.
We show that our approach can be modularized as an end-to-end trained block and can be easily plugged into existing networks.
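A hedged sketch of the squeeze-then-reason pattern described above: the spatial map is squeezed into a channel-wise global vector, a small module reasons on that vector, and the result modulates the original features so the block can be plugged into an existing network. The specific reasoning module here (a squeeze-and-excitation-style MLP) is an assumption, not the paper's exact design.

```python
import torch
import torch.nn as nn


class SqueezeReasoningBlockSketch(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.reason = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor):
        # x: (B, C, H, W). Squeeze: one global value per channel, no spatial propagation.
        v = x.mean(dim=(2, 3))                   # (B, C) channel-wise global vector
        w = self.reason(v)                       # reasoning on the squeezed vector
        # Broadcast the reasoned vector back to modulate the spatial features.
        return x * w.unsqueeze(-1).unsqueeze(-1)
```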
arXiv Detail & Related papers (2020-11-06T12:17:01Z)
- LoCo: Local Contrastive Representation Learning [93.98029899866866]
We show that by overlapping local blocks stacked on top of each other, we effectively increase the decoder depth and allow upper blocks to implicitly send feedback to lower blocks.
This simple design closes the performance gap between local learning and end-to-end contrastive learning algorithms for the first time.
arXiv Detail & Related papers (2020-08-04T05:41:29Z)