HyperPg -- Prototypical Gaussians on the Hypersphere for Interpretable Deep Learning
- URL: http://arxiv.org/abs/2410.08925v1
- Date: Fri, 11 Oct 2024 15:50:31 GMT
- Title: HyperPg -- Prototypical Gaussians on the Hypersphere for Interpretable Deep Learning
- Authors: Maximilian Xiling Li, Korbinian Franz Rudolf, Nils Blank, Rudolf Lioutikov,
- Abstract summary: Approaches such as ProtoPNet learn which parts of a test image "look like" known prototypical parts from training images, combining predictive power with the inherent interpretability of case-based reasoning.
This work introduces HyperPg, a new prototype representation leveraging Gaussian distributions on a hypersphere in latent space.
Experiments on CUB-200-2011 and Stanford Cars datasets demonstrate that HyperPgNet outperforms other prototype learning architectures.
- Score: 2.0599237172837523
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prototype Learning methods provide an interpretable alternative to black-box deep learning models. Approaches such as ProtoPNet learn which parts of a test image "look like" known prototypical parts from training images, combining predictive power with the inherent interpretability of case-based reasoning. However, existing approaches have two main drawbacks: A) They rely solely on deterministic similarity scores without statistical confidence. B) The prototypes are learned in a black-box manner without human input. This work introduces HyperPg, a new prototype representation leveraging Gaussian distributions on a hypersphere in latent space, with learnable mean and variance. HyperPg prototypes adapt to the spread of clusters in the latent space and output likelihood scores. The new architecture, HyperPgNet, leverages HyperPg to learn prototypes aligned with human concepts from pixel-level annotations. Consequently, each prototype represents a specific concept such as color, image texture, or part of the image subject. A concept extraction pipeline built on foundation models provides pixel-level annotations, significantly reducing human labeling effort. Experiments on CUB-200-2011 and Stanford Cars datasets demonstrate that HyperPgNet outperforms other prototype learning architectures while using fewer parameters and training steps. Additionally, the concept-aligned HyperPg prototypes are learned transparently, enhancing model interpretability.
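To make the prototype formulation concrete, here is a minimal sketch of a Gaussian prototype on the hypersphere, assuming the likelihood is a Gaussian over the cosine similarity between an L2-normalized feature and a learnable prototype mean; the class and parameter names are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperPgPrototype(nn.Module):
    """Illustrative Gaussian prototype on the unit hypersphere.

    Scores a feature by a Gaussian log-likelihood over its cosine
    similarity to a learnable prototype mean (a hedged sketch, not
    the paper's reference implementation).
    """

    def __init__(self, dim: int):
        super().__init__()
        self.mean = nn.Parameter(torch.randn(dim))    # learnable direction
        self.log_var = nn.Parameter(torch.zeros(()))  # learnable spread

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Project both the feature and the prototype mean onto the hypersphere.
        z = F.normalize(z, dim=-1)
        mu = F.normalize(self.mean, dim=-1)
        # Angular agreement in [-1, 1]; 1 means "same direction as the prototype".
        cos = z @ mu
        var = self.log_var.exp()
        # Gaussian log-likelihood of the cosine similarity around its peak (1.0).
        return -0.5 * ((cos - 1.0) ** 2 / var + self.log_var)

# usage: one likelihood score per feature vector
proto = HyperPgPrototype(dim=128)
scores = proto(torch.randn(4, 128))
```

A larger learned variance lets a prototype tolerate a wider cluster in latent space, which is the adaptivity the abstract describes.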
Related papers
- GAProtoNet: A Multi-head Graph Attention-based Prototypical Network for Interpretable Text Classification [1.170190320889319]
We introduce GAProtoNet, a novel white-box Multi-head Graph Attention-based Prototypical Network.
Our approach achieves superior results without sacrificing the accuracy of the original black-box LMs.
Our case study and visualization of prototype clusters also demonstrate the efficiency in explaining the decisions of black-box models built with LMs.
arXiv Detail & Related papers (2024-09-20T08:15:17Z) - LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging [65.72891334156706]
We introduce Label-Combination Prototypical Networks (LC-Protonets) to address the problem of multi-label few-shot classification.
LC-Protonets generate one prototype per label combination, derived from the power set of labels present in the limited training items.
Our method is applied to automatic audio tagging across diverse music datasets, covering various cultures and including both modern and traditional music.
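As a concrete reading of the power-set construction, here is a minimal sketch, assuming a combination's prototype is the mean embedding of every support item whose label set contains that combination; function and variable names are illustrative.

```python
from itertools import combinations

import numpy as np

def label_combinations(labels: frozenset) -> list:
    """All non-empty subsets (the power set) of an item's label set."""
    return [frozenset(c) for r in range(1, len(labels) + 1)
            for c in combinations(sorted(labels), r)]

def build_lc_prototypes(embeddings: np.ndarray, label_sets: list) -> dict:
    """One prototype per label combination observed in the support items.

    A combination's prototype is the mean embedding of every support item
    whose label set contains it (an illustrative reading, hedged).
    """
    buckets: dict = {}
    for emb, labels in zip(embeddings, label_sets):
        for combo in label_combinations(labels):
            buckets.setdefault(combo, []).append(emb)
    return {combo: np.mean(vs, axis=0) for combo, vs in buckets.items()}

# usage: two support items tagged with (possibly overlapping) label sets
embs = np.random.randn(2, 16)
protos = build_lc_prototypes(embs, [frozenset({"flute", "vocal"}), frozenset({"flute"})])
# protos now holds prototypes for {flute}, {vocal}, and {flute, vocal}
```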
arXiv Detail & Related papers (2024-09-17T15:13:07Z) - InfoDisent: Explainability of Image Classification Models by Information Disentanglement [10.89767277352967]
We introduce InfoDisent, a hybrid approach to explainability based on the information bottleneck principle.
We demonstrate the effectiveness of InfoDisent through computational experiments and user studies across various datasets.
arXiv Detail & Related papers (2024-09-16T14:39:15Z) - Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation [7.372346036256517]
Prototypical part learning is emerging as a promising approach for making semantic segmentation interpretable.
We propose a method for interpretable semantic segmentation that leverages multi-scale image representation for prototypical part learning.
Experiments conducted on Pascal VOC, Cityscapes, and ADE20K demonstrate that the proposed method increases model sparsity, improves interpretability over existing prototype-based methods, and narrows the performance gap with the non-interpretable counterpart models.
arXiv Detail & Related papers (2024-09-14T17:52:59Z) - Predefined Prototypes for Intra-Class Separation and Disentanglement [10.005120138175206]
Prototypical Learning is based on the idea that there is a point (which we call a prototype) around which the embeddings of a class are clustered.
We propose to predefine prototypes following human-specified criteria, which simplifies the training pipeline and brings several advantages.
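One minimal sketch of the idea, assuming the human-specified criterion is a set of fixed orthonormal directions that the encoder is trained toward; the paper's actual criteria may differ.

```python
import torch
import torch.nn.functional as F

def predefined_prototypes(num_classes: int, dim: int) -> torch.Tensor:
    """Fixed, non-learnable prototypes: here, orthonormal directions.

    One illustrative human-specified criterion (hedged sketch).
    """
    assert dim >= num_classes
    q, _ = torch.linalg.qr(torch.randn(dim, num_classes))
    return q.T  # (num_classes, dim), rows are orthonormal

def prototype_loss(features: torch.Tensor, labels: torch.Tensor,
                   prototypes: torch.Tensor) -> torch.Tensor:
    """Pull each embedding toward its class's fixed prototype."""
    z = F.normalize(features, dim=-1)
    target = prototypes[labels]  # look up the prototype for each label
    return (1.0 - (z * target).sum(dim=-1)).mean()  # mean cosine distance

protos = predefined_prototypes(num_classes=10, dim=64)  # frozen, not a Parameter
loss = prototype_loss(torch.randn(8, 64), torch.randint(0, 10, (8,)), protos)
```

Because the prototypes are frozen, only the encoder is optimized, which is one way such a scheme can simplify the training pipeline.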
arXiv Detail & Related papers (2024-06-23T15:52:23Z) - Automatic Discovery of Visual Circuits [66.99553804855931]
We explore scalable methods for extracting the subgraph of a vision model's computational graph that underlies recognition of a specific visual concept.
We find that our approach extracts circuits that causally affect model output, and that editing these circuits can defend large pretrained models from adversarial attacks.
arXiv Detail & Related papers (2024-04-22T17:00:57Z) - FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models [56.71672127740099]
We focus on the task of image segmentation, which is traditionally solved by training models on closed-vocabulary datasets.
We leverage several relatively small, open-source foundation models for zero-shot open-vocabulary segmentation.
Our approach (dubbed FreeSeg-Diff), which does not rely on any training, outperforms many training-based approaches on both Pascal VOC and COCO datasets.
arXiv Detail & Related papers (2024-03-29T10:38:25Z) - Unicom: Universal and Compact Representation Learning for Image Retrieval [65.96296089560421]
We cluster the large-scale LAION400M into one million pseudo classes based on the joint textual and visual features extracted by the CLIP model.
To alleviate the inter-class conflict introduced by the pseudo-labels, we randomly select partial inter-class prototypes to construct the margin-based softmax loss.
Our method significantly outperforms state-of-the-art unsupervised and supervised image retrieval approaches on multiple benchmarks.
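A hedged sketch of the partial-prototype construction, assuming an ArcFace-style additive angular margin and uniform sampling of negative prototypes; all sizes and hyperparameters are illustrative, not Unicom's reference values.

```python
import torch
import torch.nn.functional as F

def partial_margin_softmax(z, labels, prototypes, margin=0.3, scale=32.0, sample=4096):
    """Margin softmax over the true class plus a random subset of the others.

    Sampling partial inter-class prototypes keeps a million-class softmax
    tractable; the additive angular margin here is an assumption.
    """
    z = F.normalize(z, dim=-1)
    w = F.normalize(prototypes, dim=-1)
    # Randomly choose negative prototypes; always include each item's own class.
    neg = torch.randperm(w.size(0))[:sample]
    idx = torch.unique(torch.cat([labels, neg]))
    logits = scale * (z @ w[idx].T)
    # Remap labels into the sampled index space and apply the angular margin.
    pos = (idx.unsqueeze(0) == labels.unsqueeze(1)).float().argmax(dim=1)
    theta = torch.acos((z * w[labels]).sum(-1).clamp(-1 + 1e-6, 1 - 1e-6))
    logits[torch.arange(z.size(0)), pos] = scale * torch.cos(theta + margin)
    return F.cross_entropy(logits, pos)

protos = torch.randn(100_000, 128)  # stand-in for one million pseudo-class centers
loss = partial_margin_softmax(torch.randn(8, 128), torch.randint(0, 100_000, (8,)), protos)
```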
arXiv Detail & Related papers (2023-04-12T14:25:52Z) - A Closer Look at Few-shot Classification Again [68.44963578735877]
Few-shot classification consists of a training phase and an adaptation phase.
We empirically prove that the training algorithm and the adaptation algorithm can be completely disentangled.
Our meta-analysis for each phase reveals several interesting insights that may help better understand key aspects of few-shot classification.
arXiv Detail & Related papers (2023-01-28T16:42:05Z) - Multimodal Prototype-Enhanced Network for Few-Shot Action Recognition [40.329190454146996]
MultimOdal PRototype-ENhanced Network (MORN) uses the semantic information of label texts as a second modality to enhance prototypes.
We conduct extensive experiments on four popular few-shot action recognition datasets.
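A minimal sketch of prototype enhancement with label-text semantics, assuming the fusion is a learned blend of the mean visual support embedding with a text embedding of the label; MORN's actual fusion module may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultimodalPrototype(nn.Module):
    """Blend a visual prototype with a text embedding of its label.

    A hedged sketch of multimodal enhancement; the learnable gate
    is an assumption, not MORN's reference design.
    """

    def __init__(self):
        super().__init__()
        self.gate = nn.Parameter(torch.tensor(0.0))  # learned text/visual balance

    def forward(self, support: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        visual = support.mean(dim=0)      # mean of support-clip embeddings
        alpha = torch.sigmoid(self.gate)  # in (0, 1)
        return F.normalize(alpha * text + (1 - alpha) * visual, dim=-1)

proto = MultimodalPrototype()
p = proto(torch.randn(5, 512), torch.randn(512))  # 5 support clips + label text
```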
arXiv Detail & Related papers (2022-12-09T14:24:39Z) - Sketch-Guided Text-to-Image Diffusion Models [57.12095262189362]
We introduce a universal approach to guide a pretrained text-to-image diffusion model.
Our method does not require training a dedicated model or a specialized encoder for the task.
We take a particular focus on the sketch-to-image translation task, revealing a robust and expressive way to generate images.
arXiv Detail & Related papers (2022-11-24T18:45:32Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - Rethinking Semantic Segmentation: A Prototype View [126.59244185849838]
We present a nonparametric semantic segmentation model based on non-learnable prototypes.
Our framework yields compelling results over several datasets.
We expect this work will provoke a rethink of the current de facto semantic segmentation model design.
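A minimal sketch of segmentation with non-learnable prototypes, assuming each class is represented by k cluster centers of training pixel embeddings and each pixel takes the class of its best-matching prototype; the clustering choice is an assumption.

```python
import torch
import torch.nn.functional as F

def segment_by_prototypes(pixel_feats: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """Nearest-prototype pixel labeling with non-learnable prototypes.

    prototypes: (num_classes, k, dim), e.g. k cluster centers of training
    pixel embeddings per class (an illustrative choice, hedged).
    pixel_feats: (H, W, dim) feature map.
    """
    z = F.normalize(pixel_feats, dim=-1)           # (H, W, D)
    p = F.normalize(prototypes, dim=-1)            # (C, K, D)
    sims = torch.einsum("hwd,ckd->hwck", z, p)     # cosine to every prototype
    # A pixel's class score is its best-matching prototype within that class.
    return sims.max(dim=-1).values.argmax(dim=-1)  # (H, W) class map

labels = segment_by_prototypes(torch.randn(32, 32, 64), torch.randn(21, 10, 64))
```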
arXiv Detail & Related papers (2022-03-28T21:15:32Z) - Interpretable Image Classification with Differentiable Prototypes Assignment [7.660883761395447]
We introduce ProtoPool, an interpretable image classification model with a pool of prototypes shared by the classes.
It is obtained by introducing a fully differentiable assignment of prototypes to particular classes.
We show that ProtoPool obtains state-of-the-art accuracy on the CUB-200-2011 and the Stanford Cars datasets, substantially reducing the number of prototypes.
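A minimal sketch of a differentiable prototype-to-class assignment, using a Gumbel-Softmax over an assignment tensor so the mapping stays trainable while annealing toward discrete choices; the module and hyperparameters are illustrative, not ProtoPool's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableAssignment(nn.Module):
    """Soft, trainable assignment of a shared prototype pool to classes."""

    def __init__(self, num_prototypes: int, slots_per_class: int, num_classes: int):
        super().__init__()
        # One distribution over the prototype pool per (class, slot).
        self.logits = nn.Parameter(torch.randn(num_classes, slots_per_class, num_prototypes))

    def forward(self, proto_scores: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
        # proto_scores: (batch, num_prototypes) similarity to each pooled prototype.
        assign = F.gumbel_softmax(self.logits, tau=tau, dim=-1)  # (C, S, P)
        # Each class slot reads out the score of its (softly) chosen prototype.
        slot_scores = torch.einsum("bp,csp->bcs", proto_scores, assign)
        return slot_scores.sum(dim=-1)  # (batch, num_classes) logits

head = DifferentiableAssignment(num_prototypes=200, slots_per_class=10, num_classes=200)
logits = head(torch.rand(4, 200))
```

Annealing tau toward zero makes each slot commit to a single pooled prototype, which is what lets a shared pool stay small and interpretable.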
arXiv Detail & Related papers (2021-12-06T10:03:32Z) - Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
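A generic prototype-space contrastive term capturing the "increase inter-class, reduce intra-class distance" idea; the paper's exact dual formulation may differ, and the names are illustrative.

```python
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(protos: torch.Tensor, classes: torch.Tensor,
                               tau: float = 0.1) -> torch.Tensor:
    """InfoNCE over prototypes: same class attracts, other classes repel."""
    p = F.normalize(protos, dim=-1)
    sim = (p @ p.T) / tau
    sim.fill_diagonal_(float("-inf"))  # a prototype is not its own positive
    same = classes.unsqueeze(0) == classes.unsqueeze(1)
    same &= ~torch.eye(len(classes), dtype=torch.bool)
    log_prob = sim - sim.logsumexp(dim=-1, keepdim=True)
    # Average log-probability of picking a same-class prototype as the positive.
    return -log_prob.masked_fill(~same, 0.0).sum() / same.sum().clamp(min=1)

# e.g. foreground/background prototypes from several episodes, two classes
loss = prototype_contrastive_loss(torch.randn(8, 64), torch.tensor([0, 1] * 4))
```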
arXiv Detail & Related papers (2021-11-09T08:14:50Z) - Hyperbolic Busemann Learning with Ideal Prototypes [14.525985704735055]
In this work, we propose Hyperbolic Busemann Learning for representation learning of arbitrary data.
To compute proximities to ideal prototypes, we introduce the penalised Busemann loss.
Empirically, we show that our approach provides a natural interpretation of classification confidence, while outperforming recent hyperspherical and hyperbolic prototype approaches.
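For reference, the Busemann function on the Poincare ball has the closed form b_p(x) = log(||p - x||^2 / (1 - ||x||^2)) for an ideal point p with ||p|| = 1. A sketch follows; the boundary penalty shown is one common choice to keep embeddings off the boundary, and the paper's exact penalised form may differ.

```python
import torch
import torch.nn.functional as F

def busemann(x: torch.Tensor, p: torch.Tensor) -> torch.Tensor:
    """Busemann function on the Poincare ball for an ideal point p (||p|| = 1).

    Smaller values mean x has moved further toward the ideal prototype p.
    """
    num = (p - x).pow(2).sum(dim=-1)
    den = 1.0 - x.pow(2).sum(dim=-1)
    return torch.log(num / den.clamp(min=1e-8))

def penalised_busemann_loss(x: torch.Tensor, p: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Busemann distance plus a boundary penalty (this penalty, which
    discourages ||x|| -> 1, is an assumption, not the paper's exact term)."""
    penalty = -torch.log((1.0 - x.pow(2).sum(dim=-1)).clamp(min=1e-8))
    return (busemann(x, p) + lam * penalty).mean()

# class prototypes are ideal points: unit vectors on the ball's boundary
p = F.normalize(torch.randn(4, 32), dim=-1)
x = 0.5 * F.normalize(torch.randn(4, 32), dim=-1)  # embeddings inside the ball
loss = penalised_busemann_loss(x, p)
```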
arXiv Detail & Related papers (2021-06-28T08:36:59Z) - On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z) - Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
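A ProtoNCE-style sketch of the prototype contrast, assuming prototypes come from an offline clustering step and each cluster has an estimated concentration phi; the estimation procedure itself is omitted and the names are illustrative.

```python
import torch
import torch.nn.functional as F

def proto_nce(z: torch.Tensor, assignments: torch.Tensor,
              centroids: torch.Tensor, phi: torch.Tensor) -> torch.Tensor:
    """Contrast embeddings against cluster prototypes.

    z: (B, D) embeddings; assignments: (B,) cluster id per embedding from an
    offline clustering step (e.g. k-means, an assumption here);
    centroids: (K, D); phi: (K,) per-cluster concentration estimates.
    """
    z = F.normalize(z, dim=-1)
    c = F.normalize(centroids, dim=-1)
    logits = (z @ c.T) / phi.unsqueeze(0)  # tighter clusters get sharper logits
    return F.cross_entropy(logits, assignments)

# usage with stand-in clustering results
z = torch.randn(16, 128)
centroids = torch.randn(50, 128)
assignments = torch.randint(0, 50, (16,))
phi = torch.full((50,), 0.1)  # per-cluster temperature
loss = proto_nce(z, assignments, centroids, phi)
```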
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.