Soft-CAM: Making black box models self-explainable for high-stakes decisions
- URL: http://arxiv.org/abs/2505.17748v1
- Date: Fri, 23 May 2025 11:15:21 GMT
- Title: Soft-CAM: Making black box models self-explainable for high-stakes decisions
- Authors: Kerol Djoumessi, Philipp Berens
- Abstract summary: Convolutional neural networks (CNNs) are widely used for high-stakes applications like medicine, often surpassing human performance. Most explanation methods rely on post-hoc attribution, approximating the decision-making process of already trained black-box models. We introduce SoftCAM, a straightforward yet effective approach that makes standard CNN architectures inherently interpretable.
- Score: 6.635611625764804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural networks (CNNs) are widely used for high-stakes applications like medicine, often surpassing human performance. However, most explanation methods rely on post-hoc attribution, approximating the decision-making process of already trained black-box models. These methods are often sensitive, unreliable, and fail to reflect true model reasoning, limiting their trustworthiness in critical applications. In this work, we introduce SoftCAM, a straightforward yet effective approach that makes standard CNN architectures inherently interpretable. By removing the global average pooling layer and replacing the fully connected classification layer with a convolution-based class evidence layer, SoftCAM preserves spatial information and produces explicit class activation maps that form the basis of the model's predictions. Evaluated on three medical datasets, SoftCAM maintains classification performance while significantly improving both the qualitative and quantitative explanation compared to existing post-hoc methods. Our results demonstrate that CNNs can be inherently interpretable without compromising performance, advancing the development of self-explainable deep learning for high-stakes decision-making.
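The architectural change described in the abstract is small enough to sketch directly: drop global average pooling, let a 1x1 convolution produce one spatial evidence map per class, and pool those maps into logits so that the explanation and the prediction come from the same tensor. Below is a minimal PyTorch sketch of this idea; the ResNet-18 backbone, the 1x1 kernel, and plain average pooling of the evidence maps are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class SoftCAMHead(nn.Module):
    """Sketch of a SoftCAM-style classifier: a 1x1 conv produces one
    spatial 'class evidence' map per class; pooling those maps yields
    the logits, so prediction and explanation share one tensor."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        # Replaces the usual GAP + fully connected classifier.
        self.evidence = nn.Conv2d(in_channels, num_classes, kernel_size=1)

    def forward(self, features: torch.Tensor):
        maps = self.evidence(features)      # (B, num_classes, H, W)
        logits = maps.mean(dim=(2, 3))      # spatial pooling -> (B, num_classes)
        return logits, maps                 # maps double as class activation maps

backbone = resnet18(weights=None)
feature_extractor = nn.Sequential(*list(backbone.children())[:-2])  # keep spatial dims
head = SoftCAMHead(in_channels=512, num_classes=2)

x = torch.randn(1, 3, 224, 224)
logits, cam = head(feature_extractor(x))
print(logits.shape, cam.shape)  # torch.Size([1, 2]) torch.Size([1, 2, 7, 7])
```

Because each logit is just the spatially pooled value of its evidence map, every map is by construction an exact spatial breakdown of its class score, rather than a post-hoc approximation.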
Related papers
- AdaCBM: An Adaptive Concept Bottleneck Model for Explainable and Accurate Diagnosis [38.16978432272716]
The integration of vision-language models such as CLIP and Concept Bottleneck Models (CBMs) offers a promising approach to explaining deep neural network (DNN) decisions.
While CLIP provides both explainability and zero-shot classification capability, its pre-training on generic image and text data may limit its classification accuracy and applicability to medical image diagnostic tasks.
This paper takes an unconventional approach by re-examining the CBM framework through the lens of its geometrical representation as a simple linear classification system.
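Reading a CBM as a simple linear classification system is easiest to see in code: features are mapped to interpretable concept scores, and a plain linear layer over those scores yields the prediction. The sketch below is a generic CBM under assumed dimensions, not the AdaCBM model itself; in CLIP-based variants, the concept scores would come from image-text similarities rather than a learned linear map.

```python
import torch
import torch.nn as nn

class ConceptBottleneck(nn.Module):
    """Generic CBM sketch: features -> concept scores -> linear classifier.
    The final stage is a plain linear map over the concept scores, which
    is what makes the 'simple linear classification system' reading possible."""

    def __init__(self, feat_dim=512, num_concepts=20, num_classes=5):
        super().__init__()
        self.to_concepts = nn.Linear(feat_dim, num_concepts)   # interpretable bottleneck
        self.classifier = nn.Linear(num_concepts, num_classes)

    def forward(self, feats):
        concepts = self.to_concepts(feats)
        return self.classifier(concepts), concepts

model = ConceptBottleneck()
logits, concepts = model(torch.randn(4, 512))
print(logits.shape, concepts.shape)  # torch.Size([4, 5]) torch.Size([4, 20])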
arXiv Detail & Related papers (2024-08-04T11:59:09Z)
- FM-G-CAM: A Holistic Approach for Explainable AI in Computer Vision [0.6215404942415159]
We emphasise the need to understand predictions of Computer Vision models, specifically Convolutional Neural Network (CNN) based models.
Existing methods of explaining CNN predictions are mostly based on Gradient-weighted Class Activation Maps (Grad-CAM) and solely focus on a single target class.
We present an exhaustive methodology called Fused Multi-class Gradient-weighted Class Activation Map (FM-G-CAM) that considers multiple top predicted classes.
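A sketch of the multi-class idea, assuming the standard Grad-CAM recipe (gradient-weighted channel sums, ReLU, per-class normalisation) and a softmax-weighted fusion of the top-k maps; the exact fusion rule used by FM-G-CAM may differ.

```python
import torch
import torch.nn.functional as F

def fused_multiclass_cam(features, logits, k=3):
    """Grad-CAM maps for the top-k classes, fused into one saliency map.
    features: (1, C, H, W) conv activations with requires_grad=True.
    logits:   (1, num_classes) scores computed from those activations,
              without detaching, so autograd can reach the activations."""
    topk = logits.topk(k, dim=1).indices[0]
    cams = []
    for cls in topk:
        grads = torch.autograd.grad(logits[0, cls], features, retain_graph=True)[0]
        weights = grads.mean(dim=(2, 3), keepdim=True)   # per-channel importance
        cam = F.relu((weights * features).sum(dim=1))    # (1, H, W)
        cams.append(cam / (cam.max() + 1e-8))            # per-class normalisation
    cams = torch.stack(cams, dim=1)                      # (1, k, H, W)
    fusion = torch.softmax(logits[0, topk], dim=0).view(1, k, 1, 1)
    return (fusion * cams).sum(dim=1)                    # fused saliency map
```

Weighting each class map by its softmax score lets the fused map reflect competing hypotheses instead of committing to a single target class.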
arXiv Detail & Related papers (2023-12-10T19:33:40Z)
- BroadCAM: Outcome-agnostic Class Activation Mapping for Small-scale Weakly Supervised Applications [69.22739434619531]
We propose an outcome-agnostic CAM approach, called BroadCAM, for small-scale weakly supervised applications.
Evaluated on VOC2012 and BCSS-WSSS for WSSS and on OpenImages30k for WSOL, BroadCAM demonstrates superior performance.
arXiv Detail & Related papers (2023-09-07T06:45:43Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers [33.82339346293966]
Convolutional neural networks (CNNs) have achieved excellent performance on classification tasks.
They are nevertheless widely regarded as 'black boxes' whose prediction mechanisms are hard to understand.
We propose the first completely automatic model diagnosing and treating tool, termed Model Doctor.
arXiv Detail & Related papers (2021-12-09T14:05:00Z)
- Localized Perturbations For Weakly-Supervised Segmentation of Glioma Brain Tumours [0.5801621787540266]
This work proposes the use of localized perturbations as a weakly-supervised solution to extract segmentation masks of brain tumours from a pretrained 3D classification model.
We also propose a novel optimal perturbation method that exploits 3D superpixels to find the most relevant area for a given classification using a U-net architecture.
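Perturbation-based attribution rests on a simple test: occlude a region and measure how much the class score drops. The paper optimises localized 3D-superpixel perturbations; the minimal 2D sliding-window sketch below conveys only that core mechanism, and the patch size, stride, and fill value are illustrative assumptions.

```python
import torch

@torch.no_grad()
def occlusion_map(model, image, target, patch=16, stride=8, fill=0.0):
    """Score drop under patch occlusion, as a coarse saliency map.
    image: (1, C, H, W); target: index of the class to explain."""
    base = model(image).softmax(dim=1)[0, target]
    _, _, H, W = image.shape
    heat = torch.zeros((H - patch) // stride + 1, (W - patch) // stride + 1)
    for i, y in enumerate(range(0, H - patch + 1, stride)):
        for j, x in enumerate(range(0, W - patch + 1, stride)):
            occluded = image.clone()
            occluded[..., y:y + patch, x:x + patch] = fill
            heat[i, j] = base - model(occluded).softmax(dim=1)[0, target]
    return heat  # large values = regions the prediction depends on
```

Thresholding such a map is one simple way to turn a classification model into a weak segmenter, which is the spirit of the approach summarised above.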
arXiv Detail & Related papers (2021-11-29T21:01:20Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
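Eigen-CAM's core computation needs no gradients or class labels: reshape the conv activations so each spatial location is a row, then project the rows onto their first principal component. A minimal sketch follows; the mean-centring and min-max normalisation steps are common implementation choices rather than details fixed by the summary above.

```python
import torch

def eigen_cam(features: torch.Tensor) -> torch.Tensor:
    """First-principal-component activation map in the spirit of Eigen-CAM.
    features: (C, H, W) activations from a convolutional layer."""
    C, H, W = features.shape
    A = features.reshape(C, H * W).T          # (H*W, C): one row per location
    A = A - A.mean(dim=0, keepdim=True)       # centre before PCA (an assumption)
    _, _, Vh = torch.linalg.svd(A, full_matrices=False)
    cam = (A @ Vh[0]).reshape(H, W)           # project onto first principal axis
    cam = cam - cam.min()
    return cam / (cam.max() + 1e-8)           # normalise to [0, 1]

print(eigen_cam(torch.randn(64, 7, 7)).shape)  # torch.Size([7, 7])
```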
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
- Interpretable Learning-to-Rank with Generalized Additive Models [78.42800966500374]
Interpretability of learning-to-rank models is a crucial yet relatively under-examined research area.
Recent progress on interpretable ranking models largely focuses on generating post-hoc explanations for existing black-box ranking models.
We lay the groundwork for intrinsically interpretable learning-to-rank by introducing generalized additive models (GAMs) into ranking tasks.
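A GAM scores each item as a sum of independent per-feature shape functions, so each feature's contribution can be plotted and inspected on its own. The sketch below is a generic neural GAM ranking scorer under assumed sizes, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GAMRanker(nn.Module):
    """Additive ranking scorer: score(x) = sum_i f_i(x_i).
    Each shape function f_i depends on a single feature, which is
    the source of the model's intrinsic interpretability."""

    def __init__(self, num_features: int, hidden: int = 16):
        super().__init__()
        self.shape_fns = nn.ModuleList(
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(num_features)
        )

    def forward(self, x):                            # x: (num_items, num_features)
        contribs = [f(x[:, i:i + 1]) for i, f in enumerate(self.shape_fns)]
        return torch.cat(contribs, dim=1).sum(dim=1)  # one score per item

ranker = GAMRanker(num_features=8)
scores = ranker(torch.randn(10, 8))
print(scores.argsort(descending=True))               # rank items by score
```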
arXiv Detail & Related papers (2020-05-06T01:51:30Z)
- Rectified Meta-Learning from Noisy Labels for Robust Image-based Plant Disease Diagnosis [64.82680813427054]
Plant diseases are one of the main threats to food security and crop production.
One popular approach is to cast the problem as a leaf-image classification task, which can be addressed by powerful convolutional neural networks (CNNs).
We propose a novel framework that incorporates a rectified meta-learning module into the common CNN paradigm to train a noise-robust deep network without extra supervision information.
arXiv Detail & Related papers (2020-03-17T09:51:30Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
The resulting BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
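As a toy illustration of the idea, the sketch below runs min-sum belief propagation (max-product in log space) along a 1D chain, truncated to a single forward and backward sweep; in the paper's setting the unary costs would come from a CNN and propagation would run over image dimensions. The linear pairwise cost and single-sweep schedule are illustrative assumptions.

```python
import torch

def bp_layer_1d(unaries: torch.Tensor, smooth: float = 1.0) -> torch.Tensor:
    """Truncated min-sum belief propagation along a 1D chain.
    unaries: (T, L) per-position label costs. Returns (T, L) beliefs
    (lower is better); the computation is differentiable in unaries."""
    T, L = unaries.shape
    labels = torch.arange(L, dtype=unaries.dtype)
    pairwise = smooth * (labels[:, None] - labels[None, :]).abs()  # linear label cost

    fwd = [torch.zeros(L)]                        # message into position 0
    for t in range(1, T):                         # single left-to-right sweep
        v = unaries[t - 1] + fwd[-1]
        fwd.append((v[:, None] + pairwise).min(dim=0).values)

    bwd = [torch.zeros(L)]                        # message into position T-1
    for t in range(T - 2, -1, -1):                # single right-to-left sweep
        v = unaries[t + 1] + bwd[0]
        bwd.insert(0, (v[:, None] + pairwise).min(dim=0).values)

    return unaries + torch.stack(fwd) + torch.stack(bwd)

beliefs = bp_layer_1d(torch.randn(6, 4))
print(beliefs.argmin(dim=1))                      # per-position MAP label estimate
```

Truncating to a fixed number of sweeps is what makes the layer cheap and trainable end-to-end, at the cost of exactness on loopy graphs.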
arXiv Detail & Related papers (2020-03-13T13:11:35Z)