Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models
- URL: http://arxiv.org/abs/2505.07209v1
- Date: Mon, 12 May 2025 03:31:57 GMT
- Title: Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models
- Authors: Yan Xie, Zequn Zeng, Hao Zhang, Yucheng Ding, Yi Wang, Zhengjue Wang, Bo Chen, Hongwei Liu
- Abstract summary: Concept Bottleneck Models (CBMs) aim to make the decision-making process transparent by exploring an intermediate concept space between the input image and the output prediction. Existing CBMs learn only coarse-grained relations between the whole image and the concepts, largely ignoring local image information. This paper proposes a Disentangled Optimal Transport CBM framework to explore fine-grained visual-concept relations between local image patches and concepts.
- Score: 16.617257464664572
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concept Bottleneck Models (CBMs) aim to make the decision-making process transparent by exploring an intermediate concept space between the input image and the output prediction. Existing CBMs learn only coarse-grained relations between the whole image and the concepts, giving little consideration to local image information and leading to two main drawbacks: i) they often produce spurious visual-concept relations, decreasing model reliability; and ii) although CBMs can explain the importance of every concept to the final prediction, it remains hard to tell which visual region produces that prediction. To solve these problems, this paper proposes a Disentangled Optimal Transport CBM (DOT-CBM) framework that explores fine-grained visual-concept relations between local image patches and concepts. Specifically, we model the concept prediction process as a transportation problem between the patches and concepts, thereby achieving explicit fine-grained feature alignment. We also incorporate orthogonal projection losses within each modality to enhance local feature disentanglement. To further address shortcut issues caused by statistical biases in the data, we use the visual saliency map and concept label statistics as transportation priors. As a result, DOT-CBM can visualize inversion heatmaps, provide more reliable concept predictions, and produce more accurate class predictions. Comprehensive experiments demonstrate that DOT-CBM achieves state-of-the-art performance on several tasks, including image classification, local part detection, and out-of-distribution generalization.
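The transportation formulation in the abstract can be made concrete with a short sketch. The following is a minimal, illustrative Python snippet, not the authors' released implementation: the `sinkhorn` routine, the `saliency` and `concept_prior` arguments, and the hyperparameters `eps` and `n_iters` are all assumptions, shown only to indicate how an entropy-regularized transport plan between patch features and concept embeddings could yield concept scores together with per-concept heatmaps over patches.

```python
# Minimal sketch of optimal-transport alignment between image patches and
# concepts, in the spirit of DOT-CBM's transportation formulation.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def sinkhorn(cost, mu, nu, eps=0.05, n_iters=50):
    """Entropy-regularized OT plan between patch mass mu and concept mass nu.

    cost: (P, C) cost matrix between P patch features and C concepts.
    mu:   (P,) prior mass over patches (e.g., from a visual saliency map).
    nu:   (C,) prior mass over concepts (e.g., from concept label statistics).
    Both mu and nu are assumed to be probability vectors (non-negative, sum 1).
    """
    K = torch.exp(-cost / eps)          # Gibbs kernel
    u = torch.ones_like(mu)
    for _ in range(n_iters):            # alternating Sinkhorn projections
        v = nu / (K.t() @ u)
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]  # transport plan, shape (P, C)

def patch_concept_scores(patch_feats, concept_embs, saliency, concept_prior):
    """Concept activations as transported similarity mass.

    patch_feats:  (P, D) local patch features from a vision backbone.
    concept_embs: (C, D) concept embeddings (e.g., text-encoded concept names).
    """
    patch_feats = F.normalize(patch_feats, dim=-1)
    concept_embs = F.normalize(concept_embs, dim=-1)
    cost = 1.0 - patch_feats @ concept_embs.t()       # cosine cost, (P, C)
    plan = sinkhorn(cost, saliency, concept_prior)    # fine-grained alignment
    # Concept activation = similarity mass routed to each concept; each column
    # of the plan doubles as a heatmap over patches for that concept.
    return ((1.0 - cost) * plan).sum(dim=0), plan
```

Under this reading, the transport priors play the role the abstract assigns them: the saliency map biases patch mass toward salient regions and the concept label statistics bias concept mass, discouraging shortcut alignments. The orthogonal projection losses the paper mentions would act on the patch and concept features themselves and are omitted from this sketch.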
Related papers
- Interpretable Few-Shot Image Classification via Prototypical Concept-Guided Mixture of LoRA Experts [79.18608192761512]
Self-Explainable Models (SEMs) rely on Prototypical Concept Learning (PCL) to make their visual recognition processes more interpretable. We propose a Few-Shot Prototypical Concept Classification framework that mitigates two key challenges under low-data regimes: parametric imbalance and representation misalignment. Our approach consistently outperforms existing SEMs by a notable margin, with 4.2%-8.7% relative gains in 5-way 5-shot classification.
arXiv Detail & Related papers (2025-06-05T06:39:43Z) - DCBM: Data-Efficient Visual Concept Bottleneck Models [13.36057999450821]
Concept Bottleneck Models (CBMs) enhance the interpretability of neural networks by basing predictions on human-understandable concepts. We propose Data-efficient CBMs, which reduce the need for large sample sizes during concept generation while preserving interpretability.
arXiv Detail & Related papers (2024-12-16T09:04:58Z) - MulCPred: Learning Multi-modal Concepts for Explainable Pedestrian Action Prediction [57.483718822429346]
MulCPred is proposed, which explains its predictions based on multi-modal concepts represented by training samples.
MulCPred is evaluated on multiple datasets and tasks.
arXiv Detail & Related papers (2024-09-14T14:15:28Z) - Improving Intervention Efficacy via Concept Realignment in Concept Bottleneck Models [57.86303579812877]
Concept Bottleneck Models (CBMs) ground image classification on human-understandable concepts to allow for interpretable model decisions.
Existing approaches often require numerous human interventions per image to achieve strong performance.
We introduce a trainable concept realignment intervention module, which leverages concept relations to realign concept assignments post-intervention.
arXiv Detail & Related papers (2024-05-02T17:59:01Z) - On the Concept Trustworthiness in Concept Bottleneck Models [39.928868605678744]
Concept Bottleneck Models (CBMs) break down the reasoning process into the input-to-concept mapping and the concept-to-label prediction.
Despite the transparency of the concept-to-label prediction, the mapping from the input to the intermediate concept remains a black box.
A pioneering metric, referred to as the concept trustworthiness score, is proposed to gauge whether the concepts are derived from relevant regions.
An enhanced CBM is introduced, enabling concept predictions to be made specifically from distinct parts of the feature map.
arXiv Detail & Related papers (2024-03-21T12:24:53Z) - Eliminating Information Leakage in Hard Concept Bottleneck Models with Supervised, Hierarchical Concept Learning [17.982131928413096]
Concept Bottleneck Models (CBMs) aim to deliver interpretable and interventionable predictions by bridging features and labels with human-understandable concepts.
CBMs suffer from information leakage, where unintended information beyond the concepts is leaked to the subsequent label prediction.
This paper proposes a new paradigm of CBMs, namely SupCBM, which achieves label prediction via predicted concepts and a deliberately designed intervention matrix.
arXiv Detail & Related papers (2024-02-03T03:50:58Z) - Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Contents generated by recent advanced Text-to-Image (T2I) diffusion models are sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z) - Auxiliary Losses for Learning Generalizable Concept-based Models [5.4066453042367435]
Concept Bottleneck Models (CBMs) have gained popularity since their introduction.
CBMs essentially limit the latent space of a model to human-understandable high-level concepts.
We propose the cooperative Concept Bottleneck Model (coop-CBM) to overcome the performance trade-off that the concept bottleneck introduces.
arXiv Detail & Related papers (2023-11-18T15:50:07Z) - Explainable fetal ultrasound quality assessment with progressive concept bottleneck models [6.734637459963132]
We propose a holistic and explainable method for fetal ultrasound quality assessment. We introduce human-readable "concepts" into the task and imitate the sequential expert decision-making process. Experiments show that our model outperforms equivalent concept-free models on an in-house dataset.
arXiv Detail & Related papers (2022-11-19T09:31:19Z) - Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z) - Post-hoc Concept Bottleneck Models [11.358495577593441]
Concept Bottleneck Models (CBMs) map the inputs onto a set of interpretable concepts and use the concepts to make predictions.
CBMs are restrictive in practice as they require concept labels in the training data to learn the bottleneck and do not leverage strong pretrained models.
We show that we can turn any neural network into a Post-hoc CBM (PCBM) without sacrificing model performance while still retaining interpretability benefits.
arXiv Detail & Related papers (2022-05-31T00:29:26Z) - PDC-Net+: Enhanced Probabilistic Dense Correspondence Network [161.76275845530964]
We present the Enhanced Probabilistic Dense Correspondence Network, PDC-Net+, capable of estimating accurate dense correspondences.
We develop an architecture and an enhanced training strategy tailored for robust and generalizable uncertainty prediction.
Our approach obtains state-of-the-art results on multiple challenging geometric matching and optical flow datasets.
arXiv Detail & Related papers (2021-09-28T17:56:41Z)