LLEXICORP: End-user Explainability of Convolutional Neural Networks
- URL: http://arxiv.org/abs/2511.02720v1
- Date: Tue, 04 Nov 2025 16:44:45 GMT
- Title: LLEXICORP: End-user Explainability of Convolutional Neural Networks
- Authors: Vojtěch Kůr, Adam Bajger, Adam Kukučka, Marek Hradil, Vít Musil, Tomáš Brázdil
- Abstract summary: Concept relevance propagation (CRP) methods can backtrack predictions to individual channels and find the images that most activate them. Current CRP workflows are largely manual: experts must inspect activation images to name the discovered concepts and must synthesize verbose explanations from relevance maps. Our approach automatically assigns descriptive names to concept prototypes and generates natural-language explanations. Our findings suggest that integrating concept-based attribution methods with large language models can significantly lower the barrier to interpreting deep neural networks.
- Score: 4.417922173735815
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural networks (CNNs) underpin many modern computer vision systems. As their applications range from everyday to safety-critical domains, the need to explain and understand these models and their decisions (XAI) has emerged. Prior work suggests that in the top layers of CNNs, individual channels can be attributed to classifying human-understandable concepts. Concept relevance propagation (CRP) methods can backtrack predictions to these channels and find the images that most activate them. However, current CRP workflows are largely manual: experts must inspect activation images to name the discovered concepts and must synthesize verbose explanations from relevance maps, limiting the accessibility and scalability of the explanations. To address these issues, we introduce Large Language model EXplaIns COncept Relevance Propagation (LLEXICORP), a modular pipeline that couples CRP with a multimodal large language model. Our approach automatically assigns descriptive names to concept prototypes and generates natural-language explanations that translate quantitative relevance distributions into intuitive narratives. To ensure faithfulness, we craft prompts that teach the language model the semantics of CRP through examples and enforce a separation between the naming and explanation tasks. The resulting text can be tailored to different audiences, offering low-level technical descriptions for experts and high-level summaries for non-technical stakeholders. We qualitatively evaluate our method on a VGG16 model using various images from ImageNet. Our findings suggest that integrating concept-based attribution methods with large language models can significantly lower the barrier to interpreting deep neural networks, paving the way for more transparent AI systems.
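The abstract does not come with code; the sketch below is a minimal, hypothetical illustration of the two stages it describes, collecting a channel's most-activating prototype images and handing them to a multimodal LLM for naming. It is not the authors' pipeline: mean channel activation stands in for true CRP relevance scores, and query_multimodal_llm is a placeholder stub for whichever multimodal LLM is plugged in.

```python
# Minimal sketch (not the authors' code) of a CRP-plus-LLM concept-naming
# loop in the spirit of LLEXICORP. Assumptions: a torchvision VGG16 stands
# in for the paper's model, mean channel activation approximates true CRP
# relevance, and query_multimodal_llm is a hypothetical stub.
import torch
import torchvision.models as models


def top_activating_images(model, layer, images, channel, k=8):
    """Return the k images with the highest mean activation in `channel` of `layer`."""
    captured = []
    handle = layer.register_forward_hook(
        lambda module, inputs, output: captured.append(output.detach())
    )
    with torch.no_grad():
        model(images)
    handle.remove()
    # captured[0] has shape (N, C, H, W); score each image by its channel mean
    scores = captured[0][:, channel].mean(dim=(1, 2))
    return images[scores.topk(min(k, len(images))).indices]


def query_multimodal_llm(prompt, images):
    """Hypothetical placeholder: swap in a real multimodal LLM client here."""
    raise NotImplementedError("connect a multimodal LLM of your choice")


def name_concept(prototypes):
    """Ask the LLM to name the visual concept shared by a channel's prototypes."""
    prompt = ("These images are prototypes of a single CNN channel. "
              "Name the visual concept they have in common in a few words.")
    return query_multimodal_llm(prompt, prototypes)


model = models.vgg16(weights="IMAGENET1K_V1").eval()
layer = model.features[28]             # last conv layer of VGG16
batch = torch.randn(32, 3, 224, 224)   # stand-in for a batch of ImageNet images
prototypes = top_activating_images(model, layer, batch, channel=0)
```

The explanation stage would follow the same pattern with a second, separate prompt that translates per-concept relevance distributions into prose, mirroring the separation between naming and explanation tasks that the abstract describes.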
Related papers
- Concept-Based Mechanistic Interpretability Using Structured Knowledge Graphs [3.429783703166407]
Our framework enables a global dissection of model behavior by analyzing how high-level semantic attributes emerge, interact, and propagate through internal model components. A key innovation is our visualization platform, BAGEL, which presents these insights in a structured knowledge graph. Our framework is model-agnostic, scalable, and contributes to a deeper understanding of how deep learning models generalize (or fail to) in the presence of dataset biases.
arXiv Detail & Related papers (2025-07-08T09:30:20Z) - Concept-Guided Interpretability via Neural Chunking [64.6429903327095]
We show that neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We propose three methods to extract recurring chunks on a neural population level. Our work points to a new direction for interpretability, one that harnesses both cognitive principles and the structure of naturalistic data.
arXiv Detail & Related papers (2025-05-16T13:49:43Z) - VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow [57.96482272333649]
Feature visualization (FV) is a powerful tool to decode what information neurons are responding to. We propose to guide FV through statistics of prototypical image features combined with measures of relevant network flow to generate images. Our approach yields human-understandable visualizations that both qualitatively and quantitatively improve over state-of-the-art FVs.
arXiv Detail & Related papers (2025-03-28T13:08:18Z) - Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations [7.736445799116692]
Concept-based methods have become a popular choice for explaining deep neural networks post-hoc. We devise a reinforcement learning-based preference optimization algorithm that fine-tunes a vision-language generative model. We demonstrate our method's ability to efficiently and reliably articulate diverse concepts.
arXiv Detail & Related papers (2024-08-24T02:26:42Z) - Restyling Unsupervised Concept Based Interpretable Networks with Generative Models [14.604305230535026]
We propose a novel method that relies on mapping the concept features to the latent space of a pretrained generative model. We quantitatively ascertain the efficacy of our method in terms of accuracy of the interpretable prediction network, fidelity of reconstruction, as well as faithfulness and consistency of learnt concepts.
arXiv Detail & Related papers (2024-07-01T14:39:41Z) - Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - Manipulating Feature Visualizations with Gradient Slingshots [53.94925202421929]
Feature Visualization (FV) is a widely used technique for interpreting the concepts learned by Deep Neural Networks (DNNs). We introduce a novel method, Gradient Slingshots, that enables manipulation of FV without modifying the model architecture or significantly degrading its performance.
arXiv Detail & Related papers (2024-01-11T18:57:17Z) - Finding Representative Interpretations on Convolutional Neural Networks [43.25913447473829]
We develop a novel unsupervised approach to produce a highly representative interpretation for a large number of similar images.
We formulate the problem of finding representative interpretations as a co-clustering problem, and convert it into a submodular cost submodular cover problem.
Our experiments demonstrate the excellent performance of our method.
arXiv Detail & Related papers (2021-08-13T20:17:30Z) - A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts [38.215184251799194]
We propose a framework (VRX) to interpret classification NNs with intuitive structural visual concepts.
By means of knowledge distillation, we show VRX can take a step towards mimicking the reasoning process of NNs.
arXiv Detail & Related papers (2021-05-01T15:47:42Z) - Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, namely structured multi-scale feature learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z) - Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language [148.0843278195794]
We propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.
Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions.
arXiv Detail & Related papers (2020-11-18T20:21:19Z) - Learning Deep Interleaved Networks with Asymmetric Co-Attention for Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA), attached at each interleaved node, to model feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z)