A Survey on Interpretability in Visual Recognition
- URL: http://arxiv.org/abs/2507.11099v1
- Date: Tue, 15 Jul 2025 08:45:54 GMT
- Title: A Survey on Interpretability in Visual Recognition
- Authors: Qiyang Wan, Chengzhi Gao, Ruiping Wang, Xilin Chen
- Abstract summary: This paper systematically reviews existing research on the interpretability of visual recognition models. We propose a taxonomy of methods from a human-centered perspective. We aim to organize existing research in this domain and inspire future investigations into the interpretability of visual recognition models.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, visual recognition methods have advanced significantly, finding applications across diverse fields. While researchers seek to understand the mechanisms behind the success of these models, there is also a growing impetus to deploy them in critical areas like autonomous driving and medical diagnostics to better diagnose failures, which promotes the development of interpretability research. This paper systematically reviews existing research on the interpretability of visual recognition models and proposes a taxonomy of methods from a human-centered perspective. The proposed taxonomy categorizes interpretable recognition methods based on Intent, Object, Presentation, and Methodology, thereby establishing a systematic and coherent set of grouping criteria for these XAI methods. Additionally, we summarize the requirements for evaluation metrics and explore new opportunities enabled by recent technologies, such as large multimodal models. We aim to organize existing research in this domain and inspire future investigations into the interpretability of visual recognition models.
Related papers
- Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
- Biomedical Foundation Model: A Survey [84.26268124754792]
Foundation models are large-scale pre-trained models that learn from extensive unlabeled datasets. These models can be adapted to various applications such as question answering and visual understanding. This survey explores the potential of foundation models across diverse domains within biomedical fields.
arXiv Detail & Related papers (2025-03-03T22:42:00Z)
- Methods and Trends in Detecting Generated Images: A Comprehensive Review [0.552480439325792]
Generative Adversarial Networks (GANs), Diffusion Models, and Variational Autoencoders (VAEs) have enabled the synthesis of high-quality multimedia data. These advancements have also raised significant concerns regarding adversarial attacks, unethical usage, and societal harm.
arXiv Detail & Related papers (2025-02-21T03:16:18Z)
- A Survey of Models for Cognitive Diagnosis: New Developments and Future Directions [66.40362209055023]
This paper provides a survey of current models for cognitive diagnosis, with particular attention to new developments using machine learning-based methods.
By comparing the model structures, parameter estimation algorithms, model evaluation methods and applications, we provide a relatively comprehensive review of the recent trends in cognitive diagnosis models.
arXiv Detail & Related papers (2024-07-07T18:02:00Z)
- Out-of-distribution Detection in Medical Image Analysis: A survey [12.778646136644399]
Computer-aided diagnostics has benefited from the development of deep learning-based computer vision techniques.
Traditional supervised deep learning methods assume that test samples are drawn from the same distribution as the training data.
It is possible to encounter out-of-distribution samples in real-world clinical scenarios, which may cause silent failure in deep learning-based medical image analysis tasks.
arXiv Detail & Related papers (2024-04-28T18:51:32Z)
- Recognizing Identities From Human Skeletons: A Survey on 3D Skeleton Based Person Re-Identification [60.939250172443586]
Person re-identification via 3D skeletons (SRID) is an important emerging research area that attracts increasing attention within the pattern recognition community. We provide a comprehensive review and analysis of recent SRID advances. A thorough evaluation of state-of-the-art SRID methods is conducted over various types of benchmarks and protocols to compare their effectiveness and efficiency.
arXiv Detail & Related papers (2024-01-27T04:52:24Z)
- A Survey on Interpretable Cross-modal Reasoning [64.37362731950843]
Cross-modal reasoning (CMR) has emerged as a pivotal area with applications spanning from multimedia analysis to healthcare diagnostics.
This survey delves into the realm of interpretable cross-modal reasoning (I-CMR), presenting a comprehensive overview of the typical methods organized under a three-level taxonomy.
arXiv Detail & Related papers (2023-09-05T05:06:48Z)
- A Survey of Explainable AI in Deep Visual Modeling: Methods and Metrics [24.86176236641865]
We present the first survey in Explainable AI that focuses on the methods and metrics for interpreting deep visual models.
Covering the landmark contributions along with the state of the art, we not only provide a taxonomic organization of the existing techniques but also survey a range of evaluation metrics.
arXiv Detail & Related papers (2023-01-31T06:49:42Z)
- TorchEsegeta: Framework for Interpretability and Explainability of Image-based Deep Learning Models [0.0]
Clinicians are often sceptical about applying automatic image processing approaches, especially deep learning based methods, in practice.
This paper presents approaches that help to interpret and explain the results of deep learning algorithms by depicting the anatomical areas which influence the decision of the algorithm most.
The research presents a unified framework, TorchEsegeta, for applying various interpretability and explainability techniques to deep learning models.
arXiv Detail & Related papers (2021-10-16T01:00:15Z)
- Deep Gait Recognition: A Survey [15.47582611826366]
Gait recognition is an appealing biometric modality that aims to identify individuals based on the way they walk.
Deep learning has reshaped the research landscape in this area since 2015 through the ability to automatically learn discriminative representations.
We present a comprehensive overview of breakthroughs and recent developments in gait recognition with deep learning.
arXiv Detail & Related papers (2021-02-18T18:49:28Z)
- Recent Progress in Appearance-based Action Recognition [73.6405863243707]
Action recognition is the task of identifying various human actions in a video.
Recent appearance-based methods have achieved promising progress towards accurate action recognition.
arXiv Detail & Related papers (2020-11-25T10:18:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed papers) and is not responsible for any consequences of its use.