A Comprehensive Survey on the Risks and Limitations of Concept-based Models
- URL: http://arxiv.org/abs/2506.04237v1
- Date: Sun, 25 May 2025 03:53:26 GMT
- Title: A Comprehensive Survey on the Risks and Limitations of Concept-based Models
- Authors: Sanchit Sinha, Aidong Zhang
- Abstract summary: Concept-based Models are inherently explainable networks that improve upon standard Deep Neural Networks. These models are highly successful in critical applications like medical diagnosis and financial risk prediction. However, recent research has uncovered significant limitations in the structure of such networks.
- Score: 33.641361996627175
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Concept-based Models are a class of inherently explainable networks that improve upon standard Deep Neural Networks by providing a rationale behind their predictions using human-understandable 'concepts'. With these models being highly successful in critical applications like medical diagnosis and financial risk prediction, there is a natural push toward their wider adoption in sensitive domains to instill greater trust among diverse stakeholders. However, recent research has uncovered significant limitations in the structure of such networks, their training procedure, underlying assumptions, and their susceptibility to adversarial vulnerabilities. In particular, issues such as concept leakage, entangled representations, and limited robustness to perturbations pose challenges to their reliability and generalization. Additionally, the effectiveness of human interventions in these models remains an open question, raising concerns about their real-world applicability. In this paper, we provide a comprehensive survey on the risks and limitations associated with Concept-based Models. In particular, we focus on aggregating commonly encountered challenges and the architecture choices mitigating these challenges for Supervised and Unsupervised paradigms. We also examine recent advances in improving their reliability and discuss open problems and promising avenues of future research in this domain.
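To make the architecture concrete, below is a minimal sketch of a concept bottleneck model, the canonical Concept-based Model design, assuming PyTorch: inputs are first mapped to human-understandable concept activations, and the label is predicted from those concepts alone, which is also what makes test-time human intervention possible. The layer sizes, names, and intervention interface are illustrative, not the survey's code.

```python
# A minimal concept bottleneck model: inputs are mapped to human-understandable
# concept activations, and the final prediction uses only those concepts.
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, in_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        # x -> c: concept predictor (any backbone would work here)
        self.concept_net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, n_concepts)
        )
        # c -> y: label predictor sees only the concept bottleneck
        self.label_net = nn.Linear(n_concepts, n_classes)

    def forward(self, x, interventions=None):
        c = torch.sigmoid(self.concept_net(x))  # predicted concept probabilities
        if interventions is not None:
            # Human intervention: overwrite selected concepts with ground truth.
            # `interventions` maps concept index -> corrected value in [0, 1].
            c = c.clone()
            for idx, value in interventions.items():
                c[:, idx] = value
        return self.label_net(c), c

model = ConceptBottleneckModel(in_dim=64, n_concepts=10, n_classes=2)
x = torch.randn(8, 64)
logits, concepts = model(x)                          # standard prediction
logits_fixed, _ = model(x, interventions={3: 1.0})   # expert corrects concept 3
```

Because the label head consumes only the concept vector, correcting a concept at test time directly changes the prediction; concept leakage refers to extra, unintended information slipping through that bottleneck.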
Related papers
- Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing. Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest. This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
- Robustness in Large Language Models: A Survey of Mitigation Strategies and Evaluation Metrics [0.7481505949203433]
Large Language Models (LLMs) have emerged as a promising cornerstone for the development of natural language processing (NLP) and artificial intelligence (AI). This survey provides a comprehensive overview of current studies in this area.
arXiv Detail & Related papers (2025-05-24T11:50:52Z)
- Intrinsic Barriers to Explaining Deep Foundation Models [17.952353851860742]
Deep Foundation Models (DFMs) offer unprecedented capabilities, but their increasing complexity presents profound challenges to understanding their internal workings. This paper delves into this critical question by examining the fundamental characteristics of DFMs and scrutinizing the limitations encountered by current explainability methods.
arXiv Detail & Related papers (2025-04-21T21:19:23Z)
- Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.36487127683053]
Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC). RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. Despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues (a toy retrieval sketch follows this entry).
arXiv Detail & Related papers (2025-02-08T06:50:47Z)
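For context on the pattern being surveyed, here is a toy sketch of the RAG loop in plain Python: retrieve the passages most similar to the query and prepend them to the prompt so the generator can ground its answer. The bag-of-words embedder and the three-document corpus are stand-ins for a real embedding model and document store.

```python
# Toy RAG loop: score corpus passages against the query, keep the top-k,
# and splice them into the prompt that a generator would receive.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Stand-in embedding: a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

corpus = [
    "Concept bottleneck models predict labels from human-labeled concepts.",
    "Diffusion models generate images by iteratively denoising noise.",
    "Concept leakage lets unintended information bypass the bottleneck.",
]

def retrieve(query: str, k: int = 2) -> list:
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

query = "What is concept leakage?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be passed to an LLM for generation
```

The risks the survey catalogs enter at exactly these seams: a poisoned corpus, an adversarial query, or private documents leaking into the prompt.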
- On the Societal Impact of Open Foundation Models [93.67389739906561]
We focus on open foundation models, defined here as those with broadly available model weights.
We identify five distinctive properties of open foundation models that lead to both their benefits and risks.
arXiv Detail & Related papers (2024-02-27T16:49:53Z)
- Towards Improving Robustness Against Common Corruptions using Mixture of Class Specific Experts [10.27974860479791]
This paper introduces a novel paradigm known as the Mixture of Class-Specific Expert Architecture.
The proposed architecture aims to mitigate vulnerabilities associated with common neural network structures (a minimal gating sketch follows this entry).
arXiv Detail & Related papers (2023-11-16T20:09:47Z)
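As a rough illustration of the named paradigm, below is a minimal mixture-of-experts sketch, assuming PyTorch, with one small expert per class and a gating network that softly weights their outputs. The paper's exact architecture may differ; names and sizes are illustrative.

```python
# Minimal mixture of class-specific experts: one small expert per class,
# plus a gating network that produces soft routing weights over them.
import torch
import torch.nn as nn

class ClassSpecificMoE(nn.Module):
    def __init__(self, in_dim: int, n_classes: int):
        super().__init__()
        # one expert per class, each scoring only its own class
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(in_dim, 16), nn.ReLU(), nn.Linear(16, 1))
             for _ in range(n_classes)]
        )
        self.gate = nn.Linear(in_dim, n_classes)  # soft routing over experts

    def forward(self, x):
        scores = torch.cat([e(x) for e in self.experts], dim=1)  # (B, C)
        weights = torch.softmax(self.gate(x), dim=1)             # (B, C)
        return weights * scores  # gated per-class logits

model = ClassSpecificMoE(in_dim=32, n_classes=4)
logits = model(torch.randn(8, 32))
print(logits.shape)  # torch.Size([8, 4])
```

The intuition is that a corruption that degrades one expert's features need not degrade the others, so the gated ensemble can remain usable where a single monolithic network fails.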
- A Survey on Transferability of Adversarial Examples across Deep Neural Networks [53.04734042366312]
Adversarial examples can manipulate machine learning models into making erroneous predictions.
The transferability of adversarial examples enables black-box attacks which circumvent the need for detailed knowledge of the target model.
This survey explores the landscape of the transferability of adversarial examples (a minimal transfer-attack sketch follows this entry).
arXiv Detail & Related papers (2023-10-26T17:45:26Z)
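To illustrate the transfer setting this survey covers, below is a minimal sketch, assuming PyTorch: adversarial examples are crafted with single-step FGSM on a surrogate model and then evaluated on a separate target model. The models here are untrained stand-ins; the point is only the mechanics of a black-box transfer attack.

```python
# Black-box transfer attack sketch: craft perturbations on a surrogate model
# with FGSM and test whether they also fool a separate target model.
import torch
import torch.nn as nn

def fgsm(model, x, y, eps=0.03):
    """One-step FGSM: perturb in the sign direction of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()

surrogate = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 3))
target = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

x, y = torch.randn(16, 20), torch.randint(0, 3, (16,))
x_adv = fgsm(surrogate, x, y)  # crafted with surrogate gradients only
transfer = (target(x_adv).argmax(1) != y).float().mean().item()
print(f"target error rate on transferred examples: {transfer:.2f}")
```

The attacker never queries the target's gradients; transferability is precisely the phenomenon that perturbations found on one network remain adversarial for another.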
- Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks [142.67349734180445]
Existing algorithms that provide risk-awareness to deep neural networks are complex and ad hoc.
Here we present capsa, a framework for extending models with risk-awareness.
arXiv Detail & Related papers (2023-08-01T02:07:47Z)
- Understanding and Enhancing Robustness of Concept-based Models [41.20004311158688]
We study the robustness of concept-based models to adversarial perturbations.
In this paper, we first propose and analyze different malicious attacks to evaluate the security vulnerability of concept-based models.
We then propose a potential general adversarial training-based defense mechanism to increase the robustness of these systems to the proposed malicious attacks (a minimal adversarial-training sketch follows this entry).
arXiv Detail & Related papers (2022-11-29T10:43:51Z)
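The defense described above follows the standard adversarial training recipe. Below is a minimal sketch of that recipe, assuming PyTorch, with a PGD inner loop; it is a generic illustration under synthetic data, not the paper's exact mechanism.

```python
# Adversarial training sketch: at each step, perturb the batch with a
# gradient-based attack (inner maximization), then fit the model on the
# perturbed inputs (outer minimization).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

def pgd(x, y, eps=0.1, alpha=0.02, steps=5):
    """Projected gradient descent within an L-infinity ball of radius eps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss_fn(model(x + delta), y).backward()
        delta.data = (delta + alpha * delta.grad.sign()).clamp(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach()

for _ in range(100):  # training loop over synthetic batches
    x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
    x_adv = pgd(x, y)                    # find worst-case inputs
    opt.zero_grad()                      # clear grads accumulated by the attack
    loss_fn(model(x_adv), y).backward()  # train on the perturbed batch
    opt.step()
```

For a concept-based model, the same loop would apply, with the attack optionally targeting the concept predictions rather than the final label.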
- Fairness Increases Adversarial Vulnerability [50.90773979394264]
This paper shows the existence of a dichotomy between fairness and robustness, and analyzes when achieving fairness decreases the model robustness to adversarial samples.
Experiments on non-linear models and different architectures validate the theoretical findings in multiple vision domains.
The paper proposes a simple, yet effective, solution to construct models achieving good tradeoffs between fairness and robustness.
arXiv Detail & Related papers (2022-11-21T19:55:35Z)
- A Survey of Uncertainty in Deep Neural Networks [39.68313590688467]
This survey is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction.
A comprehensive introduction to the most crucial sources of uncertainty is given, and their separation into reducible model uncertainty and irreducible data uncertainty is presented.
For practical applications, we discuss different measures of uncertainty and approaches for the calibration of neural networks, and give an overview of existing baselines and implementations (a minimal MC-dropout sketch follows this entry).
arXiv Detail & Related papers (2021-07-07T16:39:28Z)
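As one concrete example of the measures such a survey covers, below is a minimal Monte Carlo dropout sketch, assuming PyTorch: the spread across stochastic forward passes serves as a proxy for reducible model (epistemic) uncertainty, while irreducible data (aleatoric) uncertainty would need a separate output head. This is one estimator among many, not the survey's recommended method.

```python
# Monte Carlo dropout sketch: keep dropout active at test time, sample
# several stochastic forward passes, and read the spread across passes
# as a proxy for model (epistemic) uncertainty.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.2), nn.Linear(64, 3)
)
model.train()  # keep dropout stochastic even when "evaluating"

x = torch.randn(5, 10)
with torch.no_grad():
    samples = torch.stack(
        [torch.softmax(model(x), dim=-1) for _ in range(50)]
    )  # shape: (50 passes, 5 inputs, 3 classes)

mean_pred = samples.mean(0)          # averaged predictive distribution
epistemic = samples.var(0).sum(-1)   # per-input spread across passes
print(mean_pred.argmax(-1), epistemic)
```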