Explainability for Vision Foundation Models: A Survey
- URL: http://arxiv.org/abs/2501.12203v1
- Date: Tue, 21 Jan 2025 15:18:55 GMT
- Title: Explainability for Vision Foundation Models: A Survey
- Authors: Rémi Kazmierczak, Eloïse Berthier, Goran Frehse, Gianni Franchi,
- Abstract summary: Foundation models occupy an ambiguous position in the explainability domain.
Foundation models are characterized by their extensive generalization capabilities and emergent uses.
We discuss the challenges faced by current research in integrating XAI within foundation models.
- Score: 3.570403495760109
- Abstract: As artificial intelligence systems become increasingly integrated into daily life, the field of explainability has gained significant attention. This trend is particularly driven by the complexity of modern AI models and their decision-making processes. The advent of foundation models, characterized by their extensive generalization capabilities and emergent uses, has further complicated this landscape. Foundation models occupy an ambiguous position in the explainability domain: their complexity makes them inherently challenging to interpret, yet they are increasingly leveraged as tools to construct explainable models. In this survey, we explore the intersection of foundation models and eXplainable AI (XAI) in the vision domain. We begin by compiling a comprehensive corpus of papers that bridge these fields. Next, we categorize these works based on their architectural characteristics. We then discuss the challenges faced by current research in integrating XAI within foundation models. Furthermore, we review common evaluation methodologies for these combined approaches. Finally, we present key observations and insights from our survey, offering directions for future research in this rapidly evolving field.
Related papers
- Brain-inspired Artificial Intelligence: A Comprehensive Review [15.964784631512414]
This review explores the diverse design inspirations that have shaped modern AI models, i.e., brain-inspired artificial intelligence (BIAI).
We present a classification framework that categorizes BIAI approaches into physical structure-inspired and human behavior-inspired models.
We examine the real-world applications where different BIAI models excel, highlighting their practical benefits and deployment challenges.
arXiv Detail & Related papers (2024-08-27T06:49:50Z)
- Vision Foundation Models in Remote Sensing: A Survey [6.036426846159163]
Foundation models are large-scale, pre-trained AI models capable of performing a wide array of tasks with unprecedented accuracy and efficiency.
This survey aims to serve as a resource for researchers and practitioners by providing a panorama of advances and promising pathways for continued development and application of foundation models in remote sensing.
arXiv Detail & Related papers (2024-08-06T22:39:34Z)
- Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities [5.22475289121031]
Multimodal models are expected to be a critical component of future advances in artificial intelligence.
This work provides a fresh perspective on generalist multimodal models via a novel taxonomy specific to architecture and training configuration.
arXiv Detail & Related papers (2024-06-08T15:30:46Z)
- Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond [101.15395503285804]
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI).
In this survey, we embark on a comprehensive exploration of the latest advancements in world models.
We examine challenges and limitations of world models, and discuss their potential future directions.
arXiv Detail & Related papers (2024-05-06T14:37:07Z)
- On the Challenges and Opportunities in Generative AI [135.2754367149689]
We argue that current large-scale generative AI models do not sufficiently address several fundamental issues that hinder their widespread adoption across domains.
In this work, we aim to identify key unresolved challenges in modern generative AI paradigms that should be tackled to further enhance their capabilities, versatility, and reliability.
arXiv Detail & Related papers (2024-02-28T15:19:33Z)
- A Survey of Reasoning with Foundation Models [235.7288855108172]
Reasoning plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.
We introduce seminal foundation models proposed or adaptable for reasoning.
We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models.
arXiv Detail & Related papers (2023-12-17T15:16:13Z)
- Foundation Models Meet Visualizations: Challenges and Opportunities [23.01218856618978]
This paper distinguishes visualizations for foundation models (VIS4FM) from foundation models for visualizations (FM4VIS).
In VIS4FM, we explore the primary role of visualizations in understanding, refining, and evaluating these intricate models.
In FM4VIS, we highlight how foundation models can be utilized to advance the visualization field itself.
arXiv Detail & Related papers (2023-10-09T14:57:05Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems that see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
Models that learn to bridge the gap between such modalities, coupled with large-scale training data, facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, holding interactive dialogues by asking questions about an image or video scene, or manipulating a robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Causal Reasoning Meets Visual Representation Learning: A Prospective Study [117.08431221482638]
Lack of interpretability, robustness, and out-of-distribution generalization are becoming key challenges for existing visual models.
Inspired by the strong inference ability of human-level agents, researchers in recent years have put great effort into developing causal reasoning paradigms.
This paper aims to provide a comprehensive overview of this emerging field, attract attention, encourage discussion, and bring to the forefront the urgency of developing novel causal reasoning methods.
arXiv Detail & Related papers (2022-04-26T02:22:28Z)
- WenLan 2.0: Make AI Imagine via a Multimodal Foundation Model [74.4875156387271]
We develop a novel foundation model pre-trained with huge multimodal (visual and textual) data.
We show that state-of-the-art results can be obtained on a wide range of downstream tasks.
arXiv Detail & Related papers (2021-10-27T12:25:21Z)