This looks like what? Challenges and Future Research Directions for Part-Prototype Models
- URL: http://arxiv.org/abs/2502.09340v1
- Date: Thu, 13 Feb 2025 14:00:55 GMT
- Title: This looks like what? Challenges and Future Research Directions for Part-Prototype Models
- Authors: Khawla Elhadri, Tomasz Michalski, Adam Wróbel, Jörg Schlötterer, Bartosz Zieliński, Christin Seifert,
- Abstract summary: Part-Prototype Models (PPMs) make decisions by comparing an input image to a set of learned prototypes.
Despite their inherent interpretability, PPMs are not yet considered a valuable alternative to post-hoc models.
- Score: 2.1418711158896295
- Abstract: The growing interest in eXplainable Artificial Intelligence (XAI) has prompted research into models with built-in interpretability, the most prominent of which are part-prototype models. Part-Prototype Models (PPMs) make decisions by comparing an input image to a set of learned prototypes, providing human-understandable explanations in the form of "this looks like that". Despite their inherent interpretability, PPMs are not yet considered a valuable alternative to post-hoc models. In this survey, we investigate the reasons for this and provide directions for future research. We analyze papers from 2019 to 2024, and derive a taxonomy of the challenges that current PPMs face. Our analysis shows that the open challenges are quite diverse. The main concern is the quality and quantity of prototypes. Other concerns are the lack of generalization to a variety of tasks and contexts, and general methodological issues, including non-standardized evaluation. We provide ideas for future research in five broad directions: improving predictive performance, developing novel architectures grounded in theory, establishing frameworks for human-AI collaboration, aligning models with humans, and establishing metrics and benchmarks for evaluation. We hope that this survey will stimulate research and promote intrinsically interpretable models for application domains. Our list of surveyed papers is available at https://github.com/aix-group/ppm-survey.
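The "this looks like that" mechanism described in the abstract is compact enough to state in code: embed an image into a grid of patch features, score every patch against each learned prototype, and feed the best matches into a linear classifier. Below is a minimal sketch in the spirit of ProtoPNet-style PPMs; the class name, layer sizes, and log-similarity form are illustrative assumptions, not any surveyed paper's reference implementation.

```python
# Minimal sketch of a part-prototype forward pass in the spirit of
# ProtoPNet. Class name, layer sizes, and hyperparameters are
# illustrative assumptions, not a surveyed paper's reference code.
import torch
import torch.nn as nn

class ToyPPM(nn.Module):
    def __init__(self, num_classes=3, num_prototypes=6, proto_dim=32):
        super().__init__()
        # Backbone producing a spatial grid of patch embeddings.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, proto_dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(proto_dim, proto_dim, kernel_size=3, stride=2, padding=1),
        )
        # Learned prototypes: one embedding vector per prototypical part.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, proto_dim))
        # Linear layer turning prototype similarities into class logits.
        self.classifier = nn.Linear(num_prototypes, num_classes, bias=False)

    def forward(self, x):
        feats = self.backbone(x)                                  # (B, D, H, W)
        B, D, H, W = feats.shape
        patches = feats.permute(0, 2, 3, 1).reshape(B, H * W, D)  # (B, HW, D)
        # Squared L2 distance from every patch to every prototype.
        diffs = patches.unsqueeze(2) - self.prototypes.view(1, 1, -1, D)
        dists = (diffs ** 2).sum(dim=-1)                          # (B, HW, P)
        # Each prototype is activated by its closest patch ("this looks
        # like that"); a ProtoPNet-style log-similarity bounds the score.
        min_dists = dists.min(dim=1).values                       # (B, P)
        similarities = torch.log((min_dists + 1) / (min_dists + 1e-4))
        return self.classifier(similarities), similarities

logits, sims = ToyPPM()(torch.randn(2, 3, 64, 64))
print(logits.shape, sims.shape)  # torch.Size([2, 3]) torch.Size([2, 6])
```

Interpretability comes from the similarity vector: each logit is a linear combination of prototype activations, so a prediction can be explained by showing, for each strongly weighted prototype, the input patch that activated it.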
Related papers
- Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z)
- LLM Reading Tea Leaves: Automatically Evaluating Topic Models with Large Language Models [12.500091504010067]
We propose WALM (Word Agreement with Language Model), a new evaluation method for topic modeling.
With extensive experiments involving different types of topic models, WALM is shown to align with human judgment.
arXiv Detail & Related papers (2024-06-13T11:19:50Z)
- Learning from models beyond fine-tuning [78.20895343699658]
Learn From Model (LFM) focuses on the research, modification, and design of foundation models (FM) based on the model interface.
The study of LFM techniques can be broadly categorized into five major areas: model tuning, model distillation, model reuse, meta learning and model editing.
This paper gives a comprehensive review of the current methods based on FM from the perspective of LFM.
arXiv Detail & Related papers (2023-10-12T10:20:36Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses from a wide range of real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- Robust Visual Question Answering: Datasets, Methods, and Future Challenges [23.59923999144776]
Visual question answering requires a system to provide an accurate natural language answer given an image and a natural language question.
Previous generic VQA methods often exhibit a tendency to memorize biases present in the training data rather than learning proper behaviors, such as grounding images before predicting answers.
Various datasets and debiasing methods have been proposed to evaluate and enhance VQA robustness, respectively.
arXiv Detail & Related papers (2023-07-21T10:12:09Z)
- Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction [2.4665182280122577]
Document-level relation extraction (DocRE) has attracted growing research interest in recent years.
While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied.
In this paper, we take the first step toward answering this question and then introduce a new perspective on comprehensively evaluating a model.
arXiv Detail & Related papers (2023-06-20T08:52:05Z)
- Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey [66.18478838828231]
Multi-modal pre-trained big models have drawn increasing attention in recent years.
This paper introduces the background of multi-modal pre-training by reviewing conventional deep pre-training work in natural language processing, computer vision, and speech.
Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss MM-PTMs with a focus on data, objectives, networks, and knowledge-enhanced pre-training.
arXiv Detail & Related papers (2023-02-20T15:34:03Z)
- Towards Human-Interpretable Prototypes for Visual Assessment of Image Classification Models [9.577509224534323]
We need models that are interpretable by design, built on a reasoning process similar to that of humans.
ProtoPNet claims to discover visually meaningful prototypes in an unsupervised way.
We find that these prototypes still have a long way to go toward providing definitive explanations.
arXiv Detail & Related papers (2022-11-22T11:01:22Z)
- Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model-adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable: even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models (a minimal probe of this failure mode is sketched after this list).
arXiv Detail & Related papers (2020-07-14T03:49:43Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
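As flagged in the AES entry above, overstability can be probed by perturbing essays with off-topic content and checking whether the predicted score drops. The sketch below illustrates such a probe; `score_essay` is a placeholder for a real AES model, and the function names, filler sentences, and 25% fraction are illustrative assumptions, not the paper's actual toolkit.

```python
# Minimal sketch of an "overstability" probe for an automatic essay
# scoring (AES) model. `score_essay` stands in for any trained AES
# model; all names and sample texts here are illustrative assumptions.
import random

def score_essay(essay: str) -> float:
    """Placeholder scorer: replace with a real AES model's predict call."""
    # Naive proxy so the sketch runs end to end: longer essays score higher.
    return min(10.0, len(essay.split()) / 50.0)

def inject_off_topic(essay: str, filler: list[str], fraction: float = 0.25) -> str:
    """Replace roughly `fraction` of the essay's sentences with off-topic ones."""
    sentences = [s.strip() for s in essay.split(".") if s.strip()]
    k = max(1, int(len(sentences) * fraction))
    for idx in random.sample(range(len(sentences)), k):
        sentences[idx] = random.choice(filler)
    return ". ".join(sentences) + "."

essay = ("The industrial revolution reshaped labor markets. "
         "Factories concentrated production in cities. "
         "Wages and working conditions became political questions. "
         "Unions emerged to bargain collectively.")
filler = ["Penguins huddle for warmth", "Basalt is an igneous rock"]

original = score_essay(essay)
perturbed = score_essay(inject_off_topic(essay, filler))
# An overstable model shows little or no score drop despite the noise.
print(f"original={original:.2f} perturbed={perturbed:.2f} drop={original - perturbed:.2f}")
```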
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.