Multi-criteria Rank-based Aggregation for Explainable AI
- URL: http://arxiv.org/abs/2505.24612v1
- Date: Fri, 30 May 2025 14:02:59 GMT
- Title: Multi-criteria Rank-based Aggregation for Explainable AI
- Authors: Sujoy Chatterjee, Everton Romanzini Colombo, Marcos Medeiros Raimundo,
- Abstract summary: This paper introduces a multi-criteria rank-based weighted aggregation method that balances multiple quality metrics simultaneously to produce an ensemble of explanation models.<n>Experiments on publicly available datasets demonstrate the robustness of the proposed model across these metrics.
- Score: 0.24578723416255746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Explainability is crucial for improving the transparency of black-box machine learning models. With the advancement of explanation methods such as LIME and SHAP, various XAI performance metrics have been developed to evaluate the quality of explanations. However, different explainers can provide contrasting explanations for the same prediction, introducing trade-offs across conflicting quality metrics. Although available aggregation approaches improve robustness, reducing explanations' variability, very limited research employed a multi-criteria decision-making approach. To address this gap, this paper introduces a multi-criteria rank-based weighted aggregation method that balances multiple quality metrics simultaneously to produce an ensemble of explanation models. Furthermore, we propose rank-based versions of existing XAI metrics (complexity, faithfulness and stability) to better evaluate ranked feature importance explanations. Extensive experiments on publicly available datasets demonstrate the robustness of the proposed model across these metrics. Comparative analyses of various multi-criteria decision-making and rank aggregation algorithms showed that TOPSIS and WSUM are the best candidates for this use case.
Related papers
- EVA-MILP: Towards Standardized Evaluation of MILP Instance Generation [13.49043811341421]
Mixed-Integer Linear Programming (MILP) is fundamental to solving complex decision-making problems.<n>The proliferation of MILP instance generation methods, driven by machine learning's demand for diverse datasets, has significantly outpaced standardized evaluation techniques.<n>This paper introduces a comprehensive benchmark framework designed for the systematic and objective evaluation of MILP instance generation methods.
arXiv Detail & Related papers (2025-05-30T16:42:15Z) - Ranked from Within: Ranking Large Multimodal Models for Visual Question Answering Without Labels [64.94853276821992]
Large multimodal models (LMMs) are increasingly deployed across diverse applications.<n>Traditional evaluation methods are largely dataset-centric, relying on fixed, labeled datasets and supervised metrics.<n>We explore unsupervised model ranking for LMMs by leveraging their uncertainty signals, such as softmax probabilities.
arXiv Detail & Related papers (2024-12-09T13:05:43Z) - BEExAI: Benchmark to Evaluate Explainable AI [0.9176056742068812]
We propose BEExAI, a benchmark tool that allows large-scale comparison of different post-hoc XAI methods.
We argue that the need for a reliable way of measuring the quality and correctness of explanations is becoming critical.
arXiv Detail & Related papers (2024-07-29T11:21:17Z) - Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode.
We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z) - EXACT: Towards a platform for empirically benchmarking Machine Learning model explanation methods [1.6383837447674294]
This paper brings together various benchmark datasets and novel performance metrics in an initial benchmarking platform.
Our datasets incorporate ground truth explanations for class-conditional features.
This platform assesses the performance of post-hoc XAI methods in the quality of the explanations they produce.
arXiv Detail & Related papers (2024-05-20T14:16:06Z) - Statistical Comparisons of Classifiers by Generalized Stochastic
Dominance [0.0]
There is still no consensus on how to compare classifiers over multiple data sets with respect to several criteria.
In this paper, we add a fresh view to the vivid debate by adopting recent developments in decision theory.
We show that our framework ranks classifiers by a generalized concept of dominance, which powerfully circumvents the cumbersome, and often even self-contradictory, reliance on aggregates.
arXiv Detail & Related papers (2022-09-05T09:28:15Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - MACE: An Efficient Model-Agnostic Framework for Counterfactual
Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE)
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z) - A Meta Survey of Quality Evaluation Criteria in Explanation Methods [0.5801044612920815]
Explanation methods and their evaluation have become a significant issue in explainable artificial intelligence (XAI)
Since the most accurate AI models are opaque with low transparency and comprehensibility, explanations are essential for bias detection and control of uncertainty.
There are a plethora of criteria to choose from when evaluating explanation method quality.
arXiv Detail & Related papers (2022-03-25T22:24:21Z) - Beyond Explaining: Opportunities and Challenges of XAI-Based Model
Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview over techniques that apply XAI practically for improving various properties of ML models.
We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z) - Partial Order in Chaos: Consensus on Feature Attributions in the
Rashomon Set [50.67431815647126]
Post-hoc global/local feature attribution methods are being progressively employed to understand machine learning models.
We show that partial orders of local/global feature importance arise from this methodology.
We show that every relation among features present in these partial orders also holds in the rankings provided by existing approaches.
arXiv Detail & Related papers (2021-10-26T02:53:14Z) - Interpretable Multi-dataset Evaluation for Named Entity Recognition [110.64368106131062]
We present a general methodology for interpretable evaluation for the named entity recognition (NER) task.
The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them.
By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area.
arXiv Detail & Related papers (2020-11-13T10:53:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.