ExplainBench: A Benchmark Framework for Local Model Explanations in Fairness-Critical Applications
- URL: http://arxiv.org/abs/2506.06330v1
- Date: Sat, 31 May 2025 01:12:23 GMT
- Title: ExplainBench: A Benchmark Framework for Local Model Explanations in Fairness-Critical Applications
- Authors: James Afful,
- Abstract summary: We introduce ExplainBench, an open-source benchmarking suite for systematic evaluation of local model explanations.<n>The framework includes a Streamlit-based graphical interface for interactive exploration and is packaged as a Python module.<n>We demonstrate ExplainBench on datasets commonly used in fairness research, such as COMPAS, UCI Adult Income, and LendingClub.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As machine learning systems are increasingly deployed in high-stakes domains such as criminal justice, finance, and healthcare, the demand for interpretable and trustworthy models has intensified. Despite the proliferation of local explanation techniques, including SHAP, LIME, and counterfactual methods, there exists no standardized, reproducible framework for their comparative evaluation, particularly in fairness-sensitive settings. We introduce ExplainBench, an open-source benchmarking suite for systematic evaluation of local model explanations across ethically consequential datasets. ExplainBench provides unified wrappers for popular explanation algorithms, integrates end-to-end pipelines for model training and explanation generation, and supports evaluation via fidelity, sparsity, and robustness metrics. The framework includes a Streamlit-based graphical interface for interactive exploration and is packaged as a Python module for seamless integration into research workflows. We demonstrate ExplainBench on datasets commonly used in fairness research, such as COMPAS, UCI Adult Income, and LendingClub, and showcase how different explanation methods behave under a shared experimental protocol. By enabling reproducible, comparative analysis of local explanations, ExplainBench advances the methodological foundations of interpretable machine learning and facilitates accountability in real-world AI systems.
Related papers
- Easy Data Unlearning Bench [53.1304932656586]
We introduce a unified and benchmarking suite that simplifies the evaluation of unlearning algorithms.<n>By standardizing setup and metrics, it enables reproducible, scalable, and fair comparison across unlearning methods.
arXiv Detail & Related papers (2026-02-18T12:20:32Z) - An Information-Theoretic Framework for Comparing Voice and Text Explainability [0.0]
This paper introduces an information theoretic framework for analyzing how explanation modality affects user comprehension and trust calibration in AI systems.<n>The proposed model treats explanation delivery as a communication channel between model and user, characterized by metrics for information retention, comprehension efficiency (CE), and trust calibration error (T CE)<n>Results demonstrate that text explanations achieve higher comprehension efficiency, while voice explanations yield improved trust calibration, with analogy based delivery achieving the best overall trade off.
arXiv Detail & Related papers (2026-02-06T20:28:46Z) - VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models [55.39064621869925]
OpenAI o1 and DeepSeek-R1 have achieved remarkable performance in the domain of reasoning.<n>A key component of their training is the incorporation of verifiable rewards within reinforcement learning.<n>Existing reward benchmarks do not evaluate reference-based reward systems.
arXiv Detail & Related papers (2025-05-21T17:54:43Z) - CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems.<n>Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs)<n>We propose textbfCost-textbfEfficient textbfLanguage Model textbfAlignment (textbfCELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z) - LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations [1.0370398945228227]
We introduce LaPLACE-explainer, designed to provide probabilistic cause-and-effect explanations for machine learning models.
The LaPLACE-Explainer component leverages the concept of a Markov blanket to establish statistical boundaries between relevant and non-relevant features.
Our approach offers causal explanations and outperforms LIME and SHAP in terms of local accuracy and consistency of explained features.
arXiv Detail & Related papers (2023-10-01T04:09:59Z) - FFB: A Fair Fairness Benchmark for In-Processing Group Fairness Methods [84.1077756698332]
This paper introduces the Fair Fairness Benchmark (textsfFFB), a benchmarking framework for in-processing group fairness methods.
We provide a comprehensive analysis of state-of-the-art methods to ensure different notions of group fairness.
arXiv Detail & Related papers (2023-06-15T19:51:28Z) - GLOBE-CE: A Translation-Based Approach for Global Counterfactual
Explanations [10.276136171459731]
Global & Efficient Counterfactual Explanations (GLOBE-CE) is a flexible framework that tackles the reliability and scalability issues associated with current state-of-the-art.
We provide a unique mathematical analysis of categorical feature translations, utilising it in our method.
Experimental evaluation with publicly available datasets and user studies demonstrate that GLOBE-CE performs significantly better than the current state-of-the-art.
arXiv Detail & Related papers (2023-05-26T15:26:59Z) - Syntactically Robust Training on Partially-Observed Data for Open
Information Extraction [25.59133746149343]
Open Information Extraction models have shown promising results with sufficient supervision.
We propose a syntactically robust training framework that enables models to be trained on a syntactic-abundant distribution.
arXiv Detail & Related papers (2023-01-17T12:39:13Z) - CausalBench: A Large-scale Benchmark for Network Inference from
Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CaulBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z) - Change Detection for Local Explainability in Evolving Data Streams [72.4816340552763]
Local feature attribution methods have become a popular technique for post-hoc and model-agnostic explanations.
It is often unclear how local attributions behave in realistic, constantly evolving settings such as streaming and online applications.
We present CDLEEDS, a flexible and model-agnostic framework for detecting local change and concept drift.
arXiv Detail & Related papers (2022-09-06T18:38:34Z) - AcME -- Accelerated Model-agnostic Explanations: Fast Whitening of the
Machine-Learning Black Box [1.7534486934148554]
interpretability approaches should provide actionable insights without making the users wait.
We propose Accelerated Model-agnostic Explanations (AcME), an interpretability approach that quickly provides feature importance scores both at the global and the local level.
AcME computes feature ranking, but it also provides a what-if analysis tool to assess how changes in features values would affect model predictions.
arXiv Detail & Related papers (2021-12-23T15:18:13Z) - Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z) - Fairness by Explicability and Adversarial SHAP Learning [0.0]
We propose a new definition of fairness that emphasises the role of an external auditor and model explicability.
We develop a framework for mitigating model bias using regularizations constructed from the SHAP values of an adversarial surrogate model.
We demonstrate our approaches using gradient and adaptive boosting on: a synthetic dataset, the UCI Adult (Census) dataset and a real-world credit scoring dataset.
arXiv Detail & Related papers (2020-03-11T14:36:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.