XMENTOR: A Rank-Aware Aggregation Approach for Human-Centered Explainable AI in Just-in-Time Software Defect Prediction
- URL: http://arxiv.org/abs/2602.22403v1
- Date: Wed, 25 Feb 2026 20:54:49 GMT
- Title: XMENTOR: A Rank-Aware Aggregation Approach for Human-Centered Explainable AI in Just-in-Time Software Defect Prediction
- Authors: Saumendu Roy, Banani Roy, Chanchal Roy, Richard Bassey
- Abstract summary: We introduce XMENTOR, a human-centered, rank-aware aggregation method implemented as a VS Code plugin. XMENTOR unifies multiple post-hoc explanations into a single, coherent view by applying adaptive thresholding, rank and sign agreement, and fallback strategies. Our findings show how combining explanations and embedding them into developer workflows can enhance interpretability, usability, and trust.
- Score: 5.646457568088472
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Machine learning (ML)-based defect prediction models can improve software quality. However, their opaque reasoning creates an HCI challenge because developers struggle to trust models they cannot interpret. Explainable AI (XAI) methods such as LIME, SHAP, and BreakDown aim to provide transparency, but when used together, they often produce conflicting explanations that increase confusion, frustration, and cognitive load. To address this usability challenge, we introduce XMENTOR, a human-centered, rank-aware aggregation method implemented as a VS Code plugin. XMENTOR unifies multiple post-hoc explanations into a single, coherent view by applying adaptive thresholding, rank and sign agreement, and fallback strategies to preserve clarity without overwhelming users. In a user study, nearly 90% of participants preferred aggregated explanations, citing reduced confusion and stronger support for daily debugging and defect-review tasks. Our findings show how combining explanations and embedding them into developer workflows can enhance interpretability, usability, and trust.
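The abstract describes the aggregation pipeline but includes no algorithm listing. Below is a minimal Python sketch of what rank-and-sign-agreement aggregation over per-explainer attributions (e.g. from LIME, SHAP, and BreakDown) could look like. The top-k rank rule, the averaging, and the single-feature fallback are illustrative assumptions, not XMENTOR's published method.

```python
# Hypothetical sketch of rank-aware aggregation over multiple post-hoc
# explainers. The agreement rules and fallback are assumptions for
# illustration, not XMENTOR's exact algorithm.
from typing import Dict, List

def aggregate_explanations(
    attributions: List[Dict[str, float]],  # one {feature: weight} map per explainer
    top_k: int = 5,
) -> Dict[str, float]:
    if not attributions:
        return {}
    features = set().union(*attributions)
    agreed: Dict[str, float] = {}
    for f in features:
        weights = [a.get(f, 0.0) for a in attributions]
        # Sign agreement: every explainer pushes the prediction the same way.
        signs = {w > 0 for w in weights if w != 0.0}
        if len(signs) != 1:
            continue  # explainers disagree on direction -> exclude the feature
        # Rank agreement: the feature is top-k by |weight| in every explainer.
        if all(f in sorted(a, key=lambda g: abs(a[g]), reverse=True)[:top_k]
               for a in attributions):
            agreed[f] = sum(weights) / len(weights)
    # Fallback: if nothing survives, surface the strongest mean attribution
    # rather than an empty explanation.
    if not agreed and features:
        means = {f: sum(a.get(f, 0.0) for a in attributions) / len(attributions)
                 for f in features}
        best = max(means, key=lambda g: abs(means[g]))
        agreed[best] = means[best]
    return agreed
```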
Related papers
- PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding [85.22047087898311]
We introduce Polarity-Prompt Contrastive Decoding (PromptCD), a test-time behavior control method that generalizes contrastive decoding to broader enhancement settings. PromptCD constructs paired positive and negative guiding prompts for a target behavior and contrasts model responses to reinforce desirable outcomes. Experiments on the "3H" alignment objectives demonstrate consistent and substantial improvements, indicating that post-trained models can achieve meaningful self-enhancement purely at test time.
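For concreteness, here is a tiny sketch of one decoding step under a positive/negative prompt pair. The combination rule and the `alpha` weight are assumptions; PromptCD's actual formulation may differ.

```python
# Sketch of contrastive decoding at a single step: the same model is run
# under a positive and a negative guiding prompt, and their next-token
# logits are contrasted. The linear rule below is an assumed form.
import numpy as np

def contrastive_logits(logits_pos: np.ndarray,
                       logits_neg: np.ndarray,
                       alpha: float = 1.0) -> np.ndarray:
    """Boost tokens the positive-prompt run prefers over the negative run."""
    return logits_pos + alpha * (logits_pos - logits_neg)

def greedy_step(logits_pos: np.ndarray, logits_neg: np.ndarray) -> int:
    # One greedy decoding step: pick the token with the largest contrasted score.
    return int(np.argmax(contrastive_logits(logits_pos, logits_neg)))
```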
arXiv Detail & Related papers (2026-02-24T08:56:52Z) - VEXA: Evidence-Grounded and Persona-Adaptive Explanations for Scam Risk Sensemaking [9.22587207148122]
Online scams across email, short message services, and social media increasingly challenge everyday risk assessment. We propose VEXA, an evidence-grounded and persona-adaptive framework for generating learner-facing scam explanations.
arXiv Detail & Related papers (2026-02-04T21:16:24Z) - Explaining AI Without Code: A User Study on Explainable AI [1.7966001353008778]
We present a human-centered XAI module in DashAI, an open-source no-code ML platform. A user study evaluated usability and the impact of explanations on novices and experts.
arXiv Detail & Related papers (2025-12-28T15:44:43Z) - CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection [60.52240468810558]
We introduce CoCoNUTS, a content-oriented benchmark built upon a fine-grained dataset of AI-generated peer reviews. We also develop CoCoDet, an AI-review detector built on a multi-task learning framework, to achieve more accurate and robust detection of AI involvement in review content.
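The abstract names a multi-task learning framework without detailing it; the following speculative sketch shows one common shape such a detector could take, with a shared encoder and two heads. Layer sizes and the auxiliary task are assumptions.

```python
# Speculative multi-task detector sketch: a shared encoder, one head for
# AI-involvement detection, and one auxiliary head. Not CoCoDet's
# published architecture.
import torch
import torch.nn as nn

class MultiTaskReviewDetector(nn.Module):
    def __init__(self, input_dim: int = 768, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden), nn.ReLU())
        self.ai_head = nn.Linear(hidden, 2)   # human-written vs. AI-involved
        self.aux_head = nn.Linear(hidden, 2)  # assumed auxiliary content/style task

    def forward(self, x: torch.Tensor):
        h = self.encoder(x)
        return self.ai_head(h), self.aux_head(h)

def multitask_loss(ai_logits, aux_logits, ai_y, aux_y, w: float = 0.5):
    # Weighted sum of the main and auxiliary cross-entropy losses.
    ce = nn.functional.cross_entropy
    return ce(ai_logits, ai_y) + w * ce(aux_logits, aux_y)
```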
arXiv Detail & Related papers (2025-08-28T06:03:11Z) - Interactive Reasoning: Visualizing and Controlling Chain-of-Thought Reasoning in Large Language Models [54.85405423240165]
We introduce Interactive Reasoning, an interaction design that visualizes chain-of-thought outputs as a hierarchy of topics. We implement interactive reasoning in Hippo, a prototype for AI-assisted decision making in the face of uncertain trade-offs.
arXiv Detail & Related papers (2025-06-30T10:00:43Z) - PixelThink: Towards Efficient Chain-of-Pixel Reasoning [70.32510083790069]
PixelThink is a simple yet effective scheme that integrates externally estimated task difficulty and internally measured model uncertainty. It learns to compress reasoning length in accordance with scene complexity and predictive confidence. Experimental results demonstrate that the proposed approach improves both reasoning efficiency and overall segmentation performance.
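To make the difficulty/uncertainty coupling tangible, here is a toy budget function in the spirit of the abstract; the formula and the token bounds are assumptions, not the paper's scheme.

```python
# Toy sketch: budget the reasoning length from an external difficulty
# estimate and an internal uncertainty measure. The linear interpolation
# and the max() combination rule are assumed.
def reasoning_budget(difficulty: float,   # external task-difficulty estimate in [0, 1]
                     uncertainty: float,  # internal uncertainty (e.g. entropy) in [0, 1]
                     min_tokens: int = 16,
                     max_tokens: int = 512) -> int:
    """Shorter chains for easy, confident inputs; longer for hard, uncertain ones."""
    scale = max(0.0, min(1.0, max(difficulty, uncertainty)))
    return int(min_tokens + scale * (max_tokens - min_tokens))
```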
arXiv Detail & Related papers (2025-05-29T17:55:49Z) - Don't Just Translate, Agitate: Using Large Language Models as Devil's Advocates for AI Explanations [1.6855625805565164]
Large Language Models (LLMs) are used to translate outputs from explainability techniques, such as feature-attribution weights, into natural language explanations. Recent findings suggest that translating into human-like explanations does not necessarily enhance user understanding and may instead lead to overreliance on AI systems.
arXiv Detail & Related papers (2025-04-16T18:45:18Z) - Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
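The cleanup step this describes can be stated in a few lines; the following is a hypothetical sketch in which `concept_of` and the flagged concept set stand in for whatever SOXAI-derived insight is available.

```python
# Hypothetical sketch of acting on dataset-level (second-order) insights:
# drop training samples whose dominant attributed concept was flagged as
# irrelevant. 'concept_of' and 'irrelevant' are placeholders.
from typing import Callable, List, Set, Tuple

def filter_by_concept(
    dataset: List[Tuple[object, int]],    # (sample, label) pairs
    concept_of: Callable[[object], str],  # dominant concept per sample
    irrelevant: Set[str],                 # concepts flagged via SOXAI-style analysis
) -> List[Tuple[object, int]]:
    return [(x, y) for (x, y) in dataset if concept_of(x) not in irrelevant]
```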
arXiv Detail & Related papers (2023-06-14T23:24:01Z) - Can Explainable AI Explain Unfairness? A Framework for Evaluating Explainable AI [3.4823710414760516]
Despite XAI tools' strength in translating model behavior, critics have raised concerns that XAI can itself be used for fairwashing.
We created a framework for evaluating explainable AI tools with respect to their capabilities for detecting and addressing issues of bias and fairness.
We found that despite their capabilities in simplifying and explaining model behavior, many prominent XAI tools lack features that could be critical in detecting bias.
arXiv Detail & Related papers (2021-06-14T15:14:03Z) - DA-DGCEx: Ensuring Validity of Deep Guided Counterfactual Explanations With Distribution-Aware Autoencoder Loss [0.0]
Deep Learning models are often seen as black boxes due to their lack of interpretability.
This paper presents Distribution-Aware Deep Guided Counterfactual Explanations (DA-DGCEx). It adds a term to the DGCEx cost function that penalizes out-of-distribution counterfactual instances.
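Given the "Distribution-Aware Autoencoder Loss" in the title, the added term plausibly takes the form of an autoencoder reconstruction penalty; the sketch below assumes that form, with `base_loss` standing in for the original DGCEx objective and `lam` an assumed weight.

```python
# Hedged sketch of a distribution-aware penalty added to a counterfactual
# loss. The reconstruction-error form is an assumption based on the title,
# not the paper's exact definition.
import torch

def da_dgcex_loss(x_cf: torch.Tensor,
                  base_loss: torch.Tensor,      # original DGCEx objective (placeholder)
                  autoencoder: torch.nn.Module,  # AE trained on in-distribution data
                  lam: float = 1.0) -> torch.Tensor:
    # High reconstruction error marks x_cf as out-of-distribution, so
    # penalizing it pulls counterfactuals back onto the data manifold.
    recon = autoencoder(x_cf)
    ood_penalty = torch.mean((x_cf - recon) ** 2)
    return base_loss + lam * ood_penalty
```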
arXiv Detail & Related papers (2021-04-19T05:44:18Z) - Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works aimed at achieving Explainable Reinforcement Learning (XRL). In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight into the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.