VEXA: Evidence-Grounded and Persona-Adaptive Explanations for Scam Risk Sensemaking
- URL: http://arxiv.org/abs/2602.05056v1
- Date: Wed, 04 Feb 2026 21:16:24 GMT
- Title: VEXA: Evidence-Grounded and Persona-Adaptive Explanations for Scam Risk Sensemaking
- Authors: Heajun An, Connor Ng, Sandesh Sharma Dulal, Junghwan Kim, Jin-Hee Cho,
- Abstract summary: Online scams across email, short message services, and social media increasingly challenge everyday risk assessment. We propose VEXA, an evidence-grounded and persona-adaptive framework for generating learner-facing scam explanations.
- Score: 9.22587207148122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online scams across email, short message services, and social media increasingly challenge everyday risk assessment, particularly as generative AI enables more fluent and context-aware deception. Although transformer-based detectors achieve strong predictive performance, their explanations are often opaque to non-experts or misaligned with model decisions. We propose VEXA, an evidence-grounded and persona-adaptive framework for generating learner-facing scam explanations by integrating GradientSHAP-based attribution with theory-informed vulnerability personas. Evaluation across multi-channel datasets shows that grounding explanations in detector-derived evidence improves semantic reliability without increasing linguistic complexity, while persona conditioning introduces interpretable stylistic variation without disrupting evidential alignment. These results reveal a key design insight: evidential grounding governs semantic correctness, whereas persona-based adaptation operates at the level of presentation under constraints of faithfulness. Together, VEXA demonstrates the feasibility of persona-adaptive, evidence-grounded explanations and provides design guidance for trustworthy, learner-facing security explanations in non-formal contexts.
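
The attribution component named in the abstract, GradientSHAP, averages gradients taken at points interpolated between an input and randomly sampled baselines, scaled by the input-baseline difference. The sketch below illustrates that idea for a detector operating on token embeddings; it is a minimal illustration under stated assumptions, not the paper's implementation, and the model interface, baseline choice, and `gradient_shap_attributions` helper are hypothetical.

```python
# Minimal GradientSHAP-style token attribution sketch (PyTorch).
# Assumptions not taken from the paper: `model` maps token embeddings of shape
# (1, seq_len, dim) to class logits, and `baseline_embs` holds embeddings of
# reference (e.g. benign) messages with shape (n_refs, seq_len, dim).
import torch

def gradient_shap_attributions(model, input_emb, baseline_embs, target_class, n_samples=32):
    """Expected-gradients approximation: average the gradient at random points
    between a sampled baseline and the input, scaled by (input - baseline)."""
    model.eval()
    total = torch.zeros_like(input_emb)
    for _ in range(n_samples):
        idx = torch.randint(len(baseline_embs), (1,))
        baseline = baseline_embs[idx]                      # (1, seq_len, dim)
        alpha = torch.rand(1)                              # random interpolation point
        point = (baseline + alpha * (input_emb - baseline)).detach().requires_grad_(True)
        score = model(point)[0, target_class]              # logit of the "scam" class
        grad, = torch.autograd.grad(score, point)
        total += grad * (input_emb - baseline)             # per-dimension contribution
    # Collapse the embedding dimension to a single relevance score per token.
    return (total / n_samples).sum(dim=-1)                 # shape (1, seq_len)
```

Per-token scores from such an attribution step would then serve as the detector-derived evidence that the persona-conditioned explanation text refers to.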
Related papers
- XMENTOR: A Rank-Aware Aggregation Approach for Human-Centered Explainable AI in Just-in-Time Software Defect Prediction [5.646457568088472]
We introduce XMENTOR, a human-centered, rank-aware aggregation method implemented as a VS Code plugin. XMENTOR unifies multiple post-hoc explanations into a single, coherent view by applying adaptive thresholding, rank and sign agreement. Our findings show how combining explanations and embedding them into developer workflows can enhance interpretability, usability, and trust.
arXiv Detail & Related papers (2026-02-25T20:54:49Z) - From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models [77.04403907729738]
This survey charts the evolution of uncertainty from a passive diagnostic metric to an active control signal guiding real-time model behavior. We demonstrate how uncertainty is leveraged as an active control signal across three frontiers. This survey argues that mastering the new trend of uncertainty is essential for building the next generation of scalable, reliable, and trustworthy AI.
arXiv Detail & Related papers (2026-01-22T06:21:31Z) - REFLEX: Self-Refining Explainable Fact-Checking via Disentangling Truth into Style and Substance [14.932352020762991]
We propose the REason-guided Fact-checking with Latent EXplanations (REFLEX) paradigm. It is a plug-and-play, self-refining paradigm that leverages the internal knowledge in the backbone model to improve both verdict accuracy and explanation quality. With only 465 self-refined training samples, REFLEX achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-11-25T12:06:23Z) - Towards Transparent Stance Detection: A Zero-Shot Approach Using Implicit and Explicit Interpretability [12.794773087413256]
Zero-Shot Stance Detection (ZSSD) identifies the attitude of a post toward unseen targets. IRIS considers stance detection as an information retrieval ranking task. Explicit rationales based on communicative features help decode the emotional and cognitive dimensions of stance.
arXiv Detail & Related papers (2025-11-05T16:54:10Z) - Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction [51.50282796099369]
This paper develops a multi-dimensional instruction uncertainty reduction framework to generate semantically constrained adversarial examples. The designed ResAdv-DDIM sampler stabilizes the optimization by predicting the language-guided sampling process. We realize the reference-free generation of semantically constrained 3D adversarial examples for the first time.
arXiv Detail & Related papers (2025-10-27T04:02:52Z) - Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment [48.73836179661632]
Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues. IB-CAAN consistently outperforms baseline and state-of-the-art methods on many benchmarks.
arXiv Detail & Related papers (2025-09-28T03:48:49Z) - FIRE: Faithful Interpretable Recommendation Explanations [2.6499018693213316]
Natural language explanations in recommender systems are often framed as a review generation task. FIRE is a lightweight and interpretable framework that combines SHAP-based feature attribution with structured, prompt-driven language generation. Our results demonstrate that FIRE not only achieves competitive recommendation accuracy but also significantly improves explanation quality along critical dimensions such as alignment, structure, and faithfulness.
arXiv Detail & Related papers (2025-08-07T10:11:02Z) - Too Much to Trust? Measuring the Security and Cognitive Impacts of Explainability in AI-Driven SOCs [0.6990493129893112]
Explainable AI (XAI) holds significant promise for enhancing the transparency and trustworthiness of AI-driven threat detection. This study re-evaluates current explanation methods within security contexts and demonstrates that role-aware, context-rich XAI designs aligned with SOC workflows can substantially improve practical utility.
arXiv Detail & Related papers (2025-03-03T21:39:15Z) - On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
We investigate the loss of context awareness after supervised fine-tuning. We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning. We propose a metric to identify context-dependent examples from general instruction fine-tuning datasets.
arXiv Detail & Related papers (2024-11-05T00:16:01Z) - Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts. We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models [53.337728969143086]
Recommendation systems harness user-item interactions like clicks and reviews to learn their representations.
Previous studies improve recommendation accuracy and interpretability by modeling user preferences across various aspects and intents.
We introduce a chain-based prompting approach to uncover semantic aspect-aware interactions.
arXiv Detail & Related papers (2023-12-26T15:44:09Z) - Alert-ME: An Explainability-Driven Defense Against Adversarial Examples in Transformer-Based Text Classification [9.818997495801705]
This paper presents a unified framework called Explainability-driven Detection, Identification, and Transformation (EDIT) to strengthen inference-time defenses. EDIT integrates explainability tools, including attention maps and integrated gradients, with frequency-based features to automatically detect and identify adversarial perturbations. The framework provides robust, interpretable, and efficient protection against standard, zero-day, and adaptive adversarial threats in text classification models.
arXiv Detail & Related papers (2023-07-03T03:17:20Z)