VEXA: Evidence-Grounded and Persona-Adaptive Explanations for Scam Risk Sensemaking
- URL: http://arxiv.org/abs/2602.05056v1
- Date: Wed, 04 Feb 2026 21:16:24 GMT
- Title: VEXA: Evidence-Grounded and Persona-Adaptive Explanations for Scam Risk Sensemaking
- Authors: Heajun An, Connor Ng, Sandesh Sharma Dulal, Junghwan Kim, Jin-Hee Cho,
- Abstract summary: Online scams across email, short message services, and social media increasingly challenge everyday risk assessment. We propose VEXA, an evidence-grounded and persona-adaptive framework for generating learner-facing scam explanations.
- Score: 9.22587207148122
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online scams across email, short message services, and social media increasingly challenge everyday risk assessment, particularly as generative AI enables more fluent and context-aware deception. Although transformer-based detectors achieve strong predictive performance, their explanations are often opaque to non-experts or misaligned with model decisions. We propose VEXA, an evidence-grounded and persona-adaptive framework for generating learner-facing scam explanations by integrating GradientSHAP-based attribution with theory-informed vulnerability personas. Evaluation across multi-channel datasets shows that grounding explanations in detector-derived evidence improves semantic reliability without increasing linguistic complexity, while persona conditioning introduces interpretable stylistic variation without disrupting evidential alignment. These results reveal a key design insight: evidential grounding governs semantic correctness, whereas persona-based adaptation operates at the level of presentation under constraints of faithfulness. Together, VEXA demonstrates the feasibility of persona-adaptive, evidence-grounded explanations and provides design guidance for trustworthy, learner-facing security explanations in non-formal contexts.
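
The attribution component named in the abstract, GradientSHAP, averages gradients taken at points interpolated between an input and randomly sampled baselines, scaled by the input-baseline difference. The sketch below illustrates that idea for a detector operating on token embeddings; it is a minimal illustration under stated assumptions, not the paper's implementation, and the model interface, baseline choice, and `gradient_shap_attributions` helper are hypothetical.

```python
# Minimal GradientSHAP-style token attribution sketch (PyTorch).
# Assumptions not taken from the paper: `model` maps token embeddings of shape
# (1, seq_len, dim) to class logits, and `baseline_embs` holds embeddings of
# reference (e.g. benign) messages with shape (n_refs, seq_len, dim).
import torch

def gradient_shap_attributions(model, input_emb, baseline_embs, target_class, n_samples=32):
    """Expected-gradients approximation: average the gradient at random points
    between a sampled baseline and the input, scaled by (input - baseline)."""
    model.eval()
    total = torch.zeros_like(input_emb)
    for _ in range(n_samples):
        idx = torch.randint(len(baseline_embs), (1,))
        baseline = baseline_embs[idx]                      # (1, seq_len, dim)
        alpha = torch.rand(1)                              # random interpolation point
        point = (baseline + alpha * (input_emb - baseline)).detach().requires_grad_(True)
        score = model(point)[0, target_class]              # logit of the "scam" class
        grad, = torch.autograd.grad(score, point)
        total += grad * (input_emb - baseline)             # per-dimension contribution
    # Collapse the embedding dimension to a single relevance score per token.
    return (total / n_samples).sum(dim=-1)                 # shape (1, seq_len)
```

Per-token scores from such an attribution step would then serve as the detector-derived evidence that the persona-conditioned explanation text refers to.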
Related papers
- XMENTOR: A Rank-Aware Aggregation Approach for Human-Centered Explainable AI in Just-in-Time Software Defect Prediction [5.646457568088472]
We introduce XMENTOR, a human-centered, rank-aware aggregation method implemented as a VS Code plugin. XMENTOR unifies multiple post-hoc explanations into a single, coherent view by applying adaptive thresholding, rank and sign agreement. Our findings show how combining explanations and embedding them into developer workflows can enhance interpretability, usability, and trust.
arXiv Detail & Related papers (2026-02-25T20:54:49Z) - From Passive Metric to Active Signal: The Evolving Role of Uncertainty Quantification in Large Language Models [77.04403907729738]
This survey charts the evolution of uncertainty from a passive diagnostic metric to an active control signal guiding real-time model behavior. We demonstrate how uncertainty is leveraged as an active control signal across three frontiers. This survey argues that mastering the new trend of uncertainty is essential for building the next generation of scalable, reliable, and trustworthy AI.
arXiv Detail & Related papers (2026-01-22T06:21:31Z) - REFLEX: Self-Refining Explainable Fact-Checking via Disentangling Truth into Style and Substance [14.932352020762991]
We propose the REason-guided Fact-checking with Latent EXplanations (REFLEX) paradigm. It is a plug-and-play, self-refining paradigm that leverages the internal knowledge in the backbone model to improve both verdict accuracy and explanation quality. With only 465 self-refined training samples, REFLEX achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-11-25T12:06:23Z) - Towards Transparent Stance Detection: A Zero-Shot Approach Using Implicit and Explicit Interpretability [12.794773087413256]
Zero-Shot Stance Detection (ZSSD) identifies the attitude of a post toward unseen targets. IRIS considers stance detection as an information retrieval ranking task. Explicit rationales based on communicative features help decode the emotional and cognitive dimensions of stance.
arXiv Detail & Related papers (2025-11-05T16:54:10Z) - Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction [51.50282796099369]
This paper develops a multi-dimensional instruction uncertainty reduction framework to generate semantically constrained adversarial examples. The designed ResAdv-DDIM sampler stabilizes the optimization by predicting the language-guided sampling process. We realize the reference-free generation of semantically constrained 3D adversarial examples for the first time.
arXiv Detail & Related papers (2025-10-27T04:02:52Z) - Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment [48.73836179661632]
Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues. IB-CAAN consistently outperforms baseline and state-of-the-art methods on many benchmarks.
arXiv Detail & Related papers (2025-09-28T03:48:49Z) - FIRE: Faithful Interpretable Recommendation Explanations [2.6499018693213316]
Natural language explanations in recommender systems are often framed as a review generation task. FIRE is a lightweight and interpretable framework that combines SHAP-based feature attribution with structured, prompt-driven language generation. Our results demonstrate that FIRE not only achieves competitive recommendation accuracy but also significantly improves explanation quality along critical dimensions such as alignment, structure, and faithfulness.
arXiv Detail & Related papers (2025-08-07T10:11:02Z) - Too Much to Trust? Measuring the Security and Cognitive Impacts of Explainability in AI-Driven SOCs [0.6990493129893112]
Explainable AI (XAI) holds significant promise for enhancing the transparency and trustworthiness of AI-driven threat detection. This study re-evaluates current explanation methods within security contexts and demonstrates that role-aware, context-rich XAI designs aligned with SOC workflows can substantially improve practical utility.
arXiv Detail & Related papers (2025-03-03T21:39:15Z) - On the Loss of Context-awareness in General Instruction Fine-tuning [101.03941308894191]
We investigate the loss of context awareness after supervised fine-tuning. We find that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning. We propose a metric to identify context-dependent examples from general instruction fine-tuning datasets.
arXiv Detail & Related papers (2024-11-05T00:16:01Z) - Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding [118.75567341513897]
Existing methods typically analyze target text in isolation or solely with non-member contexts. We propose Con-ReCall, a novel approach that leverages the asymmetric distributional shifts induced by member and non-member contexts.
arXiv Detail & Related papers (2024-09-05T09:10:38Z) - Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models [53.337728969143086]
Recommendation systems harness user-item interactions like clicks and reviews to learn their representations.
Previous studies improve recommendation accuracy and interpretability by modeling user preferences across various aspects and intents.
We introduce a chain-based prompting approach to uncover semantic aspect-aware interactions.
arXiv Detail & Related papers (2023-12-26T15:44:09Z) - Alert-ME: An Explainability-Driven Defense Against Adversarial Examples in Transformer-Based Text Classification [9.818997495801705]
This paper presents a unified framework called Explainability-driven Detection, Identification, and Transformation (EDIT) to strengthen inference-time defenses. EDIT integrates explainability tools, including attention maps and integrated gradients, with frequency-based features to automatically detect and identify adversarial perturbations. The framework provides robust, interpretable, and efficient protection against standard, zero-day, and adaptive adversarial threats in text classification models.
arXiv Detail & Related papers (2023-07-03T03:17:20Z)