Explainable AI in Big Data Fraud Detection
- URL: http://arxiv.org/abs/2512.16037v1
- Date: Wed, 17 Dec 2025 23:40:54 GMT
- Title: Explainable AI in Big Data Fraud Detection
- Authors: Ayush Jain, Rahul Kulkarni, Siyi Lin
- Abstract summary: This paper examines how explainable artificial intelligence (XAI) can be integrated into Big Data analytics pipelines for fraud detection and risk management. We review key Big Data characteristics and survey major analytical tools, including distributed storage systems, streaming platforms, and advanced fraud detection models. We identify key research gaps related to scalability, real-time processing, and explainability for graph and temporal models. The paper concludes with open research directions in scalable XAI, privacy-aware explanations, and standardized evaluation methods for explainable fraud detection systems.
- Score: 3.5429848204449694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Big Data has become central to modern applications in finance, insurance, and cybersecurity, enabling machine learning systems to perform large-scale risk assessments and fraud detection. However, the increasing dependence on automated analytics introduces important concerns about transparency, regulatory compliance, and trust. This paper examines how explainable artificial intelligence (XAI) can be integrated into Big Data analytics pipelines for fraud detection and risk management. We review key Big Data characteristics and survey major analytical tools, including distributed storage systems, streaming platforms, and advanced fraud detection models such as anomaly detectors, graph-based approaches, and ensemble classifiers. We also present a structured review of widely used XAI methods, including LIME, SHAP, counterfactual explanations, and attention mechanisms, and analyze their strengths and limitations when deployed at scale. Based on these findings, we identify key research gaps related to scalability, real-time processing, and explainability for graph and temporal models. To address these challenges, we outline a conceptual framework that integrates scalable Big Data infrastructure with context-aware explanation mechanisms and human feedback. The paper concludes with open research directions in scalable XAI, privacy-aware explanations, and standardized evaluation methods for explainable fraud detection systems.
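Among the methods the abstract surveys, SHAP is one of the most common choices for tabular fraud models. Below is a minimal, hedged sketch of what attaching SHAP attributions to a fraud classifier can look like; the synthetic transactions, feature names, and gradient-boosting model are illustrative assumptions, not the pipeline proposed in the paper.

```python
# Minimal sketch: SHAP attributions for a toy tabular fraud classifier.
# All data, feature names, and the label rule are synthetic assumptions.
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.lognormal(3.0, 1.0, n),   # transaction amount
    rng.integers(0, 24, n),       # hour of day
    rng.integers(1, 3650, n),     # account age in days
])
# Toy rule: large late-night transactions are labeled fraudulent.
y = ((X[:, 0] > 50) & (X[:, 1] < 6)).astype(int)

model = GradientBoostingClassifier().fit(X, y)

# TreeExplainer computes exact SHAP values for tree ensembles,
# which keeps per-prediction explanation cost low at scale.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])

feature_names = ["amount", "hour", "account_age_days"]
for i, sv in enumerate(shap_values):
    top = max(zip(feature_names, sv), key=lambda t: abs(t[1]))
    print(f"transaction {i}: strongest driver = {top[0]} ({top[1]:+.3f})")
```

Each printed line names the feature that pushed that transaction's fraud score hardest, which is the kind of per-decision justification analysts and regulators typically ask for.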
Related papers
- Machine learning for fraud detection in digital banking: a systematic literature review [0.0]
This systematic literature review examines the role of machine learning in fraud detection within digital banking.
Following the PRISMA guidelines, the review applied a structured identification, screening, eligibility, and inclusion process to ensure methodological rigor and transparency.
arXiv Detail & Related papers (2025-10-04T22:56:12Z)
- Explaining What Machines See: XAI Strategies in Deep Object Detection Models [0.0]
Explainable Artificial Intelligence (XAI) aims to make model decisions more transparent, interpretable, and trustworthy for humans.
This review provides a comprehensive analysis of state-of-the-art explainability methods specifically applied to object detection models.
arXiv Detail & Related papers (2025-09-02T06:16:30Z)
- ProteuS: A Generative Approach for Simulating Concept Drift in Financial Markets [44.76567557906836]
A fundamental problem in developing and validating adaptive algorithms is the lack of a ground truth in real-world financial data.
This paper introduces a novel framework, named ProteuS, for generating semi-synthetic financial time series with pre-defined structural breaks.
An analysis of the generated data confirms the complexity of the task, revealing significant overlap between the different market states.
arXiv Detail & Related papers (2025-08-30T21:01:47Z)
- Anomaly Detection and Generation with Diffusion Models: A Survey [51.61574868316922]
Anomaly detection (AD) plays a pivotal role across diverse domains, including cybersecurity, finance, healthcare, and industrial manufacturing.
Recent advancements in deep learning, specifically diffusion models (DMs), have sparked significant interest.
This survey aims to guide researchers and practitioners in leveraging DMs for innovative AD solutions across diverse applications.
arXiv Detail & Related papers (2025-06-11T03:29:18Z)
- Does Machine Unlearning Truly Remove Knowledge? [80.83986295685128]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods.
We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z)
- Adversarial Challenges in Network Intrusion Detection Systems: Research Insights and Future Prospects [0.33554367023486936]
This paper provides a comprehensive review of machine learning-based Network Intrusion Detection Systems (NIDS).
We critically examine existing research in NIDS, highlighting key trends, strengths, and limitations.
We discuss emerging challenges in the field and offer insights for the development of more robust and resilient NIDS.
arXiv Detail & Related papers (2024-09-27T13:27:29Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- It Is Time To Steer: A Scalable Framework for Analysis-driven Attack Graph Generation [50.06412862964449]
Attack Graphs (AGs) represent the best-suited solution to support cyber risk assessment for multi-step attacks on computer networks.
Current solutions propose to address the generation problem from the algorithmic perspective and postulate the analysis only after the generation is complete.
This paper rethinks the classic AG analysis through a novel workflow in which the analyst can query the system anytime (a toy path-enumeration sketch follows this entry).
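For intuition only, the sketch below models an attack graph as a plain directed graph and enumerates multi-step attack paths on demand; the hosts, edges, and query are invented for illustration and are unrelated to the paper's actual system.

```python
# Toy attack graph as a directed graph, with an on-demand path query.
# Node names and edges are illustrative assumptions.
from collections import defaultdict

edges = defaultdict(list)
for src, dst in [("internet", "web_server"), ("web_server", "app_server"),
                 ("app_server", "database"), ("internet", "vpn_gateway"),
                 ("vpn_gateway", "database")]:
    edges[src].append(dst)

def attack_paths(graph, start, target, path=None):
    """Enumerate all multi-step attack paths from start to target."""
    path = (path or []) + [start]
    if start == target:
        yield path
        return
    for nxt in graph[start]:
        if nxt not in path:                 # avoid cycles
            yield from attack_paths(graph, nxt, target, path)

for p in attack_paths(edges, "internet", "database"):
    print(" -> ".join(p))
```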
arXiv Detail & Related papers (2023-12-27T10:44:58Z)
- Progressing from Anomaly Detection to Automated Log Labeling and Pioneering Root Cause Analysis [53.24804865821692]
This study introduces a taxonomy for log anomalies and explores automated data labeling to mitigate labeling challenges.
The study envisions a future where root cause analysis follows anomaly detection, unraveling the underlying triggers of anomalies.
arXiv Detail & Related papers (2023-12-22T15:04:20Z)
- Explainable Intrusion Detection Systems (X-IDS): A Survey of Current Methods, Challenges, and Opportunities [0.0]
Intrusion Detection Systems (IDS) have received widespread adoption due to their ability to handle vast amounts of data with high prediction accuracy.
IDSs designed using Deep Learning (DL) techniques are often treated as black box models and do not provide a justification for their predictions.
This survey reviews the state of the art in explainable AI (XAI) for IDS, outlines its current challenges, and discusses how these challenges carry over to the design of an X-IDS.
arXiv Detail & Related papers (2022-07-13T14:31:46Z)
- How Can Subgroup Discovery Help AIOps? [0.0]
We study how Subgroup Discovery can help AIOps.
This project involves both data mining researchers and practitioners from Infologic, a French software vendor.
arXiv Detail & Related papers (2021-09-10T14:41:02Z)
- Counterfactual Explanations as Interventions in Latent Space [62.997667081978825]
Counterfactual explanations aim to provide end users with a set of features that need to be changed in order to achieve a desired outcome.
Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations.
We present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations (a deliberately naive search sketch follows this entry).
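For intuition, here is a naive counterfactual search that greedily perturbs features until a fraud decision flips. This is not CEILS, which works through interventions in a latent space; the classifier, features, and step sizes are invented for illustration.

```python
# Naive counterfactual search: perturb features until the decision
# flips from "fraud" (1) to "legitimate" (0). Illustrative only.
import numpy as np

def naive_counterfactual(x, predict, step_sizes, max_steps=50):
    """Greedy single-feature search for a decision flip."""
    cf = np.asarray(x, dtype=float).copy()
    steps = np.asarray(step_sizes, dtype=float)
    for _ in range(max_steps):
        for i in range(len(cf)):
            trial = cf.copy()
            trial[i] += steps[i]
            if predict(trial) == 0:
                return trial             # counterfactual found
        cf += steps                      # no single step flips: take all steps
    return None                          # give up after max_steps rounds

# Toy classifier: flags large transactions as fraud.
predict = lambda v: int(v[0] > 100)
x = np.array([150.0, 2.0])
print(naive_counterfactual(x, predict, step_sizes=[-10.0, 0.0]))
# Prints the amount reduced until the fraud flag clears, hour unchanged.
```

Real counterfactual methods additionally constrain the search to feasible, actionable changes, which is exactly the gap this paper highlights.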
arXiv Detail & Related papers (2021-06-14T20:48:48Z)
- A new interpretable unsupervised anomaly detection method based on residual explanation [47.187609203210705]
We present RXP, a new interpretability method that addresses the limitations of explaining autoencoder-based anomaly detection (AE-based AD) in large-scale systems.
It stands out for its implementation simplicity, low computational cost and deterministic behavior.
In an experiment using data from a real heavy-haul railway line, the proposed method achieved superior performance compared to SHAP (a simplified residual-ranking sketch follows this entry).
arXiv Detail & Related papers (2021-03-14T15:35:45Z)
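In the same spirit as RXP, though the details below are assumptions rather than the paper's exact formulation, residual-based explanation can be as simple as ranking features by their absolute reconstruction error, which is cheap and deterministic:

```python
# Sketch: explain an autoencoder-flagged anomaly by its per-feature
# reconstruction residuals. The reconstruction function is a stand-in
# for a trained autoencoder; feature names are invented.
import numpy as np

def residual_explanation(x, reconstruct, feature_names, top_k=3):
    """Rank features by absolute reconstruction residual."""
    residuals = np.abs(x - reconstruct(x))
    order = np.argsort(residuals)[::-1][:top_k]
    return [(feature_names[i], float(residuals[i])) for i in order]

reconstruct = lambda x: x * 0.9          # toy "autoencoder"
x = np.array([120.0, 3.0, 45.0])
print(residual_explanation(x, reconstruct, ["amount", "hour", "account_age"]))
# "amount" has the largest residual, so it leads the explanation.
```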