Application of AI-based Models for Online Fraud Detection and Analysis
- URL: http://arxiv.org/abs/2409.19022v1
- Date: Wed, 25 Sep 2024 14:47:03 GMT
- Title: Application of AI-based Models for Online Fraud Detection and Analysis
- Authors: Antonis Papasavva, Shane Johnson, Ed Lowther, Samantha Lundrigan, Enrico Mariconti, Anna Markovska, Nilufer Tuptuk
- Abstract summary: We conduct a Systematic Literature Review on AI and NLP techniques for online fraud detection.
We report the state-of-the-art NLP techniques for analysing various online fraud categories.
We identify issues in data limitations, training bias reporting, and selective presentation of metrics in model performance reporting.
- Score: 1.764243259740255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fraud is a prevalent offence that extends beyond financial loss, causing psychological and physical harm to victims. The advancements in online communication technologies allowed online fraud to thrive in this vast network, with fraudsters increasingly using these channels for deception. With the progression of technologies like AI, there is a growing concern that fraud will scale up, using sophisticated methods, like deep-fakes in phishing campaigns, all generated by language generation models like ChatGPT. However, the application of AI in detecting and analyzing online fraud remains understudied. We conduct a Systematic Literature Review on AI and NLP techniques for online fraud detection. The review adhered to the PRISMA-ScR protocol, with eligibility criteria including relevance to online fraud, use of text data, and AI methodologies. We screened 2,457 academic records, of which 350 met our eligibility criteria, and included 223. We report the state-of-the-art NLP techniques for analysing various online fraud categories; the training data sources; the NLP algorithms and models built; and the performance metrics employed for model evaluation. We find that current research on online fraud is divided into various scam activities and identify 16 different frauds that researchers focus on. This SLR enhances the academic understanding of AI-based detection methods for online fraud and offers insights for policymakers, law enforcement, and businesses on safeguarding against such activities. We conclude that focusing on specific scams lacks generalization, as multiple models are required for different fraud types. The evolving nature of scams limits the effectiveness of models trained on outdated data. We also identify issues in data limitations, training bias reporting, and selective presentation of metrics in model performance reporting, which can lead to potential biases in model evaluation.
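The abstract's point about selective presentation of metrics can be illustrated with a minimal sketch. The confusion-matrix counts below are hypothetical and are not taken from the paper; the helper name `classification_metrics` is likewise an assumption made for illustration. On a heavily imbalanced fraud dataset, accuracy alone can look strong while precision, recall and F1 show that most fraud is missed.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: 1,000 messages, 20 of them fraudulent.
# The model flags 10 messages: 5 are real fraud, 5 are false alarms.
metrics = classification_metrics(tp=5, fp=5, fn=15, tn=975)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, accuracy is 0.98 while recall is only 0.25, which is why reporting a single metric can give a misleading picture of a detector's performance.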
Related papers
- Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies.
This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- Combatting Human Trafficking in the Cyberspace: A Natural Language Processing-Based Methodology to Analyze the Language in Online Advertisements [55.2480439325792]
This project tackles the pressing issue of human trafficking in online C2C marketplaces through advanced Natural Language Processing (NLP) techniques.
We introduce a novel methodology for generating pseudo-labeled datasets with minimal supervision, serving as a rich resource for training state-of-the-art NLP models.
A key contribution is the implementation of an interpretability framework using Integrated Gradients, providing explainable insights crucial for law enforcement.
arXiv Detail & Related papers (2023-11-22T02:45:01Z)
- Credit Card Fraud Detection with Subspace Learning-based One-Class Classification [18.094622095967328]
One-Class Classification (OCC) algorithms excel in handling imbalanced data distributions.
These algorithms integrate subspace learning into the data description.
These algorithms transform the data into a lower-dimensional subspace optimized for OCC.
arXiv Detail & Related papers (2023-09-26T12:26:28Z)
- An engine to simulate insurance fraud network data [1.3812010983144802]
We develop a simulation machine that is engineered to create synthetic data with a network structure.
We can specify the total number of policyholders and parties, the desired level of imbalance and the (effect size of the) features in the fraud generating model.
The simulation engine enables researchers and practitioners to examine several methodological challenges as well as to test their (development strategy of) insurance fraud detection models.
arXiv Detail & Related papers (2023-08-21T13:14:00Z)
- Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z)
- Human-Centric Multimodal Machine Learning: Recent Advances and Testbed on AI-based Recruitment [66.91538273487379]
There is a certain consensus about the need to develop AI applications with a Human-Centric approach.
Human-Centric Machine Learning needs to be developed based on four main requirements: (i) utility and social good; (ii) privacy and data ownership; (iii) transparency and accountability; and (iv) fairness in AI-driven decision-making processes.
We study how current multimodal algorithms based on heterogeneous sources of information are affected by sensitive elements and inner biases in the data.
arXiv Detail & Related papers (2023-02-13T16:44:44Z)
- Fraud Dataset Benchmark and Applications [25.184342958800293]
Fraud Dataset Benchmark (FDB) is a compilation of publicly available datasets catered to fraud detection.
FDB comprises a variety of fraud-related tasks, ranging from identifying fraudulent card-not-present transactions, detecting bot attacks, classifying malicious URLs, and estimating risk of loan default to content moderation.
A Python-based library for FDB provides a consistent API for data loading with standardized training and testing splits.
arXiv Detail & Related papers (2022-08-30T17:35:39Z)
- Challenges and Complexities in Machine Learning based Credit Card Fraud Detection [0.0]
The volume of transactions, the uniqueness of frauds and the ingenuity of fraudsters are the main challenges in detecting fraud.
The advent of machine learning, artificial intelligence and big data has opened up new tools in the fight against fraud.
However, the development of fraud detection algorithms has been challenging and slow due to the massively unbalanced nature of fraud data.
arXiv Detail & Related papers (2022-08-20T07:53:51Z)
- Relational Graph Neural Networks for Fraud Detection in a Super-App environment [53.561797148529664]
We propose a framework of relational graph convolutional network methods for fraudulent behaviour prevention in the financial services of a Super-App.
We use an interpretability algorithm for graph neural networks to determine the most important relations to the classification task of the users.
Our results show that there is an added value when considering models that take advantage of the alternative data of the Super-App and the interactions found in their high connectivity.
arXiv Detail & Related papers (2021-07-29T00:02:06Z)
- Social network analytics for supervised fraud detection in insurance [1.911867365776962]
Insurance fraud occurs when policyholders file claims that are exaggerated or based on intentional damages.
This contribution develops a fraud detection strategy by extracting insightful information from the social network of a claim.
arXiv Detail & Related papers (2020-09-15T21:40:15Z)