Textual Data Mining for Financial Fraud Detection: A Deep Learning
Approach
- URL: http://arxiv.org/abs/2308.03800v1
- Date: Sat, 5 Aug 2023 15:33:10 GMT
- Title: Textual Data Mining for Financial Fraud Detection: A Deep Learning
Approach
- Authors: Qiuru Li
- Abstract summary: I present a deep learning approach to conduct a natural language processing (hereafter NLP) binary classification task for analyzing financial-fraud texts.
My methodology involved different kinds of neural network models, including Multilayer Perceptrons with Embedding layers, vanilla Recurrent Neural Network (RNN), Long-Short Term Memory (LSTM), and Gated Recurrent Unit (GRU)
My results bring significant implications for financial fraud detection as this work contributes to the growing body of research at the intersection of deep learning, NLP, and finance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this report, I present a deep learning approach to conduct a natural
language processing (hereafter NLP) binary classification task for analyzing
financial-fraud texts. First, I searched for regulatory announcements and
enforcement bulletins from HKEX news to define fraudulent companies and to
extract their MD&A reports before I organized the sentences from the reports
with labels and reporting time. My methodology involved different kinds of
neural network models, including Multilayer Perceptrons with Embedding layers,
vanilla Recurrent Neural Network (RNN), Long-Short Term Memory (LSTM), and
Gated Recurrent Unit (GRU) for the text classification task. By utilizing this
diverse set of models, I aim to perform a comprehensive comparison of their
accuracy in detecting financial fraud. My results bring significant
implications for financial fraud detection as this work contributes to the
growing body of research at the intersection of deep learning, NLP, and
finance, providing valuable insights for industry practitioners, regulators,
and researchers in the pursuit of more robust and effective fraud detection
methodologies.
Related papers
- Entity Extraction from High-Level Corruption Schemes via Large Language Models [4.820586736502356]
This article proposes a new micro-benchmark dataset for algorithms and models that identify individuals and organizations in news articles.
Experimental efforts are also reported, using this dataset, to identify individuals and organizations in financial-crime-related articles.
arXiv Detail & Related papers (2024-09-05T10:27:32Z) - Great Memory, Shallow Reasoning: Limits of $k$NN-LMs [71.73611113995143]
$k$NN-LMs, which integrate retrieval with next-word prediction, have demonstrated strong performance in language modeling.
We ask whether this improved ability to recall information really translates into downstream abilities.
arXiv Detail & Related papers (2024-08-21T17:59:05Z) - Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs [49.57641083688934]
We introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings.
Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines.
arXiv Detail & Related papers (2024-06-05T20:19:09Z) - Network Analytics for Anti-Money Laundering -- A Systematic Literature Review and Experimental Evaluation [1.7119723306387908]
This paper presents an extensive and systematic review of the literature on network analytics (NA) for anti-money laundering (AML)
We identify and analyse 97 papers in the Web of Science and Scopus databases, resulting in a taxonomy of approaches following the fraud analytics framework of Bockel-Rickermann et al.
The framework is applied on the publicly available Elliptic data set and implements manual feature engineering, random walk-based methods, and deep learning GNNs.
arXiv Detail & Related papers (2024-05-29T08:48:52Z) - AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework [48.3060010653088]
We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data.
We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
arXiv Detail & Related papers (2024-03-19T09:45:33Z) - Enhancing Illicit Activity Detection using XAI: A Multimodal Graph-LLM
Framework [3.660182910533372]
We present a state-of-the-art, novel multimodal proactive approach to addressing XAI in financial cybercrime detection.
We leverage a triad of deep learning models designed to distill essential representations from transaction sequencing, subgraph connectivity, and narrative generation.
arXiv Detail & Related papers (2023-10-20T19:33:44Z) - Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z) - Fraud Dataset Benchmark and Applications [25.184342958800293]
Fraud dataset Benchmark (FDB) is a compilation of publicly available datasets catered to fraud detection.
FDB comprises variety of fraud related tasks, ranging from identifying fraudulent card-not-present transactions, detecting bot attacks, classifying malicious URLs, estimating risk of loan default to content moderation.
Python based library for FDB provides a consistent API for data loading with standardized training and testing splits.
arXiv Detail & Related papers (2022-08-30T17:35:39Z) - Relational Graph Neural Networks for Fraud Detection in a Super-App
environment [53.561797148529664]
We propose a framework of relational graph convolutional networks methods for fraudulent behaviour prevention in the financial services of a Super-App.
We use an interpretability algorithm for graph neural networks to determine the most important relations to the classification task of the users.
Our results show that there is an added value when considering models that take advantage of the alternative data of the Super-App and the interactions found in their high connectivity.
arXiv Detail & Related papers (2021-07-29T00:02:06Z) - ScoreGAN: A Fraud Review Detector based on Multi Task Learning of
Regulated GAN with Data Augmentation [50.779498955162644]
We propose ScoreGAN for fraud review detection that makes use of both review text and review rating scores in the generation and detection process.
Results show that the proposed framework outperformed the existing state-of-the-art framework, namely FakeGAN, in terms of AP by 7%, and 5% on the Yelp and TripAdvisor datasets.
arXiv Detail & Related papers (2020-06-11T16:15:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.