Graph Feature Preprocessor: Real-time Subgraph-based Feature Extraction for Financial Crime Detection
- URL: http://arxiv.org/abs/2402.08593v2
- Date: Thu, 03 Oct 2024 09:38:16 GMT
- Title: Graph Feature Preprocessor: Real-time Subgraph-based Feature Extraction for Financial Crime Detection
- Authors: Jovan Blanuša, Maximo Cravero Baraja, Andreea Anghel, Luc von Niederhäusern, Erik Altman, Haris Pozidis, Kubilay Atasu
- Abstract summary: "Graph Feature Preprocessor" is a software library for detecting typical money laundering patterns in financial transaction graphs in real time.
We show that our enriched transaction features dramatically improve the prediction accuracy of gradient-boosting-based machine learning models.
Our solution can detect illicit transactions with higher minority-class F1 scores than standard graph neural networks in anti-money laundering and phishing datasets.
- Score: 2.1140460101878107
- Abstract: In this paper, we present "Graph Feature Preprocessor", a software library for detecting typical money laundering patterns in financial transaction graphs in real time. These patterns are used to produce a rich set of transaction features for downstream machine learning training and inference tasks such as detection of fraudulent financial transactions. We show that our enriched transaction features dramatically improve the prediction accuracy of gradient-boosting-based machine learning models. Our library exploits multicore parallelism, maintains a dynamic in-memory graph, and efficiently mines subgraph patterns in the incoming transaction stream, which enables it to be operated in a streaming manner. Our solution, which combines our Graph Feature Preprocessor and gradient-boosting-based machine learning models, can detect illicit transactions with higher minority-class F1 scores than standard graph neural networks in anti-money laundering and phishing datasets. In addition, the end-to-end throughput rate of our solution executed on a multicore CPU outperforms the graph neural network baselines executed on a powerful V100 GPU. Overall, the combination of high accuracy, a high throughput rate, and low latency of our solution demonstrates the practical value of our library in real-world applications.
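The abstract describes maintaining a dynamic in-memory graph and deriving per-transaction features from patterns found in the incoming stream. The following is a minimal sketch of that general idea, not the library's API: the class name `WindowedTxGraph`, the time-window eviction policy, and the three features (fan-out, fan-in, and a two-node round trip) are all assumptions chosen for illustration, and transactions are assumed to arrive in non-decreasing timestamp order.

```python
from collections import defaultdict, deque


class WindowedTxGraph:
    """Toy dynamic graph (hypothetical): keeps only edges newer than `window`."""

    def __init__(self, window):
        self.window = window
        self.edges = deque()  # (timestamp, src, dst) in arrival order
        self.out_nbrs = defaultdict(lambda: defaultdict(int))  # src -> dst -> count
        self.in_nbrs = defaultdict(lambda: defaultdict(int))   # dst -> src -> count

    def _evict(self, now):
        # Drop edges whose timestamp fell out of the sliding time window.
        while self.edges and self.edges[0][0] <= now - self.window:
            _, s, d = self.edges.popleft()
            self.out_nbrs[s][d] -= 1
            if self.out_nbrs[s][d] == 0:
                del self.out_nbrs[s][d]
            self.in_nbrs[d][s] -= 1
            if self.in_nbrs[d][s] == 0:
                del self.in_nbrs[d][s]

    def add(self, ts, src, dst):
        """Insert one transaction and return its graph-based features."""
        self._evict(ts)
        self.edges.append((ts, src, dst))
        self.out_nbrs[src][dst] += 1
        self.in_nbrs[dst][src] += 1
        return {
            "fan_out": len(self.out_nbrs[src]),  # distinct recipients of src
            "fan_in": len(self.in_nbrs[dst]),    # distinct senders to dst
            # 1 if a dst -> src edge is still in the window (a 2-cycle)
            "round_trip": int(src in self.out_nbrs[dst]),
        }


g = WindowedTxGraph(window=10)
f1 = g.add(0, "A", "B")
f2 = g.add(1, "A", "C")
f3 = g.add(2, "B", "A")   # closes a round trip with the A -> B edge
f4 = g.add(20, "C", "A")  # all earlier edges have expired by now
```

Each returned dictionary could be appended to a transaction's raw attributes before it is passed to a gradient-boosting model. A production system would mine richer patterns and parallelize the work, which this sketch does not attempt.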
Related papers
- Identifying Money Laundering Subgraphs on the Blockchain [5.377744640870357]
Anti-Money Laundering involves the identification of money laundering crimes in financial activities.
Recent studies advanced AML through the lens of graph-based machine learning.
RevTrack is a graph-based framework that enables large-scale AML analysis at lower cost and with higher accuracy.
arXiv Detail & Related papers (2024-10-10T22:01:14Z)
- The Shape of Money Laundering: Subgraph Representation Learning on the Blockchain with the Elliptic2 Dataset [6.209290101460395]
Subgraph representation learning is a technique for analyzing local structures (or shapes) within complex networks.
We introduce Elliptic2, a large graph dataset containing 122K labeled subgraphs of Bitcoin clusters.
We find immediate practical value in this approach and the potential for a new standard in anti-money laundering and forensic analytics in cryptocurrencies.
arXiv Detail & Related papers (2024-04-29T21:19:41Z)
- Towards a Foundation Purchasing Model: Pretrained Generative Autoregression on Transaction Sequences [0.0]
We present a generative pretraining method that can be used to obtain contextualised embeddings of financial transactions.
We additionally perform large-scale pretraining of an embedding model using a corpus of data from 180 issuing banks containing 5.1 billion transactions.
arXiv Detail & Related papers (2024-01-03T09:32:48Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
- MultiScale MeshGraphNets [65.26373813797409]
We propose two complementary approaches to improve the framework from MeshGraphNets.
First, we demonstrate that it is possible to learn accurate surrogate dynamics of a high-resolution system on a much coarser mesh.
Second, we introduce a hierarchical approach (MultiScale MeshGraphNets) which passes messages on two different resolutions.
arXiv Detail & Related papers (2022-10-02T20:16:20Z)
- Learning Graph Structure from Convolutional Mixtures [119.45320143101381]
We propose a graph convolutional relationship between the observed and latent graphs, and formulate the graph learning task as a network inverse (deconvolution) problem.
In lieu of eigendecomposition-based spectral methods, we unroll and truncate proximal gradient iterations to arrive at a parameterized neural network architecture that we call a Graph Deconvolution Network (GDN).
GDNs can learn a distribution of graphs in a supervised fashion, perform link prediction or edge-weight regression tasks by adapting the loss function, and they are inherently inductive.
arXiv Detail & Related papers (2022-05-19T14:08:15Z)
- Graph Kernel Neural Networks [53.91024360329517]
We propose to use graph kernels, i.e. kernel functions that compute an inner product on graphs, to extend the standard convolution operator to the graph domain.
This allows us to define an entirely structural model that does not require computing the embedding of the input graph.
Our architecture allows any type of graph kernel to be plugged in and has the added benefit of providing some interpretability.
arXiv Detail & Related papers (2021-12-14T14:48:08Z)
- Temporal Graph Network Embedding with Causal Anonymous Walks Representations [54.05212871508062]
We propose a novel approach for dynamic network representation learning based on Temporal Graph Network.
We provide a benchmark pipeline for evaluating temporal network embeddings.
We show the applicability and superior performance of our model in the real-world downstream graph machine learning task provided by one of the top European banks.
arXiv Detail & Related papers (2021-08-19T15:39:52Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Not Half Bad: Exploring Half-Precision in Graph Convolutional Neural Networks [8.460826851547294]
Efficient graph analysis using modern machine learning is receiving a growing level of attention.
Deep learning approaches often operate over the entire adjacency matrix.
It is desirable to identify efficient measures to reduce both run-time and memory requirements.
arXiv Detail & Related papers (2020-10-23T19:47:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.