Credit card fraud detection - Classifier selection strategy
- URL: http://arxiv.org/abs/2208.11900v1
- Date: Thu, 25 Aug 2022 07:13:42 GMT
- Title: Credit card fraud detection - Classifier selection strategy
- Authors: Gayan K. Kulatilleke
- Abstract summary: Using a sample of annotated transactions, a machine learning classification algorithm learns to detect frauds.
Fraud data sets are diverse and exhibit inconsistent characteristics.
We propose a data-driven classifier selection strategy for characteristic highly imbalanced fraud detection data sets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has opened up new tools for financial fraud detection. Using
a sample of annotated transactions, a machine learning classification algorithm
learns to detect frauds. With growing credit card transaction volumes and
rising fraud percentages, there is growing interest in finding appropriate
machine learning classifiers for detection. However, fraud data sets are
diverse and exhibit inconsistent characteristics. As a result, a model
effective on a given data set is not guaranteed to perform on another. Further,
the possibility of temporal drift in data patterns and characteristics over
time is high. Additionally, fraud data has massive and varying imbalance. In
this work, we evaluate sampling methods as a viable pre-processing mechanism to
handle imbalance and propose a data-driven classifier selection strategy for
characteristic highly imbalanced fraud detection data sets. The model derived
based on our selection strategy surpasses peer models, whilst working in more
realistic conditions, establishing the effectiveness of the strategy.
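The abstract names two ingredients, sampling as pre-processing and a data-driven choice among classifiers, without giving the procedure. The block below is only a hypothetical sketch of that general idea, not the paper's method: it undersamples the majority class in the training split, leaves the validation split at its original imbalance, and picks the candidate with the best F1 score, using g-mean as a tie-breaker. The function names, the toy single-feature "amount" data, and the three toy candidates are all assumptions made for illustration.

```python
import math
import random


def random_undersample(rows, labels, ratio=1.0, seed=0):
    """Keep every fraud row (label 1) and a random subset of legitimate rows.

    ratio is (kept majority rows) / (minority rows); 1.0 gives a balanced set.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    keep = min(len(majority), int(ratio * len(minority)))
    chosen = sorted(minority + rng.sample(majority, keep))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]


def f1_and_gmean(y_true, y_pred):
    """Return (F1, g-mean) for binary labels, with fraud as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return f1, math.sqrt(recall * specificity)


def select_classifier(candidates, fit_rows, fit_labels, val_rows, val_labels):
    """Fit each candidate and rank by (F1, g-mean) on the validation split."""
    scores = {}
    for name, fit in candidates.items():
        predict = fit(fit_rows, fit_labels)
        scores[name] = f1_and_gmean(val_labels, [predict(r) for r in val_rows])
    # Tuples compare element by element: F1 first, g-mean breaks ties.
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# --- Hypothetical toy candidates: each "fit" returns a predict(amount) function.
def fit_mean_threshold(rows, labels):
    fraud = [r for r, y in zip(rows, labels) if y == 1]
    legit = [r for r, y in zip(rows, labels) if y == 0]
    cut = (sum(fraud) / len(fraud) + sum(legit) / len(legit)) / 2
    return lambda amount: 1 if amount > cut else 0


CANDIDATES = {
    "mean_threshold": fit_mean_threshold,
    "always_legit": lambda rows, labels: (lambda amount: 0),
    "always_fraud": lambda rows, labels: (lambda amount: 1),
}

# --- Synthetic, highly imbalanced data (about 1% fraud); purely illustrative.
_rng = random.Random(42)


def make_split(n_legit, n_fraud):
    rows = [_rng.uniform(1, 100) for _ in range(n_legit)]
    rows += [_rng.uniform(150, 300) for _ in range(n_fraud)]
    return rows, [0] * n_legit + [1] * n_fraud


train_rows, train_labels = make_split(2000, 20)
val_rows, val_labels = make_split(1000, 10)

# Resample the training split only; validation keeps its original imbalance.
bal_rows, bal_labels = random_undersample(train_rows, train_labels)
best, scores = select_classifier(CANDIDATES, bal_rows, bal_labels, val_rows, val_labels)
print(best, scores)
```

Evaluating on the untouched validation split is a deliberate choice in this sketch: scores measured on resampled data would not reflect the imbalance a deployed model would face.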
Related papers
- Verification of Machine Unlearning is Fragile [48.71651033308842]
We introduce two novel adversarial unlearning processes capable of circumventing both types of verification strategies.
This study highlights the vulnerabilities and limitations in machine unlearning verification, paving the way for further research into the safety of machine unlearning.
arXiv Detail & Related papers (2024-08-01T21:37:10Z)
- Explainable Fraud Detection with Deep Symbolic Classification [4.1205832766381985]
We present Deep Symbolic Classification, an extension of the Deep Symbolic Regression framework to classification problems.
Because the functions are mathematical expressions that are in closed-form and concise, the model is inherently explainable both at the level of a single classification decision and the model's decision process.
An evaluation on the PaySim data set demonstrates competitive predictive performance with state-of-the-art models, while surpassing them in terms of explainability.
arXiv Detail & Related papers (2023-12-01T13:50:55Z)
- An engine to simulate insurance fraud network data [1.3812010983144802]
We develop a simulation machine that is engineered to create synthetic data with a network structure.
We can specify the total number of policyholders and parties, the desired level of imbalance and the (effect size of the) features in the fraud generating model.
The simulation engine enables researchers and practitioners to examine several methodological challenges as well as to test their (development strategy of) insurance fraud detection models.
arXiv Detail & Related papers (2023-08-21T13:14:00Z)
- Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z)
- Credit Card Fraud Detection Using Enhanced Random Forest Classifier for Imbalanced Data [0.8223798883838329]
This paper implements the random forest (RF) algorithm to address the problem at hand.
A dataset of credit card transactions was used in this study.
arXiv Detail & Related papers (2023-03-11T22:59:37Z)
- Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries [53.222218035435006]
We use adversarial tools to optimize for queries that are discriminative and diverse.
Our improvements achieve significantly more accurate membership inference than existing methods.
arXiv Detail & Related papers (2022-10-19T17:46:50Z)
- Empirical study of Machine Learning Classifier Evaluation Metrics behavior in Massively Imbalanced and Noisy data [0.0]
We develop a theoretical foundation to model human annotation errors and extreme imbalance typical in real world fraud detection data sets.
We demonstrate that a combined F1 score and g-mean, in that specific order, is the best evaluation metric for typical imbalanced fraud detection model classification.
arXiv Detail & Related papers (2022-08-25T07:30:31Z)
- Challenges and Complexities in Machine Learning based Credit Card Fraud Detection [0.0]
The volume of transactions, the uniqueness of frauds, and the ingenuity of fraudsters are the main challenges in detecting fraud.
The advent of machine learning, artificial intelligence and big data has opened up new tools in the fight against frauds.
However, the development of fraud detection algorithms has been challenging and slow due to the massively unbalanced nature of fraud data.
arXiv Detail & Related papers (2022-08-20T07:53:51Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
Main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
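
The Average Thresholded Confidence (ATC) entry in the list above describes learning a confidence threshold and using it to estimate accuracy on unlabeled data. A minimal sketch of that idea follows, assuming the threshold is chosen on labeled source data so that the fraction of source examples at or above it matches the source accuracy. The function names and the toy confidence values are hypothetical, and the maximum class probability stands in for whatever confidence score a real model would provide.

```python
def learn_atc_threshold(source_conf, source_correct):
    """Choose t so that the share of source examples with confidence >= t
    equals the source accuracy (exactly, when confidences have no ties)."""
    n_correct = sum(source_correct)
    if n_correct == 0:
        return float("inf")  # nothing was correct, so nothing should pass
    ranked = sorted(source_conf, reverse=True)
    return ranked[n_correct - 1]  # the n_correct-th highest confidence


def predict_target_accuracy(target_conf, threshold):
    """Estimate accuracy as the fraction of unlabeled examples above the threshold."""
    return sum(1 for c in target_conf if c >= threshold) / len(target_conf)


# Labeled source validation data: confidence and whether the prediction was right.
source_conf = [0.95, 0.90, 0.80, 0.60, 0.55]
source_correct = [1, 1, 1, 0, 0]  # source accuracy = 3/5

# Unlabeled target data: only confidences are available.
target_conf = [0.99, 0.85, 0.70, 0.65, 0.50, 0.40, 0.90, 0.30]

t = learn_atc_threshold(source_conf, source_correct)
print(t, predict_target_accuracy(target_conf, t))
```

Here three of the five source predictions are correct, so the threshold lands on the third-highest source confidence, and three of the eight target confidences reach it.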