Learning from sanctioned government suppliers: A machine learning and network science approach to detecting fraud and corruption in Mexico
- URL: http://arxiv.org/abs/2512.19491v2
- Date: Fri, 26 Dec 2025 12:18:45 GMT
- Title: Learning from sanctioned government suppliers: A machine learning and network science approach to detecting fraud and corruption in Mexico
- Authors: Martí Medina-Hernández, Janos Kertész, Mihály Fazekas
- Abstract summary: This study implements positive-unlabeled (PU) learning algorithms that integrate domain-knowledge-based red flags with network-derived features to identify likely corrupt and fraudulent contracts. The best-performing PU model on average captures 32 percent more known positives and performs on average 2.3 times better than random guessing. This methodology can support law enforcement in Mexico, and it can be adapted to other national contexts too.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Detecting fraud and corruption in public procurement remains a major challenge for governments worldwide. Most research to date builds on domain-knowledge-based corruption risk indicators of individual contract-level features, and some also analyzes contracting network patterns. A critical barrier for supervised machine learning is the absence of confirmed non-corrupt (negative) examples, which makes conventional machine learning inappropriate for this task. Using publicly available data on federally funded procurement in Mexico and company sanction records, this study implements positive-unlabeled (PU) learning algorithms that integrate domain-knowledge-based red flags with network-derived features to identify likely corrupt and fraudulent contracts. The best-performing PU model on average captures 32 percent more known positives and performs on average 2.3 times better than random guessing, substantially outperforming approaches based solely on traditional red flags. The analysis of the Shapley Additive Explanations reveals that network-derived features, particularly those associated with contracts in the network core or suppliers with high eigenvector centrality, are the most important. Traditional red flags further enhance model performance in line with expectations, albeit mainly for contracts awarded through competitive tenders. This methodology can support law enforcement in Mexico, and it can be adapted to other national contexts too.
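The "times better than random guessing" figure in the abstract can be read as a lift-style score. The abstract does not define the exact metric, so the following is only a minimal sketch of one common reading: precision among the top-k ranked contracts divided by the overall share of known positives. The function name `lift_at_k` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def lift_at_k(scores: Sequence[float], is_known_positive: Sequence[bool], k: int) -> float:
    """Precision among the k highest-scored items divided by the overall base rate.

    Assumption: this is one common definition of "lift over random guessing";
    the paper's own metric may differ.
    """
    n = len(scores)
    if n != len(is_known_positive):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of items")
    total_positives = sum(is_known_positive)
    if total_positives == 0:
        raise ValueError("at least one known positive is required")

    # Rank item indices from highest to lowest model score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits = sum(is_known_positive[i] for i in ranked[:k])

    precision_at_k = hits / k          # share of known positives in the top k
    base_rate = total_positives / n    # what random selection would achieve on average
    return precision_at_k / base_rate


# Toy data (hypothetical): 10 contracts, 2 linked to sanctioned suppliers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [True, False, True, False, False, False, False, False, False, False]
toy_lift = lift_at_k(scores, labels, k=4)
print(f"lift at k=4: {toy_lift:.2f}")
```

In a positive-unlabeled setting the unlabeled contracts may still contain undetected positives, so a score of this kind measures recovery of known positives only and probably understates true precision.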
Related papers
- Financial Fraud Identification and Interpretability Study for Listed Companies Based on Convolutional Neural Network [4.504327589607446]
This paper proposes a financial fraud detection framework for Chinese A-share listed companies based on convolutional neural networks (CNNs). Experiments show that the CNN outperforms logistic regression and LightGBM in accuracy, robustness, and early-warning performance. We find that solvency, ratio structure, governance structure, and internal control are general predictors of fraud, while environmental indicators matter mainly in high-pollution industries.
arXiv Detail & Related papers (2025-12-07T04:14:16Z) - Deep Learning Approaches for Anti-Money Laundering on Mobile Transactions: Review, Framework, and Directions [51.43521977132062]
Money laundering is a financial crime that obscures the origin of illicit funds. The proliferation of mobile payment platforms and smart IoT devices has significantly complicated anti-money laundering investigations. This paper conducts a comprehensive review of deep learning solutions and the challenges associated with their use in AML.
arXiv Detail & Related papers (2025-03-13T05:19:44Z) - Corporate Fraud Detection in Rich-yet-Noisy Financial Graph [13.061327697762287]
Corporate fraud detection aims to automatically recognize companies that conduct wrongful activities such as fraudulent financial statements or illegal insider trading. Previous learning-based methods fail to effectively integrate rich interactions in the company network. We analyze 18-year financial records in China to form three graph datasets with fraud labels.
arXiv Detail & Related papers (2025-02-26T17:05:54Z) - Networks of Networks: Complexity Class Principles Applied to Compound AI Systems Design [63.24275274981911]
Compound AI Systems consisting of many language model inference calls are increasingly employed.
In this work, we construct systems, which we call Networks of Networks (NoNs), organized around the distinction between generating a proposed answer and verifying its correctness.
We introduce a verifier-based judge NoN with K generators, an instantiation of "best-of-K" or "judge-based" compound AI systems.
arXiv Detail & Related papers (2024-07-23T20:40:37Z) - Transaction Fraud Detection via an Adaptive Graph Neural Network [64.9428588496749]
We propose an Adaptive Sampling and Aggregation-based Graph Neural Network (ASA-GNN) that learns discriminative representations to improve the performance of transaction fraud detection.
A neighbor sampling strategy is performed to filter noisy nodes and supplement information for fraudulent nodes.
Experiments on three real financial datasets demonstrate that the proposed method ASA-GNN outperforms state-of-the-art ones.
arXiv Detail & Related papers (2023-07-11T07:48:39Z) - Free Lunch for Generating Effective Outlier Supervision [46.37464572099351]
We propose an ultra-effective method to generate near-realistic outlier supervision.
Our proposed BayesAug significantly reduces the false positive rate by over 12.50% compared with the previous schemes.
arXiv Detail & Related papers (2023-01-17T01:46:45Z) - A machine learning model to identify corruption in México's public procurement contracts [0.0]
This paper proposes a machine learning model to identify and predict corrupt contracts in México's public procurement data.
We found that the most critical predictors considered in the model are those related to the relationship between buyers and suppliers.
Our work presents a tool that can help in the decision-making process to identify, predict and analyze corruption in public procurement contracts.
arXiv Detail & Related papers (2022-10-25T01:22:41Z) - Relational Graph Neural Networks for Fraud Detection in a Super-App environment [53.561797148529664]
We propose a framework of relational graph convolutional networks methods for fraudulent behaviour prevention in the financial services of a Super-App.
We use an interpretability algorithm for graph neural networks to determine the most important relations to the classification task of the users.
Our results show that there is an added value when considering models that take advantage of the alternative data of the Super-App and the interactions found in their high connectivity.
arXiv Detail & Related papers (2021-07-29T00:02:06Z) - On the Importance of Regularisation & Auxiliary Information in OOD Detection [9.340611077939828]
This deficiency demonstrates a fundamental flaw indicating that neural networks often overfit on spurious correlations.
We present two novel objectives that improve the ability of a network to detect out-of-distribution samples.
arXiv Detail & Related papers (2021-07-15T18:57:10Z) - ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning [80.85273827468063]
Existing machine learning-based vulnerability detection methods are limited and only inspect whether the smart contract is vulnerable.
We propose ESCORT, the first Deep Neural Network (DNN)-based vulnerability detection framework for smart contracts.
We show that ESCORT achieves an average F1-score of 95% on six vulnerability types and the detection time is 0.02 seconds per contract.
arXiv Detail & Related papers (2021-03-23T15:04:44Z) - Cross-ethnicity Face Anti-spoofing Recognition Challenge: A Review [79.49390241265337]
Chalearn Face Anti-spoofing Attack Detection Challenge consists of single-modal (e.g., RGB) and multi-modal (e.g., RGB, Depth, Infrared (IR)) tracks.
This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results.
arXiv Detail & Related papers (2020-04-23T06:43:08Z) - A Semi-supervised Graph Attentive Network for Financial Fraud Detection [30.645390612737266]
We propose a semi-supervised attentive graph neural network, named SemiGNN, to utilize the multi-view labeled and unlabeled data for fraud detection.
By utilizing the social relations and the user attributes, our method can achieve a better accuracy compared with the state-of-the-art methods on two tasks.
arXiv Detail & Related papers (2020-02-28T10:35:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.