A machine learning model to identify corruption in México's public
procurement contracts
- URL: http://arxiv.org/abs/2211.01478v2
- Date: Wed, 14 Dec 2022 20:17:10 GMT
- Title: A machine learning model to identify corruption in México's public
procurement contracts
- Authors: Andrés Aldana, Andrea Falcón-Cortés and Hernán Larralde
- Abstract summary: This paper proposes a machine learning model to identify and predict corrupt contracts in México's public procurement data.
We found that the most critical predictors considered in the model are those related to the relationship between buyers and suppliers.
Our work presents a tool that can help in the decision-making process to identify, predict and analyze corruption in public procurement contracts.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The costs and impacts of government corruption range from impairing a
country's economic growth to affecting its citizens' well-being and safety.
Public contracting between government dependencies and private sector
instances, referred to as public procurement, is a fertile land of opportunity
for corrupt practices, generating substantial monetary losses worldwide. Thus,
identifying and deterring corrupt activities between the government and the
private sector is paramount. However, due to several factors, corruption in
public procurement is challenging to identify and track, leading to corrupt
practices going unnoticed. This paper proposes a machine learning model based
on an ensemble of random forest classifiers, which we call hyper-forest, to
identify and predict corrupt contracts in México's public procurement data.
This method's results correctly detect most of the corrupt and non-corrupt
contracts evaluated in the dataset. Furthermore, we found that the most
critical predictors considered in the model are those related to the
relationship between buyers and suppliers rather than those related to features
of individual contracts. Also, the method proposed here is general enough to be
trained with data from other countries. Overall, our work presents a tool that
can help in the decision-making process to identify, predict and analyze
corruption in public procurement contracts.
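The abstract describes the model only as an ensemble of random forest classifiers. A minimal sketch of that general idea follows. It is an illustration under stated assumptions, not the authors' hyper-forest: the class-balanced subsampling, the probability averaging, the 0.5 threshold, the helper names (`fit_forest_ensemble`, `predict_ensemble`, `mean_importances`) and the synthetic data are all hypothetical choices made here, and scikit-learn and NumPy are assumed to be installed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def fit_forest_ensemble(X, y, n_forests=5, seed=0):
    """Fit several random forests, each on its own class-balanced subsample.

    Balanced subsampling is an assumption of this sketch, not a detail
    taken from the paper.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    n = min(len(pos), len(neg))
    forests = []
    for i in range(n_forests):
        # Draw equally many positive and negative rows without replacement.
        idx = np.concatenate([
            rng.choice(pos, n, replace=False),
            rng.choice(neg, n, replace=False),
        ])
        forest = RandomForestClassifier(n_estimators=50, random_state=seed + i)
        forests.append(forest.fit(X[idx], y[idx]))
    return forests


def predict_ensemble(forests, X):
    """Average the positive-class probabilities and threshold at 0.5."""
    proba = np.mean([f.predict_proba(X)[:, 1] for f in forests], axis=0)
    return (proba >= 0.5).astype(int)


def mean_importances(forests):
    """Average the impurity-based feature importances over all forests."""
    return np.mean([f.feature_importances_ for f in forests], axis=0)


# Synthetic stand-in for labelled contract features (1 = flagged contract).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4,
    weights=[0.8, 0.2], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

forests = fit_forest_ensemble(X_train, y_train)
y_pred = predict_ensemble(forests, X_test)
accuracy = float(np.mean(y_pred == y_test))
importances = mean_importances(forests)

print("held-out accuracy:", round(accuracy, 3))
print("features ranked by mean importance:", np.argsort(importances)[::-1])
```

Ranking the averaged importances is one simple way to ask which predictors matter most, in the spirit of the abstract's comparison between buyer-supplier relationship features and individual-contract features. With real procurement data the columns of `X` would be those features rather than synthetic ones.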
Related papers
- Differentially Private Data Release on Graphs: Inefficiencies and Unfairness [48.96399034594329]
This paper characterizes the impact of Differential Privacy on bias and unfairness in the context of releasing information about networks.
We consider a network release problem where the network structure is known to all, but the weights on edges must be released privately.
Our work provides theoretical foundations and empirical evidence into the bias and unfairness arising due to privacy in these networked decision problems.
arXiv Detail & Related papers (2024-08-08T08:37:37Z)
- The Impact of Differential Feature Under-reporting on Algorithmic Fairness [86.275300739926]
We present an analytically tractable model of differential feature under-reporting.
We then use it to characterize the impact of this kind of data bias on algorithmic fairness.
Our results show that, in real world data settings, under-reporting typically leads to increasing disparities.
arXiv Detail & Related papers (2024-01-16T19:16:22Z)
- Locally Differentially Private Embedding Models in Distributed Fraud
Prevention Systems [2.001149416674759]
We present a collaborative deep learning framework for fraud prevention, designed from a privacy standpoint, and awarded at the recent PETs Prize Challenges.
We leverage latent embedded representations of varied-length transaction sequences, along with local differential privacy, in order to construct a data release mechanism which can securely inform externally hosted fraud and anomaly detection models.
We assess our contribution on two distributed data sets donated by large payment networks, and demonstrate robustness to popular inference-time attacks, along with utility-privacy trade-offs analogous to published work in alternative application domains.
arXiv Detail & Related papers (2024-01-03T14:04:18Z)
- Corruptions of Supervised Learning Problems: Typology and Mitigations [11.294508617469905]
We develop a general theory of corruption from an information-theoretic perspective.
We will focus here on changes in probability distributions.
This work sheds light on complexities arising from joint and dependent corruptions on both labels and attributes.
arXiv Detail & Related papers (2023-07-17T16:57:01Z)
- Frequency-Based Vulnerability Analysis of Deep Learning Models against
Image Corruptions [48.34142457385199]
We present MUFIA, an algorithm designed to identify the specific types of corruptions that can cause models to fail.
We find that even state-of-the-art models trained to be robust against known common corruptions struggle against the low visibility-based corruptions crafted by MUFIA.
arXiv Detail & Related papers (2023-06-12T15:19:13Z)
- FedForgery: Generalized Face Forgery Detection with Residual Federated
Learning [87.746829550726]
Existing face forgery detection methods directly utilize the obtained public shared or centralized data for training.
The paper proposes a novel generalized residual Federated learning method for face Forgery detection (FedForgery).
Experiments conducted on publicly available face forgery detection datasets prove the superior performance of the proposed FedForgery.
arXiv Detail & Related papers (2022-10-18T03:32:18Z)
- Soft Diffusion: Score Matching for General Corruptions [84.26037497404195]
We propose a new objective called Soft Score Matching that provably learns the score function for any linear corruption process.
We show that our objective learns the gradient of the likelihood under suitable regularity conditions for the family of corruption processes.
Our method achieves a state-of-the-art FID score of $1.85$ on CelebA-64, outperforming all previous linear diffusion models.
arXiv Detail & Related papers (2022-09-12T17:45:03Z)
- Scrutinizing Shipment Records To Thwart Illegal Timber Trade [14.559268536152926]
Grey and black market activities in the wood and forest products sector are not limited to the countries where the wood was harvested, but extend throughout the global supply chain.
Existing approaches suffer from certain shortcomings in their applicability towards large scale trade data.
We propose Contrastive Learning based Heterogeneous Anomaly Detection (CHAD) that is generally applicable for large-scale heterogeneous data.
arXiv Detail & Related papers (2022-07-31T18:54:52Z)
- Using the Overlapping Score to Improve Corruption Benchmarks [6.445605125467574]
We propose a metric called corruption overlapping score, which can be used to reveal flaws in corruption benchmarks.
We argue that taking into account overlappings between corruptions can help to improve existing benchmarks or build better ones.
arXiv Detail & Related papers (2021-05-26T06:42:54Z)
- Characterization of the Firm-Firm Public Procurement Co-Bidding Network
from the State of Ceará (Brazil) Municipalities [58.720142291102135]
We study the co-bidding relationships between firms that participate in public tenders issued by the $184$ municipalities of the State of Ceará (Brazil) between 2015 and 2019.
We identify $22$ groups/communities of firms with similar patterns of procurement activity, defined by their geographic and activity.
arXiv Detail & Related papers (2021-04-17T13:58:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.