Applied Machine Learning to Anomaly Detection in Enterprise Purchase Processes
- URL: http://arxiv.org/abs/2405.14754v1
- Date: Thu, 23 May 2024 16:21:51 GMT
- Title: Applied Machine Learning to Anomaly Detection in Enterprise Purchase Processes
- Authors: A. Herreros-Martínez, R. Magdalena-Benedicto, J. Vila-Francés, A. J. Serrano-López, S. Pérez-Díaz
- Abstract summary: This work proposes a methodology to prioritise the investigation of the cases detected in two large purchase datasets from real data.
The goal is to contribute to the effectiveness of the companies' control efforts and to increase the performance of carrying out such tasks.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In a context of continuous digitalisation of processes, organisations must deal with the challenge of detecting anomalies that can reveal suspicious activities in an increasing volume of data. To pursue this goal, audit engagements are carried out regularly, and internal auditors and purchase specialists are constantly looking for new methods to automate these processes. This work proposes a methodology to prioritise the investigation of the cases detected in two large purchase datasets from real data. The goal is to contribute to the effectiveness of the companies' control efforts and to increase the performance of carrying out such tasks. A comprehensive Exploratory Data Analysis is carried out before applying unsupervised Machine Learning techniques to detect anomalies. A univariate approach is applied through the z-Score index and the DBSCAN algorithm, while a multivariate analysis is implemented with the k-Means and Isolation Forest algorithms and the Silhouette index, so that each method yields its own list of candidate transactions to be reviewed. An ensemble prioritisation of the candidates is provided, together with a proposal of explicability methods (LIME, Shapley, SHAP) to help the company specialists in their understanding.
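The abstract describes several detectors that each propose candidate transactions, followed by an ensemble prioritisation. The sketch below illustrates that idea under loud assumptions and is not the authors' implementation: only the z-score step comes from the abstract, the interquartile-range rule is a simple stand-in for the other detectors (DBSCAN, k-Means, Isolation Forest), and ranking by vote count is one plausible reading of "ensemble prioritisation". The thresholds, function names, and sample amounts are all hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Indices whose absolute z-score exceeds the threshold (3.0 is an assumed default)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return set()  # all values identical: nothing stands out
    return {i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold}


def iqr_flags(values, k=1.5):
    """Stand-in second detector (not from the paper): Tukey's IQR fence rule."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {i for i, v in enumerate(values) if v < low or v > high}


def prioritise(*flag_sets):
    """Rank candidates by how many detectors flagged them (assumed voting scheme)."""
    votes = Counter(i for flags in flag_sets for i in flags)
    return sorted(votes.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical purchase amounts; the last two entries are deliberately unusual.
amounts = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99, 110, 5000]
ranking = prioritise(zscore_flags(amounts), iqr_flags(amounts))
print(ranking)  # (transaction index, number of detectors that flagged it)
```

Transactions flagged by more detectors come first, which gives reviewers an order in which to investigate; a real pipeline would likely weight detectors or combine anomaly scores rather than count raw votes.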
Related papers
- ORMind: A Cognitive-Inspired End-to-End Reasoning Framework for Operations Research [53.736407871322314]
We introduce ORMind, a cognitive-inspired framework that enhances optimization through counterfactual reasoning.
Our approach emulates human cognition, implementing an end-to-end workflow that transforms requirements into mathematical models and executable code.
It is currently being tested internally in Lenovo's AI Assistant, with plans to enhance optimization capabilities for both business and consumer customers.
arXiv Detail & Related papers (2025-06-02T05:11:21Z) - Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs [58.24692529185971]
We introduce a comprehensive auditing framework for unlearning evaluation comprising three benchmark datasets, six unlearning algorithms, and five prompt-based auditing methods.
We evaluate the effectiveness and robustness of different unlearning strategies.
arXiv Detail & Related papers (2025-05-29T09:19:07Z) - On the Interconnections of Calibration, Quantification, and Classifier Accuracy Prediction under Dataset Shift [58.91436551466064]
This paper investigates the interconnections among three fundamental problems (calibration, quantification, and classifier accuracy prediction) under dataset shift conditions.
We show that access to an oracle for any one of these tasks enables the resolution of the other two.
We propose new methods for each problem based on direct adaptations of well-established methods borrowed from the other disciplines.
arXiv Detail & Related papers (2025-05-16T15:42:55Z) - Adaptive Bounded Exploration and Intermediate Actions for Data Debiasing [18.87576995391638]
We propose algorithms for sequentially debiasing the training dataset through adaptive and bounded exploration.
Our proposed algorithms balance between the ultimate goal of mitigating the impacts of data biases -- which will in turn lead to more accurate and fairer decisions.
arXiv Detail & Related papers (2025-04-10T22:22:23Z) - Anomaly Detection in Double-entry Bookkeeping Data by Federated Learning System with Non-model Sharing Approach [3.827294988616478]
Anomaly detection is crucial in financial auditing and effective detection often requires obtaining large volumes of data from multiple organizations.
In this study, we propose a novel framework employing Data Collaboration (DC) analysis to streamline model training into a single communication round.
Our findings represent a significant advance in artificial intelligence-driven auditing and underscore the potential of FL methods in high-security domains.
arXiv Detail & Related papers (2025-01-22T08:53:12Z) - Epoch-based Application of Problem-Aware Operators in a Multiobjective Memetic Algorithm for Portfolio Optimization [0.0]
We consider the issue of intensification/diversification balance in the context of a memetic algorithm for the multiobjective optimization of investment portfolios with cardinality constraints.
We have conducted a sensibility analysis to determine in which phases of the search the application of these operators leads to better results.
Our findings indicate that the resulting algorithm is quite robust in terms of parameterization from the point of view of this problem-specific indicator.
arXiv Detail & Related papers (2024-12-05T08:57:42Z) - WISE: Unraveling Business Process Metrics with Domain Knowledge [0.0]
Anomalies in complex industrial processes are often obscured by high variability and complexity of event data.
We introduce WISE, a novel method for analyzing business process metrics through the integration of domain knowledge, process mining, and machine learning.
We show that WISE enhances automation in business process analysis and effectively detects deviations from desired process flows.
arXiv Detail & Related papers (2024-10-06T07:57:08Z) - How Industry Tackles Anomalies during Runtime: Approaches and Key Monitoring Parameters [4.041882008624403]
This paper seeks to comprehend anomalies and current anomaly detection approaches across diverse industrial sectors.
It also aims to pinpoint the parameters necessary for identifying anomalies via runtime monitoring data.
arXiv Detail & Related papers (2024-08-14T21:10:15Z) - Anomaly Detection via Learning-Based Sequential Controlled Sensing [25.282033825977827]
We address the problem of detecting anomalies among a set of binary processes via learning-based controlled sensing.
To identify the anomalies, the decision-making agent is allowed to observe a subset of the processes at each time instant.
Our objective is to design a sequential selection policy that dynamically determines which processes to observe at each time.
arXiv Detail & Related papers (2023-11-30T07:49:33Z) - Weakly Supervised Anomaly Detection: A Survey [75.26180038443462]
Anomaly detection (AD) is a crucial task in machine learning with various applications.
We present the first comprehensive survey of weakly supervised anomaly detection (WSAD) methods.
For each setting, we provide formal definitions, key algorithms, and potential future directions.
arXiv Detail & Related papers (2023-02-09T10:27:21Z) - DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data.
We propose a general framework to solve the above two challenges simultaneously.
We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z) - A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services.
Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary.
We develop A2Log, which is an unsupervised anomaly detection method consisting of two steps: Anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z) - MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning [65.52675802289775]
We show that an uncertainty aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z) - Scaling up Search Engine Audits: Practical Insights for Algorithm Auditing [68.8204255655161]
We set up experiments for eight search engines with hundreds of virtual agents placed in different regions.
We demonstrate the successful performance of our research infrastructure across multiple data collections.
We conclude that virtual agents are a promising venue for monitoring the performance of algorithms across long periods of time.
arXiv Detail & Related papers (2021-06-10T15:49:58Z) - Approaches to Fraud Detection on Credit Card Transactions Using Artificial Intelligence Methods [0.0]
This paper summarizes state-of-the-art approaches to fraud detection using artificial intelligence and machine learning techniques.
While summarizing, we will categorize the common problems such as imbalanced dataset, real time working scenarios, and feature engineering challenges.
arXiv Detail & Related papers (2020-07-29T06:18:57Z) - Sequential Transfer in Reinforcement Learning with a Generative Model [48.40219742217783]
We show how to reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones.
We derive PAC bounds on its sample complexity which clearly demonstrate the benefits of using this kind of prior knowledge.
We empirically verify our theoretical findings in simple simulated domains.
arXiv Detail & Related papers (2020-07-01T19:53:35Z)