Related papers: Anomaly Detection in Double-entry Bookkeeping Data by Federated Learning System with Non-model Sharing Approach

URL: http://arxiv.org/abs/2501.12723v1
Date: Wed, 22 Jan 2025 08:53:12 GMT
Title: Anomaly Detection in Double-entry Bookkeeping Data by Federated Learning System with Non-model Sharing Approach
Authors: Sota Mashiko, Yuji Kawamata, Tomoru Nakayama, Tetsuya Sakurai, Yukihiko Okada
Abstract summary: Anomaly detection is crucial in financial auditing and effective detection often requires obtaining large volumes of data from multiple organizations. In this study, we propose a novel framework employing Data Collaboration (DC) analysis to streamline model training into a single communication round. Our findings represent a significant advance in artificial intelligence-driven auditing and underscore the potential of FL methods in high-security domains.
Score: 3.827294988616478
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Anomaly detection is crucial in financial auditing and effective detection often requires obtaining large volumes of data from multiple organizations. However, confidentiality concerns hinder data sharing among audit firms. Although the federated learning (FL)-based approach, FedAvg, has been proposed to address this challenge, its use of multiple communication rounds increases its overhead, limiting its practicality. In this study, we propose a novel framework employing Data Collaboration (DC) analysis -- a non-model share-type FL method -- to streamline model training into a single communication round. Our method first encodes journal entry data via dimensionality reduction to obtain secure intermediate representations, then transforms them into collaboration representations for building an autoencoder that detects anomalies. We evaluate our approach on a synthetic dataset and real journal entry data from multiple organizations. The results show that our method not only outperforms single-organization baselines but also exceeds FedAvg in non-i.i.d. experiments on real journal entry data that closely mirror real-world conditions. By preserving data confidentiality and reducing iterative communication, this study addresses a key auditing challenge -- ensuring data confidentiality while integrating knowledge from multiple audit firms. Our findings represent a significant advance in artificial intelligence-driven auditing and underscore the potential of FL methods in high-security domains.
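
Illustrative sketch (not taken from the paper): the abstract describes encoding data into intermediate representations and then transforming them into collaboration representations. The NumPy snippet below shows one common way such an alignment can be set up, assuming a shared anchor dataset, private random linear reductions, and a least-squares fit to a common target obtained by SVD. All names (anchor, F_a, F_b, collaboration_maps) and the toy low-rank data are hypothetical, and the autoencoder that would be trained on the pooled representations is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 8, 3, 40  # feature dim, reduced dim, number of anchor rows (all assumed)

# Toy assumption: real and anchor data share one rank-k linear structure,
# so the alignment below is exact up to numerical error.
basis = rng.normal(size=(k, d))
anchor = rng.normal(size=(r, k)) @ basis    # shared, non-confidential anchor data
samples = rng.normal(size=(5, k)) @ basis   # rows encoded by both parties, for checking only

# Each party keeps its own linear reduction private (hypothetical choice of encoder).
F_a = rng.normal(size=(d, k))
F_b = rng.normal(size=(d, k))

def collaboration_maps(anchor_reps, dim):
    """Return one matrix G_i per party so that anchor_reps[i] @ G_i approximates a common target."""
    stacked = np.hstack(anchor_reps)                     # parties' anchor encodings side by side
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    target = U[:, :dim]                                  # common target from top singular vectors
    return [np.linalg.lstsq(A, target, rcond=None)[0] for A in anchor_reps]

# Single exchange: parties send encoded anchors and receive their alignment matrix.
G_a, G_b = collaboration_maps([anchor @ F_a, anchor @ F_b], dim=k)

inter_a, inter_b = samples @ F_a, samples @ F_b   # intermediate representations
rep_a, rep_b = inter_a @ G_a, inter_b @ G_b       # collaboration representations

print(np.allclose(inter_a, inter_b))              # encodings differ between parties
print(np.allclose(rep_a, rep_b, atol=1e-6))       # aligned encodings should agree
```

In this toy setting the two parties' intermediate representations of the same rows differ, while their collaboration representations should agree up to numerical error, which is what allows a single downstream model to be trained on pooled representations. With real data and anchors that do not share an exact low-rank structure, the alignment would only be approximate.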

Related papers

Privacy-Preserved Automated Scoring using Federated Learning for Educational Research [1.2556373621040728]
This study proposes a federated learning framework for automatic scoring in educational assessments. Student responses are processed locally on edge devices, and only optimized model parameters are shared with a central aggregation server. We evaluate our framework using assessment data from nine middle schools, comparing the accuracy of federated learning-based scoring models with traditionally trained centralized models.
arXiv Detail & Related papers (2025-03-12T19:06:25Z)
A Two-Stage Federated Learning Approach for Industrial Prognostics Using Large-Scale High-Dimensional Signals [1.2277343096128712]
Industrial prognostics aims to develop data-driven methods that leverage high-dimensional degradation signals from assets to predict their failure times. In practice, individual organizations often lack sufficient data to independently train reliable prognostic models. This article proposes a statistical learning-based federated model that enables multiple organizations to jointly train a prognostic model.
arXiv Detail & Related papers (2024-10-14T21:26:22Z)
Fin-Fed-OD: Federated Outlier Detection on Financial Tabular Data [11.027356898413139]
Anomaly detection in real-world scenarios poses challenges due to dynamic and often unknown anomaly distributions. This paper addresses the question of enhancing outlier detection within individual organizations without compromising data confidentiality. We propose a novel method leveraging representation learning and federated learning techniques to improve the detection of unknown anomalies.
arXiv Detail & Related papers (2024-04-23T11:22:04Z)
Exploring Federated Unlearning: Analysis, Comparison, and Insights [101.64910079905566]
Federated unlearning enables the selective removal of data from models trained in federated systems. This paper examines existing federated unlearning approaches, assessing their algorithmic efficiency, impact on model accuracy, and effectiveness in preserving privacy. We propose the OpenFederatedUnlearning framework, a unified benchmark for evaluating federated unlearning methods.
arXiv Detail & Related papers (2023-10-30T01:34:33Z)
Momentum Benefits Non-IID Federated Learning Simply and Provably [22.800862422479913]
Federated learning is a powerful paradigm for large-scale machine learning. FedAvg and SCAFFOLD are two prominent algorithms to address these challenges. This paper explores the utilization of momentum to enhance the performance of FedAvg and SCAFFOLD.
arXiv Detail & Related papers (2023-06-28T18:52:27Z)
Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models. It focuses on preventing bias and discrimination, ensures fidelity to the source data, assesses utility, robustness, and privacy preservation. We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z)
MAPS: A Noise-Robust Progressive Learning Approach for Source-Free Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation. This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z)
DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
Federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data. We propose a general framework to solve the above two challenges simultaneously. We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance. Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models. In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z)
Auto-weighted Robust Federated Learning with Corrupted Data Sources [7.475348174281237]
Federated learning provides a communication-efficient and privacy-preserving training process. Standard federated learning techniques that naively minimize an average loss function are vulnerable to data corruptions. We propose Auto-weighted Robust Federated Learning (ARFL) to provide robustness against corrupted data sources.
arXiv Detail & Related papers (2021-01-14T21:54:55Z)
Privacy-preserving Traffic Flow Prediction: A Federated Learning Approach [61.64006416975458]
We propose a privacy-preserving machine learning technique named Federated Learning-based Gated Recurrent Unit neural network algorithm (FedGRU) for traffic flow prediction. FedGRU differs from current centralized learning methods and updates universal learning models through a secure parameter aggregation mechanism. It is shown that FedGRU's prediction accuracy is 90.96% higher than the advanced deep learning models.
arXiv Detail & Related papers (2020-03-19T13:07:49Z)
Stratified cross-validation for unbiased and privacy-preserving federated learning [0.0]
We focus on the recurrent problem of duplicated records that, if not handled properly, may cause over-optimistic estimations of a model's performances. We introduce and discuss stratified cross-validation, a validation methodology that leverages stratification techniques to prevent data leakage in federated learning settings.
arXiv Detail & Related papers (2020-01-22T15:49:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.