Enhanced Federated Anomaly Detection Through Autoencoders Using Summary Statistics-Based Thresholding
- URL: http://arxiv.org/abs/2410.09284v1
- Date: Fri, 11 Oct 2024 22:21:14 GMT
- Title: Enhanced Federated Anomaly Detection Through Autoencoders Using Summary Statistics-Based Thresholding
- Authors: Sofiane Laridi, Gregory Palmer, Kam-Ming Mark Tam
- Abstract summary: In Federated Learning (FL), anomaly detection is a challenging task due to the decentralized nature of data.
This study introduces a novel federated threshold calculation method that leverages summary statistics from both normal and anomalous data.
Our approach aggregates local summary statistics across clients to compute a global threshold that optimally separates anomalies from normal data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In Federated Learning (FL), anomaly detection (AD) is a challenging task due to the decentralized nature of data and the presence of non-IID data distributions. This study introduces a novel federated threshold calculation method that leverages summary statistics from both normal and anomalous data to improve the accuracy and robustness of anomaly detection using autoencoders (AE) in a federated setting. Our approach aggregates local summary statistics across clients to compute a global threshold that optimally separates anomalies from normal data while ensuring privacy preservation. We conducted extensive experiments using publicly available datasets, including Credit Card Fraud Detection, Shuttle, and Covertype, under various data distribution scenarios. The results demonstrate that our method consistently outperforms existing federated and local threshold calculation techniques, particularly in handling non-IID data distributions. This study also explores the impact of different data distribution scenarios and the number of clients on the performance of federated anomaly detection. Our findings highlight the potential of using summary statistics for threshold calculation in improving the scalability and accuracy of federated anomaly detection systems.
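The aggregation idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact statistics or the threshold formula, so everything below is an assumption made for illustration: each client is assumed to share only the count, mean, and variance of its autoencoder reconstruction errors for normal and for anomalous samples, the server pools them, and the threshold is placed where the standardized distance to both pooled means is equal. The names `Stats`, `local_stats`, `pool`, and `global_threshold` are hypothetical and do not come from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Stats:
    n: int        # number of samples summarized
    mean: float   # mean reconstruction error
    var: float    # population variance of the reconstruction error


def local_stats(errors):
    """Summarize one client's reconstruction errors (raw values stay local)."""
    n = len(errors)
    mean = sum(errors) / n
    var = sum((e - mean) ** 2 for e in errors) / n
    return Stats(n, mean, var)


def pool(stats):
    """Combine per-client summaries into global ones on the server.

    Uses the law of total variance: within-client variance plus the
    spread of client means around the global mean.
    """
    n = sum(s.n for s in stats)
    mean = sum(s.n * s.mean for s in stats) / n
    var = sum(s.n * (s.var + (s.mean - mean) ** 2) for s in stats) / n
    return Stats(n, mean, var)


def global_threshold(normal, anomalous):
    """Assumed heuristic: the point t with (t - mu_n)/sd_n == (mu_a - t)/sd_a."""
    sd_n, sd_a = math.sqrt(normal.var), math.sqrt(anomalous.var)
    if sd_n + sd_a == 0:
        # Degenerate case: both groups are constant, so use the midpoint.
        return (normal.mean + anomalous.mean) / 2
    return (normal.mean * sd_a + anomalous.mean * sd_n) / (sd_n + sd_a)


if __name__ == "__main__":
    # Made-up reconstruction errors for two clients (illustration only).
    normal = pool([local_stats([0.1, 0.2, 0.3]), local_stats([0.2, 0.4])])
    anomalous = pool([local_stats([1.0, 1.4]), local_stats([1.2])])
    print("global threshold:", global_threshold(normal, anomalous))
```

Pooling counts, means, and variances reproduces the statistics of the concatenated data exactly, which is why no raw reconstruction errors need to leave a client in this sketch. The threshold rule itself is only one plausible choice and should not be read as the criterion used in the paper.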
Related papers
- FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data [11.42231457116486]
FedAD-Bench is a benchmark for evaluating unsupervised anomaly detection algorithms within the context of federated learning.
We identify key challenges such as model aggregation inefficiencies and metric unreliability.
Our work aims to establish a standardized benchmark to guide future research and development in federated anomaly detection.
arXiv Detail & Related papers (2024-08-08T13:14:19Z)
- PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
We propose a Parameter-efficient Federated Anomaly Detection framework named PeFAD to address increasing privacy concerns.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
- Anomaly Detection with Score Distribution Discrimination [4.468952886990851]
We propose to optimize the anomaly scoring function from the view of score distribution.
We design a novel loss function called Overlap loss that minimizes the overlap area between the score distributions of normal and abnormal samples.
arXiv Detail & Related papers (2023-06-26T03:32:57Z)
- CADIS: Handling Cluster-skewed Non-IID Data in Federated Learning with Clustered Aggregation and Knowledge DIStilled Regularization [3.3711670942444014]
Federated learning enables edge devices to train a global model collaboratively without exposing their data.
We tackle a new type of Non-IID data, called cluster-skewed non-IID, discovered in actual data sets.
We propose an aggregation scheme that guarantees equality between clusters.
arXiv Detail & Related papers (2023-02-21T02:53:37Z)
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- Federated Anomaly Detection over Distributed Data Streams [0.0]
We propose an approach to building the bridge among anomaly detection, federated learning, and data streams.
The overarching goal of the work is to detect anomalies in a federated environment over distributed data streams.
arXiv Detail & Related papers (2022-05-16T17:38:58Z)
- Data-SUITE: Data-centric identification of in-distribution incongruous examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- Am I Rare? An Intelligent Summarization Approach for Identifying Hidden Anomalies [0.0]
In this paper, we propose an INtelligent Summarization approach for IDENTifying hidden anomalies, called INSIDENT.
Our approach is a clustering-based algorithm that dynamically maps the original feature space to a new feature space by locally weighting features in each cluster.
In addition, selecting representatives based on cluster size preserves the original data distribution in the summarized data.
arXiv Detail & Related papers (2020-12-24T23:22:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.