Related papers: FC-ADL: Efficient Microservice Anomaly Detection and Localisation Through Functional Connectivity

URL: http://arxiv.org/abs/2512.00844v1
Date: Sun, 30 Nov 2025 11:29:30 GMT
Title: FC-ADL: Efficient Microservice Anomaly Detection and Localisation Through Functional Connectivity
Authors: Giles Winchester, George Parisis, Luc Berthouze
Abstract summary: We propose FC-ADL, an end-to-end scalable approach for detecting and localising anomalous changes from microservice metrics. We demonstrate that our approach can achieve top detection and localisation performance across a wide range of fault scenarios when compared to state-of-the-art approaches.
Score: 2.994962964425238
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Microservices have transformed software architecture through the creation of modular and independent services. However, they introduce operational complexities in service integration and system management that make swift and accurate anomaly detection and localisation challenging. Despite the complex, dynamic, and interconnected nature of microservice architectures, prior works that investigate metrics for anomaly detection rarely include explicit information about time-varying interdependencies. And whilst prior works on fault localisation typically do incorporate information about dependencies between microservices, they scale poorly to real-world large-scale deployments due to their reliance on computationally expensive causal inference. To address these challenges, we propose FC-ADL, an end-to-end scalable approach for detecting and localising anomalous changes from microservice metrics based on the neuroscientific concept of functional connectivity. We show that by efficiently characterising time-varying changes in dependencies between microservice metrics we can both detect anomalies and provide root cause candidates without incurring the significant overheads of causal and multivariate approaches. We demonstrate that our approach can achieve top detection and localisation performance across a wide range of fault scenarios when compared to state-of-the-art approaches. Furthermore, we illustrate the scalability of our approach by applying it to Alibaba's extremely large real-world microservice deployment.
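The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of tracking time-varying dependencies between metrics. It assumes windowed Pearson correlation as the dependency measure and the change between consecutive correlation matrices as the anomaly signal; both are assumptions made here, not FC-ADL's actual method. All function names, the window width, and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def connectivity(window):
    """Pairwise correlation matrix for one window; window[i] is metric i's samples."""
    m = len(window)
    return [[pearson(window[i], window[j]) for j in range(m)] for i in range(m)]


def score_windows(series, width):
    """Compare correlation matrices of consecutive non-overlapping windows.

    Returns a list of (window_start, per_metric_change), where
    per_metric_change[i] sums the absolute change in metric i's correlations.
    """
    length = len(series[0])
    previous = None
    results = []
    for start in range(0, length - width + 1, width):
        current = connectivity([s[start:start + width] for s in series])
        if previous is not None:
            change = [
                sum(abs(a - b) for a, b in zip(current[i], previous[i]))
                for i in range(len(series))
            ]
            results.append((start, change))
        previous = current
    return results


# Synthetic data (assumption): three metrics follow a shared signal,
# then metric 2 decouples into independent noise from t = 150 onwards.
random.seed(0)
base = [math.sin(t / 5) for t in range(200)]
metrics = [[b + random.gauss(0, 0.05) for b in base] for _ in range(3)]
for t in range(150, 200):
    metrics[2][t] = random.gauss(0, 1)

results = score_windows(metrics, width=50)

# Detection: the window whose total correlation change is largest.
worst_start, worst_change = max(results, key=lambda r: sum(r[1]))
# Localisation: the metric whose correlations changed the most in that window.
suspect = max(range(len(worst_change)), key=lambda i: worst_change[i])

print("largest change in window starting at", worst_start)
print("top root cause candidate: metric", suspect)
```

In this toy setup the decoupled metric loses its correlation with both of its peers, so its summed change exceeds that of either peer, which is what makes it the top candidate. A real deployment would need a threshold on the change score and a far more efficient computation than this quadratic loop.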

Related papers

Why Does the LLM Stop Computing: An Empirical Study of User-Reported Failures in Open-Source LLMs [50.075587392477935]
We conduct the first large-scale empirical study of 705 real-world failures from the open-source DeepSeek, Llama, and Qwen ecosystems. Our analysis reveals a paradigm shift: white-box orchestration relocates the reliability bottleneck from model algorithmic defects to the systemic fragility of the deployment stack.
arXiv Detail & Related papers (2026-01-20T06:42:56Z)
Hypothesize-Then-Verify: Speculative Root Cause Analysis for Microservices with Pathwise Parallelism [19.31110304702373]
SpecRCA is a speculative root cause analysis framework that adopts a hypothesize-then-verify paradigm. Preliminary experiments on AIOps 2022 demonstrate that SpecRCA achieves superior accuracy and efficiency compared to existing approaches.
arXiv Detail & Related papers (2026-01-06T05:58:25Z)
Graph Neural AI with Temporal Dynamics for Comprehensive Anomaly Detection in Microservices [7.957284443727372]
This study addresses the problem of anomaly detection and root cause tracing in microservice architectures. It proposes a unified framework that combines graph neural networks with temporal modeling.
arXiv Detail & Related papers (2025-11-05T08:28:41Z)
Registration is a Powerful Rotation-Invariance Learner for 3D Anomaly Detection [64.0168648353038]
3D anomaly detection in point-cloud data is critical for industrial quality control, aiming to identify structural defects with high reliability. Current memory bank-based methods often suffer from inconsistent feature transformations and limited discriminative capacity. We propose a registration-induced, rotation-invariant feature extraction framework that integrates the objectives of point-cloud registration and memory-based anomaly detection.
arXiv Detail & Related papers (2025-10-19T14:56:38Z)
Leveraging Network Methods for Hub-like Microservice Detection [48.55946052680251]
The Hub-like microservice anti-pattern lacks an unambiguous definition and detection method. In this work, we aim to find a robust detection approach for the Hub-like microservice anti-pattern.
arXiv Detail & Related papers (2025-06-09T12:13:49Z)
Complexity at Scale: A Quantitative Analysis of an Alibaba Microservice Deployment [1.7124365853633132]
We analyse a microservice deployment dataset released by Alibaba. We identify tens of thousands of characteristics that support an even broader array of front-end functionality. We find that dependencies within the deployment at runtime can be different from the static view of the system.
arXiv Detail & Related papers (2025-04-17T17:50:44Z)
GAL-MAD: Towards Explainable Anomaly Detection in Microservice Applications Using Graph Attention Networks [1.0136215038345013]
Anomalies stemming from network and performance issues must be swiftly identified and addressed. Existing anomaly detection techniques often rely on statistical models or machine learning methods. We propose a novel anomaly detection model called Graph Attention and LSTM-based Microservice Anomaly Detection (GAL-MAD).
arXiv Detail & Related papers (2025-03-31T10:11:31Z)
PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
Motivated by increasing privacy concerns, we propose a parameter-efficient federated anomaly detection framework named PeFAD. We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z)
Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization. We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data. We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
Twin Graph-based Anomaly Detection via Attentive Multi-Modal Learning for Microservice System [24.2074235652359]
We propose MSTGAD, which seamlessly integrates all available data modalities via attentive multi-modal learning. We construct a transformer-based neural network with both spatial and temporal attention mechanisms to model the inter-correlations between different modalities. This enables us to detect anomalies automatically and accurately in real-time.
arXiv Detail & Related papers (2023-10-07T06:28:41Z)
Practical Anomaly Detection over Multivariate Monitoring Metrics for Online Services [29.37493773435177]
CMAnomaly is an anomaly detection framework on multivariate monitoring metrics based on collaborative machine. The proposed framework is extensively evaluated with both public data and industrial data collected from a large-scale online service system of Huawei Cloud. Compared with state-of-the-art baseline models, CMAnomaly achieves an average F1 score of 0.9494, outperforming baselines by 6.77% to 10.68%, and runs 10X to 20X faster.
arXiv Detail & Related papers (2023-08-19T08:08:05Z)
Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data. We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism. We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.