Related papers: On the Fly Detection of Root Causes from Observed Data with Application to IT Systems

URL: http://arxiv.org/abs/2402.06500v1
Date: Fri, 9 Feb 2024 16:10:19 GMT
Title: On the Fly Detection of Root Causes from Observed Data with Application to IT Systems
Authors: Lei Zan, Charles K. Assaad, Emilie Devijver, Eric Gaussier
Abstract summary: This paper introduces a new structural causal model tailored for representing threshold-based IT systems. It presents a new algorithm designed to rapidly detect root causes of anomalies in such systems.
Score: 3.400056739248712
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper introduces a new structural causal model tailored for representing threshold-based IT systems and presents a new algorithm designed to rapidly detect root causes of anomalies in such systems. When root causes are not causally related, the method is proven to be correct; an extension based on the intervention of an agent is proposed to relax this assumption. Our algorithm and its agent-based extension leverage causal discovery from offline data and engage in subgraph traversal when encountering new anomalies in online data. Our extensive experiments demonstrate the superior performance of our methods, even when applied to data generated from alternative structural causal models or real IT monitoring data.
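Illustration (not from the paper): the abstract mentions thresholds, a causal graph learned offline, and subgraph traversal over new anomalies. The sketch below shows one simple way these pieces could fit together, assuming the graph is already known and that a root cause is an anomalous node with no anomalous parent. The function name `find_root_causes`, the node names, and the threshold values are all hypothetical; this is not the authors' algorithm.

```python
from typing import Dict, List


def find_root_causes(
    parents: Dict[str, List[str]],
    values: Dict[str, float],
    thresholds: Dict[str, float],
) -> List[str]:
    """Return anomalous nodes that have no anomalous parent.

    Assumption: a node is anomalous when its value exceeds its threshold,
    and the causal graph (child -> list of parents) is given.
    """
    # Step 1: restrict attention to the anomalous subgraph.
    anomalous = {node for node, v in values.items() if v > thresholds[node]}

    # Step 2: keep anomalous nodes whose parents are all normal.
    roots = [
        node
        for node in anomalous
        if not any(p in anomalous for p in parents.get(node, []))
    ]
    return sorted(roots)


# Hypothetical system: db -> api -> frontend, cache -> api.
parents = {"api": ["db", "cache"], "frontend": ["api"]}
thresholds = {"db": 0.8, "cache": 0.8, "api": 0.8, "frontend": 0.8}
values = {"db": 0.95, "cache": 0.20, "api": 0.90, "frontend": 0.85}

roots = find_root_causes(parents, values, thresholds)
print(roots)
```

In this toy example only `db` is reported, because `api` and `frontend` each have an anomalous parent. The rule presumes that root causes are not causally related to one another, which is the condition under which the abstract states the method is proven correct.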

Related papers

RADICE: Causal Graph Based Root Cause Analysis for System Performance Diagnostic [3.708415881042821]
Root cause analysis is one of the most crucial operations in software reliability regarding system performance diagnostic. We present a novel causal domain knowledge model representing causal relations about the underlying system components. We then introduce RADICE, an algorithm that through the causal graph discovery, enhancement, refinement, and subtraction processes is able to output a root cause causal sub-graph.
arXiv Detail & Related papers (2025-01-20T15:36:39Z)
Online Multi-modal Root Cause Analysis [61.94987309148539]
Root Cause Analysis (RCA) is essential for pinpointing the root causes of failures in microservice systems. Existing online RCA methods handle only single-modal data, overlooking complex interactions in multi-modal systems. We introduce OCEAN, a novel online multi-modal causal structure learning method for root cause localization.
arXiv Detail & Related papers (2024-10-13T21:47:36Z)
CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series [4.008958683836471]
CAnDOIT is a causal discovery method to reconstruct causal models using both observational and interventional data. The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics. A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub.
arXiv Detail & Related papers (2024-10-03T13:57:08Z)
Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization. We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data. We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
Disentangled Causal Graph Learning for Online Unsupervised Root Cause Analysis [49.910053255238566]
Root cause analysis (RCA) can identify the root causes of system faults/failures by analyzing system monitoring data. Previous research has mostly focused on developing offline RCA algorithms, which often require manually initiating the RCA process. We propose CORAL, a novel online RCA framework that can automatically trigger the RCA process and incrementally update the RCA model.
arXiv Detail & Related papers (2023-05-18T01:27:48Z)
CNTS: Cooperative Network for Time Series [7.356583983200323]
This paper presents a novel approach for unsupervised anomaly detection called the Cooperative Network Time Series (CNTS) approach. The central aspect of CNTS is a multi-objective optimization problem, which is solved through a cooperative solution strategy. Experiments on three real-world datasets demonstrate the state-of-the-art performance of CNTS and confirm the cooperative effectiveness of the detector and reconstructor.
arXiv Detail & Related papers (2023-02-20T06:55:10Z)
Hierarchical Graph Neural Networks for Causal Discovery and Root Cause Localization [52.72490784720227]
REASON consists of Topological Causal Discovery and Individual Causal Discovery. The Topological Causal Discovery component aims to model the fault propagation in order to trace back to the root causes. The Individual Causal Discovery component focuses on capturing abrupt change patterns of a single system entity.
arXiv Detail & Related papers (2023-02-03T20:17:45Z)
Causality-Based Multivariate Time Series Anomaly Detection [63.799474860969156]
We formulate the anomaly detection problem from a causal perspective and view anomalies as instances that do not follow the regular causal mechanism to generate the multivariate data. We then propose a causality-based anomaly detection approach, which first learns the causal structure from data and then infers whether an instance is an anomaly relative to the local causal mechanism. We evaluate our approach with both simulated and public datasets as well as a case study on real-world AIOps applications.
arXiv Detail & Related papers (2022-06-30T06:00:13Z)
Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process. Our method significantly reduces the required number of interactions compared with random intervention targeting. We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z)
Identifying Causal Structure in Dynamical Systems [6.451261098085498]
We propose a method that identifies the causal structure of control systems. Experiments on a robot arm demonstrate reliable causal identification from real-world data.
arXiv Detail & Related papers (2020-06-06T16:17:07Z)
Causal Discovery from Incomplete Data: A Deep Learning Approach [21.289342482087267]
Imputated Causal Learning (ICL) is proposed to perform iterative missing data imputation and causal structure discovery. We show that ICL can outperform state-of-the-art methods under different missing data mechanisms.
arXiv Detail & Related papers (2020-01-15T14:28:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.