A Pipeline for Business Intelligence and Data-Driven Root Cause Analysis
on Categorical Data
- URL: http://arxiv.org/abs/2211.06717v1
- Date: Sat, 12 Nov 2022 18:12:10 GMT
- Title: A Pipeline for Business Intelligence and Data-Driven Root Cause Analysis
on Categorical Data
- Authors: Shubham Thakar, Dhananjay Kalbande
- Abstract summary: This paper proposes a new clustering + association rule mining pipeline for getting business insights from data.
The occurrence of any event is explained by its antecedents in the generated rules.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Business intelligence (BI) is any knowledge derived from existing data that
may be strategically applied within a business. Data mining is a technique or
method for extracting BI from data using statistical data modeling. Finding
relationships or correlations between the various data items that have been
collected can be used to boost business performance or at the very least better
comprehend what is going on. Root cause analysis (RCA) is discovering the root
causes of problems or events to identify appropriate solutions. RCA can show
why an event occurred and this can help in avoiding occurrences of an issue in
the future. This paper proposes a new clustering + association rule mining
pipeline for getting business insights from data. The results of this pipeline
are in the form of association rules having consequents, antecedents, and
various metrics to evaluate these rules. The results of this pipeline can help
in anchoring important business decisions and can also be used by data
scientists for updating existing models or while developing new ones. The
occurrence of any event is explained by its antecedents in the generated rules.
Hence this output can also help in data-driven root cause analysis.
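The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering or rule-mining algorithms, nor say how the stages connect, so everything here is an assumption: a tiny k-modes-style clustering with deterministic seeding, followed by brute-force mining of single-antecedent rules inside each cluster. The dataset, the column names, the thresholds, and the helper names (k_modes, mine_rules, explain, pipeline) are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import permutations


def hamming(a, b):
    """Number of attributes on which two categorical records disagree."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, k, n_iter=10):
    """Tiny k-modes-style clustering (assumed stand-in for the paper's step).

    Seeds are the first k distinct records, so the result is deterministic.
    """
    modes = list(dict.fromkeys(records))[:k]
    labels = []
    for _ in range(n_iter):
        # Assign each record to the nearest mode by Hamming distance.
        labels = [min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
                  for r in records]
        # Recompute each mode as the per-column most frequent value.
        for j in range(len(modes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = tuple(Counter(col).most_common(1)[0][0]
                                 for col in zip(*members))
    return labels


def mine_rules(records, columns, min_support=0.5, min_confidence=0.8):
    """Brute-force rules 'A -> C' with one item on each side."""
    n = len(records)
    transactions = [{f"{c}={v}" for c, v in zip(columns, r)} for r in records]
    item_count = Counter(item for t in transactions for item in t)
    rules = []
    for a, c in permutations(sorted(item_count), 2):
        both = sum(1 for t in transactions if a in t and c in t)
        support = both / n                       # P(A and C)
        confidence = both / item_count[a]        # P(C | A)
        lift = confidence / (item_count[c] / n)  # P(C | A) / P(C)
        if support >= min_support and confidence >= min_confidence:
            rules.append({"antecedent": a, "consequent": c,
                          "support": support, "confidence": confidence,
                          "lift": lift})
    return rules


def explain(rules, event):
    """Antecedents of every rule whose consequent is the event of interest."""
    return [r["antecedent"] for r in rules if r["consequent"] == event]


def pipeline(records, columns, k):
    """Cluster first, then mine rules separately inside each cluster."""
    labels = k_modes(records, k)
    by_cluster = {}
    for j in sorted(set(labels)):
        members = [r for r, lab in zip(records, labels) if lab == j]
        by_cluster[j] = mine_rules(members, columns)
    return labels, by_cluster


# Hypothetical delivery records: (region, product, outcome).
columns = ("region", "product", "outcome")
records = [
    ("north", "app", "delayed"),
    ("south", "web", "on_time"),
    ("north", "app", "delayed"),
    ("south", "web", "on_time"),
    ("north", "web", "delayed"),
    ("south", "app", "on_time"),
]

labels, rules_by_cluster = pipeline(records, columns, k=2)
for cluster, rules in rules_by_cluster.items():
    for rule in rules:
        print(cluster, rule)
```

Calling explain(rules_by_cluster[0], "outcome=delayed") lists the antecedents that co-occur with the delayed outcome in the first cluster, which is the sense in which the generated rules can support a data-driven root cause analysis. Note that a rule mined inside a homogeneous cluster can have high confidence but a lift of 1, so confidence alone should not be read as evidence of a cause.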
Related papers
- New Rules for Causal Identification with Background Knowledge [59.733125324672656]
We propose two novel rules for incorporating BK, which offer a new perspective to the open problem.
We show that these rules are applicable in some typical causality tasks, such as determining the set of possible causal effects with observational data.
arXiv Detail & Related papers (2024-07-21T20:21:21Z)
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
- PyRCA: A Library for Metric-based Root Cause Analysis [66.72542200701807]
PyRCA is an open-source machine learning library of Root Cause Analysis (RCA) for Artificial Intelligence for IT Operations (AIOps).
It provides a holistic framework to uncover the complicated metric causal dependencies and automatically locate root causes of incidents.
arXiv Detail & Related papers (2023-06-20T09:55:10Z)
- $\texttt{causalAssembly}$: Generating Realistic Production Data for Benchmarking Causal Discovery [1.3048920509133808]
We build a system for generation of semisynthetic manufacturing data that supports benchmarking of causal discovery methods.
We employ distributional random forests to flexibly estimate and represent conditional distributions.
Using the library, we showcase how to benchmark several well-known causal discovery algorithms.
arXiv Detail & Related papers (2023-06-19T10:05:54Z)
- Mining Root Cause Knowledge from Cloud Service Incident Investigations for AIOps [71.12026848664753]
Root Cause Analysis (RCA) of any service-disrupting incident is one of the most critical as well as complex tasks in IT processes.
In this work, we present ICA and the downstream Incident Search and Retrieval based RCA pipeline, built at Salesforce.
arXiv Detail & Related papers (2022-04-21T02:33:34Z)
- Federated Causal Discovery [74.37739054932733]
This paper develops a gradient-based learning framework named DAG-Shared Federated Causal Discovery (DS-FCD).
It can learn the causal graph without directly touching local data and naturally handle the data heterogeneity.
Extensive experiments on both synthetic and real-world datasets verify the efficacy of the proposed method.
arXiv Detail & Related papers (2021-12-07T08:04:12Z)
- Bayesian Model Averaging for Data Driven Decision Making when Causality is Partially Known [0.0]
We use ensemble methods like Bayesian Model Averaging (BMA) to infer set of causal graphs.
We provide decisions by computing the expected value and risk of potential interventions explicitly.
arXiv Detail & Related papers (2021-05-12T01:55:45Z)
- PROVED: A Tool for Graph Representation and Analysis of Uncertain Event Data [0.966840768820136]
The discipline of process mining aims to study processes in a data-driven manner by analyzing historical process executions.
Recent novel types of event data have gathered interest among the process mining community, including uncertain event data.
The PROVED tool helps to explore, navigate and analyze such uncertain event data.
arXiv Detail & Related papers (2021-03-09T17:11:54Z)
- Back to Prior Knowledge: Joint Event Causality Extraction via Convolutional Semantic Infusion [5.566928318239452]
Joint event and causality extraction is a challenging yet essential task in information retrieval and data mining.
We propose convolutional knowledge infusion for frequent n-grams with different windows of length within a joint extraction framework.
Our model significantly outperforms the strong BERT+CSNN baseline.
arXiv Detail & Related papers (2021-02-19T13:31:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.