LEMMA-RCA: A Large Multi-modal Multi-domain Dataset for Root Cause Analysis
- URL: http://arxiv.org/abs/2406.05375v2
- Date: Thu, 26 Sep 2024 22:42:49 GMT
- Title: LEMMA-RCA: A Large Multi-modal Multi-domain Dataset for Root Cause Analysis
- Authors: Lecheng Zheng, Zhengzhang Chen, Dongjie Wang, Chengyuan Deng, Reon Matsuoka, Haifeng Chen
- Abstract summary: Root cause analysis (RCA) is crucial for enhancing the reliability and performance of complex systems.
We introduce LEMMA-RCA, a large dataset designed for diverse RCA tasks across multiple domains and modalities.
We evaluate the quality of LEMMA-RCA by testing the performance of eight baseline methods on this dataset.
- Score: 32.816594249593955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Root cause analysis (RCA) is crucial for enhancing the reliability and performance of complex systems. However, progress in this field has been hindered by the lack of large-scale, open-source datasets tailored for RCA. To bridge this gap, we introduce LEMMA-RCA, a large dataset designed for diverse RCA tasks across multiple domains and modalities. LEMMA-RCA features various real-world fault scenarios from IT and OT operation systems, encompassing microservices, water distribution, and water treatment systems, with hundreds of system entities involved. We evaluate the quality of LEMMA-RCA by testing the performance of eight baseline methods on this dataset under various settings, including offline and online modes as well as single and multiple modalities. Our experimental results demonstrate the high quality of LEMMA-RCA. The dataset is publicly available at https://lemma-rca.github.io/.
Related papers
- Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation [2.549112678136113]
Retrieval-Augmented Generation (RAG) mitigates hallucination and outdated knowledge in large language models by integrating external dynamic information, improving factual and up-to-date grounding.
Cross-modal alignment and reasoning introduce unique challenges to Multimodal RAG, distinguishing it from traditional unimodal RAG.
This survey lays the foundation for developing more capable and reliable AI systems.
arXiv Detail & Related papers (2025-02-12T22:33:41Z)
- Towards Efficient Large Multimodal Model Serving [19.388562622309838]
Large multi-modal models (LMMs) are capable of simultaneously processing inputs of various modalities such as text, images, video, and audio.
These models pose significant serving challenges due to their complex architectures and heterogeneous resource requirements.
We propose a decoupled serving architecture that enables independent resource allocation and adaptive scaling for each stage.
arXiv Detail & Related papers (2025-02-02T22:10:40Z)
- RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data [13.68949728404533]
Root cause analysis (RCA) for microservice systems has gained significant attention in recent years.
There is still no standard benchmark that includes large-scale datasets and supports comprehensive evaluation environments.
We introduce RCAEval, an open-source benchmark that provides datasets and an evaluation environment for RCA in microservice systems.
arXiv Detail & Related papers (2024-12-22T13:30:02Z)
- Multi-modal Retrieval Augmented Multi-modal Generation: Datasets, Evaluation Metrics and Strong Baselines [64.61315565501681]
Multi-modal Retrieval Augmented Multi-modal Generation (M$^2$RAG) is a novel task that enables foundation models to process multi-modal web content.
Despite its potential impact, M$^2$RAG remains understudied, lacking comprehensive analysis and high-quality data resources.
arXiv Detail & Related papers (2024-11-25T13:20:19Z)
- Online Multi-modal Root Cause Analysis [61.94987309148539]
Root Cause Analysis (RCA) is essential for pinpointing the root causes of failures in microservice systems.
Existing online RCA methods handle only single-modal data, overlooking complex interactions in multi-modal systems.
We introduce OCEAN, a novel online multi-modal causal structure learning method for root cause localization.
arXiv Detail & Related papers (2024-10-13T21:47:36Z)
- Source-Free Collaborative Domain Adaptation via Multi-Perspective Feature Enrichment for Functional MRI Analysis [55.03872260158717]
Resting-state functional MRI (rs-fMRI) is increasingly employed in multi-site research to aid neurological disorder analysis.
Many methods have been proposed to reduce fMRI heterogeneity between source and target domains.
However, acquiring source data is challenging due to privacy concerns and/or data storage burdens in multi-site studies.
We design a source-free collaborative domain adaptation framework for fMRI analysis, where only a pretrained source model and unlabeled target data are accessible.
arXiv Detail & Related papers (2023-08-24T01:30:18Z)
- Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework [51.44863255495668]
Multimodal reasoning is a critical component in the pursuit of artificial intelligence systems that exhibit human-like intelligence.
We present the Multi-Modal Reasoning (COCO-MMR) dataset, a novel dataset that encompasses an extensive collection of open-ended questions.
We propose innovative techniques, including multi-hop cross-modal attention and sentence-level contrastive learning, to enhance the image and text encoders.
arXiv Detail & Related papers (2023-07-24T08:58:25Z)
- Cross-Modal Fine-Tuning: Align then Refine [83.37294254884446]
ORCA is a cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse modalities.
We show that ORCA obtains state-of-the-art results on 3 benchmarks containing over 60 datasets from 12 modalities.
arXiv Detail & Related papers (2023-02-11T16:32:28Z)
- More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification [43.35966675372692]
We show different fusion strategies as well as how to train deep networks and build the network architecture.
Our framework is not limited to pixel-wise classification tasks; it is also applicable to spatial information modeling with convolutional neural networks (CNNs).
arXiv Detail & Related papers (2020-08-12T17:45:25Z)
- MS-Net: Multi-Site Network for Improving Prostate Segmentation with Heterogeneous MRI Data [75.73881040581767]
We propose a novel multi-site network (MS-Net) for improving prostate segmentation by learning robust representations.
Our MS-Net improves the performance across all datasets consistently, and outperforms state-of-the-art methods for multi-site learning.
arXiv Detail & Related papers (2020-02-09T14:11:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.