DBAIOps: A Reasoning LLM-Enhanced Database Operation and Maintenance System using Knowledge Graphs
- URL: http://arxiv.org/abs/2508.01136v1
- Date: Sat, 02 Aug 2025 01:36:57 GMT
- Title: DBAIOps: A Reasoning LLM-Enhanced Database Operation and Maintenance System using Knowledge Graphs
- Authors: Wei Zhou, Peng Sun, Xuanhe Zhou, Qianglei Zang, Ji Xu, Tieying Zhang, Guoliang Li, Fan Wu,
- Abstract summary: Existing automatic database O&M methods, including commercial products, cannot effectively utilize expert experience.<n>We present DBAIOps, a novel hybrid database O&M system that combines reasonings with knowledge graphs to achieve DBA-style diagnosis.
- Score: 25.16965474653075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The operation and maintenance (O&M) of database systems is critical to ensuring system availability and performance, typically requiring expert experience (e.g., identifying metric-to-anomaly relations) for effective diagnosis and recovery. However, existing automatic database O&M methods, including commercial products, cannot effectively utilize expert experience. On the one hand, rule-based methods only support basic O&M tasks (e.g., metric-based anomaly detection), which are mostly numerical equations and cannot effectively incorporate literal O&M experience (e.g., troubleshooting guidance in manuals). On the other hand, LLM-based methods, which retrieve fragmented information (e.g., standard documents + RAG), often generate inaccurate or generic results. To address these limitations, we present DBAIOps, a novel hybrid database O&M system that combines reasoning LLMs with knowledge graphs to achieve DBA-style diagnosis. First, DBAIOps introduces a heterogeneous graph model for representing the diagnosis experience, and proposes a semi-automatic graph construction algorithm to build that graph from thousands of documents. Second, DBAIOps develops a collection of (800+) reusable anomaly models that identify both directly alerted metrics and implicitly correlated experience and metrics. Third, for each anomaly, DBAIOps proposes a two-stage graph evolution mechanism to explore relevant diagnosis paths and identify missing relations automatically. It then leverages a reasoning LLM (e.g., DeepSeek-R1) to infer root causes and generate clear diagnosis reports for both DBAs and common users. Our evaluation over four mainstream database systems (Oracle, MySQL, PostgreSQL, and DM8) demonstrates that DBAIOps outperforms state-of-the-art baselines, 34.85% and 47.22% higher in root cause and human evaluation accuracy, respectively.
Related papers
- Leveraging Knowledge Graphs and LLM Reasoning to Identify Operational Bottlenecks for Warehouse Planning Assistance [1.2749527861829046]
Our framework integrates Knowledge Graphs (KGs) and Large Language Model (LLM)-based agents.<n>It transforms raw DES data into a semantically rich KG, capturing relationships between simulation events and entities.<n>An LLM-based agent uses iterative reasoning, generating interdependent sub-questions. For each sub-question, it creates Cypher queries for KG interaction, extracts information, and self-reflects to correct errors.
arXiv Detail & Related papers (2025-07-23T07:18:55Z) - Toward Understanding Bugs in Vector Database Management Systems [11.916195480211648]
Vector database management systems (VDBMSs) play a crucial role in facilitating semantic similarity searches over high-dimensional embeddings from diverse data sources.<n>Traditional database reliability models cannot be directly applied to VDBMSs because of fundamental differences in data representation, query mechanisms, and system architecture.<n>We manually analyzed 1,671 bug-fix pull requests from 15 widely used open-source VDBMSs and developed a comprehensive taxonomy of bugs based on symptoms, root causes, and developer fix strategies.
arXiv Detail & Related papers (2025-06-03T08:34:01Z) - Complex System Diagnostics Using a Knowledge Graph-Informed and Large Language Model-Enhanced Framework [0.0]
We present a novel diagnostic framework that integrates Knowledge Graphs (KGs) and Large Language Models (LLMs)<n>Our approach introduces a diagnostic framework grounded in the functional modeling principles of the Dynamic Master Logic (DML) model.<n>A case study on an auxiliary feedwater system demonstrated the framework's effectiveness, with over 90% accuracy in key elements and consistent tool and argument extraction.
arXiv Detail & Related papers (2025-05-27T14:54:49Z) - AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios [51.46347732659174]
Large Language Models (LLMs) have demonstrated advanced capabilities in real-world agentic applications.<n>AgentIF is the first benchmark for systematically evaluating LLM instruction following ability in agentic scenarios.
arXiv Detail & Related papers (2025-05-22T17:31:10Z) - DrugPilot: LLM-based Parameterized Reasoning Agent for Drug Discovery [54.79763887844838]
Large language models (LLMs) integrated with autonomous agents hold significant potential for advancing scientific discovery through automated reasoning and task execution.<n>We introduce DrugPilot, a LLM-based agent system with a parameterized reasoning architecture designed for end-to-end scientific in drug discovery.<n>DrugPilot significantly outperforms state-of-the-art agents such as ReAct and LoT, achieving task completion rates of 98.0%, 93.5%, and 64.0% for simple, multi-tool, and multi-turn scenarios, respectively.
arXiv Detail & Related papers (2025-05-20T05:18:15Z) - DiscoveryBench: Towards Data-Driven Discovery with Large Language Models [50.36636396660163]
We present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery.
Our benchmark contains 264 tasks collected across 6 diverse domains, such as sociology and engineering.
Our benchmark, thus, illustrates the challenges in autonomous data-driven discovery and serves as a valuable resource for the community to make progress.
arXiv Detail & Related papers (2024-07-01T18:58:22Z) - MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.<n>We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.<n>Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z) - D-Bot: Database Diagnosis System using Large Language Models [30.20192093986365]
Database administrators (DBAs) play an important role in managing, maintaining and optimizing database systems.
Recently large language models (LLMs) have shown great potential in various fields.
We propose D-Bot, an LLM-based database diagnosis system that can automatically acquire knowledge from diagnosis documents.
arXiv Detail & Related papers (2023-12-03T16:58:10Z) - A Deep Instance Generative Framework for MILP Solvers Under Limited Data
Availability [66.37474135424637]
We propose G2MILP, the first deep generative framework for MILP instances.
G2MILP represents MILP instances as bipartite graphs, and applies a masked variational autoencoder to iteratively corrupt and replace parts of the original graphs to generate new ones.
We design a suite of benchmarks to evaluate the quality of the generated MILP instances.
arXiv Detail & Related papers (2023-10-04T13:34:34Z) - LLM As DBA [25.92711955279298]
Large language models (LLMs) have shown great potential to understand valuable documents and generate reasonable answers.
This paper presents a revolutionary LLM-centric framework for database maintenance, including (i) database maintenance knowledge detection from documents and tools, (ii) tree of thought reasoning for root cause analysis, and (iii) collaborative diagnosis among multiple LLMs.
arXiv Detail & Related papers (2023-08-10T10:12:43Z) - Interpretable Medical Diagnostics with Structured Data Extraction by
Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports.
We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM.
We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.