Revelio: ML-Generated Debugging Queries for Distributed Systems
- URL: http://arxiv.org/abs/2106.14347v1
- Date: Mon, 28 Jun 2021 00:23:21 GMT
- Title: Revelio: ML-Generated Debugging Queries for Distributed Systems
- Authors: Pradeep Dogga (1), Karthik Narasimhan (2), Anirudh Sivaraman (3), Shiv
Kumar Saini (4), George Varghese (1), Ravi Netravali (2) ((1) UCLA, (2)
Princeton University, (3) NYU, (4) Adobe Research, India)
- Abstract summary: Revelio takes user reports and system logs as input, and outputs queries that developers can use to find a bug's root cause.
It employs deep neural networks to uniformly embed diverse input sources and potential queries into a high-dimensional vector space.
We show that Revelio includes the most helpful query in its predicted list of top-3 relevant queries 96% of the time.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A major difficulty in debugging distributed systems lies in manually
determining which of the many available debugging tools to use and how to query
its logs. Our own study of a production debugging workflow confirms the
magnitude of this burden. This paper explores whether a machine-learning model
can assist developers in distributed systems debugging. We present Revelio, a
debugging assistant which takes user reports and system logs as input, and
outputs debugging queries that developers can use to find a bug's root cause.
The key challenges lie in (1) combining inputs of different types (e.g.,
natural language reports and quantitative logs) and (2) generalizing to unseen
faults. Revelio addresses these by employing deep neural networks to uniformly
embed diverse input sources and potential queries into a high-dimensional
vector space. In addition, it exploits observations from production systems to
factorize query generation into two computationally and statistically simpler
learning tasks. To evaluate Revelio, we built a testbed with multiple
distributed applications and debugging tools. By injecting faults and training
on logs and reports from 800 Mechanical Turkers, we show that Revelio includes
the most helpful query in its predicted list of top-3 relevant queries 96% of
the time. Our developer study confirms the utility of Revelio.
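The factorization described in the abstract lends itself to a compact illustration. Below is a minimal sketch of the two-stage idea, assuming a toy hashed bag-of-words embedder in place of Revelio's learned neural encoders; the templates, report, and logs are hypothetical, and stage 2 is reduced to a trivial parameter binding.

```python
# Minimal sketch of the two-stage factorization described in the abstract.
# The hashed bag-of-words embedder is a toy stand-in for Revelio's learned
# neural encoders; templates and inputs are hypothetical.
import numpy as np

DIM = 256

def embed(text: str) -> np.ndarray:
    """Map text to a unit vector (stand-in for a learned encoder)."""
    v = np.zeros(DIM)
    for tok in text.lower().split():
        v[hash(tok) % DIM] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def top_k_templates(report: str, logs: str, templates: list[str], k: int = 3):
    """Stage 1: rank query templates against the embedded report + logs."""
    ctx = embed(report + " " + logs)
    return sorted(templates, key=lambda t: -float(ctx @ embed(t)))[:k]

def fill_template(template: str, logs: str) -> str:
    """Stage 2 (trivialized): bind the template's hole to a salient log
    token. In the paper this step is learned as well."""
    salient = max(logs.split(), key=len)  # toy choice of a parameter value
    return template.replace("{arg}", salient)

templates = [
    "show p99 latency for service {arg}",
    "grep logs of {arg} for errors",
    "show queue depth at {arg}",
]
report = "checkout page spins for ten seconds before loading"
logs = "service=checkout frontend latency p99 9800ms"
for t in top_k_templates(report, logs, templates):
    print(fill_template(t, logs))
```

Splitting prediction into template selection and parameter binding mirrors the paper's factorization of query generation into two computationally and statistically simpler learning tasks.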
Related papers
- LLPut: Investigating Large Language Models for Bug Report-Based Input Generation (arXiv, 2025-03-26)
Failure-inducing inputs play a crucial role in diagnosing and analyzing software bugs.
Prior research has leveraged various Natural Language Processing (NLP) techniques for automated input extraction.
With the advent of Large Language Models (LLMs), an important research question arises: how effectively can generative LLMs extract failure-inducing inputs from bug reports?
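That question invites a direct test. A hedged sketch of what such an extraction harness might look like, with the prompt and the `ask_llm` client both placeholders rather than LLPut's actual setup:

```python
# Hypothetical harness for LLM-based extraction of failure-inducing
# inputs from bug reports; the prompt and `ask_llm` callable are
# placeholders, not LLPut's actual prompts or models.
PROMPT = (
    "Below is a bug report. Extract the exact input (command, file, or "
    "value) that triggers the failure. Reply with the input only.\n\n"
    "Bug report:\n{report}\n"
)

def extraction_query(report: str) -> str:
    return PROMPT.format(report=report)

def evaluate(ask_llm, reports_with_truth):
    """Exact-match accuracy of extracted inputs against ground truth."""
    hits = sum(ask_llm(extraction_query(r)).strip() == truth
               for r, truth in reports_with_truth)
    return hits / len(reports_with_truth)
```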
- A Systematic Survey on Debugging Techniques for Machine Learning Systems (arXiv, 2025-03-05)
Machine learning (ML) software poses unique challenges compared to traditional software.
Various methods have been proposed for testing, diagnosing, and repairing ML systems.
However, the big picture informing important research directions that fulfill developers' needs has yet to unfold.
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution (arXiv, 2024-07-02)
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate whether ML models are useful when developing such large-scale systems-level software, we introduce kGym and kBench.
- RAGLog: Log Anomaly Detection using Retrieval Augmented Generation (arXiv, 2023-11-09)
We explore the use of a Retrieval Augmented Large Language Model that leverages a vector database to detect anomalies from logs.
To the best of our knowledge, RAGLog is a novel experiment, and its results show considerable promise.
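The mechanism the summary describes is retrieve-then-judge. A minimal sketch under that reading, using token-overlap similarity in place of a real embedding model and vector database; none of this is RAGLog's actual stack:

```python
# Sketch of a retrieve-then-judge loop for log anomaly detection. Jaccard
# token overlap stands in for a real embedder + vector database, and the
# returned prompt would go to an LLM; not RAGLog's actual implementation.
def similarity(a: str, b: str) -> float:
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)

def retrieve(line: str, normal_lines: list[str], k: int = 3) -> list[str]:
    return sorted(normal_lines, key=lambda n: -similarity(line, n))[:k]

def judge_prompt(line: str, neighbors: list[str]) -> str:
    return ("Reference log lines from normal operation:\n"
            + "\n".join(neighbors)
            + f"\n\nIs the following line anomalous? {line}\n"
            + "Answer yes or no, with a one-line reason.")

normal = ["disk check ok", "user login success", "heartbeat ok from node-3"]
line = "disk failure detected on /dev/sda1"
print(judge_prompt(line, retrieve(line, normal)))
```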
- Teaching Large Language Models to Self-Debug (arXiv, 2023-04-11)
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
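The summary implies a generate-run-refine loop. A minimal sketch of such a loop, assuming a placeholder `llm` callable and a unit-test callback; the paper's prompting and feedback formats are richer than this:

```python
# Generate-run-refine loop in the spirit of Self-Debugging. `llm` is any
# prompt -> code callable (a placeholder, not the paper's model);
# `test` should raise an exception on failure.
def self_debug(llm, task: str, test, max_rounds: int = 3):
    feedback = ""
    for _ in range(max_rounds):
        code = llm(f"Task: {task}\n{feedback}\nWrite a Python function solve().")
        try:
            scope = {}
            exec(code, scope)        # run the candidate program
            test(scope["solve"])     # execution feedback: raises on failure
            return code              # passed: stop refining
        except Exception as e:
            feedback = f"Your previous program failed with: {e!r}. Fix it."
    return None                      # out of attempts
```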
- Leveraging Log Instructions in Log-based Anomaly Detection (arXiv, 2022-07-07)
We propose a method for reliable and practical anomaly detection from system logs.
It overcomes a common limitation of related work by building an anomaly detection model with log instructions from the source code of 1000+ GitHub projects.
The proposed method, named ADLILog, combines the log instructions and the data from the system of interest (target system) to learn a deep neural network model.
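One way to read "log instructions from the source code" is as weak supervision. A hedged sketch of that reading, where an instruction's severity level stands in for a normal/abnormal label; the labeling rule and fields are assumptions, not ADLILog's exact scheme:

```python
# Weak labeling of mined log instructions by severity level. The rule
# and record fields are assumptions for illustration, not ADLILog's
# exact scheme; the labeled templates would then feed a neural model.
ABNORMAL = {"error", "fatal", "critical"}

def weak_label(instruction: dict) -> int:
    """1 = abnormal, 0 = normal, based on the instruction's log level."""
    return int(instruction["level"].lower() in ABNORMAL)

mined = [
    {"level": "INFO",  "template": "connection established to {}"},
    {"level": "ERROR", "template": "failed to replicate block {}"},
]
dataset = [(m["template"], weak_label(m)) for m in mined]
print(dataset)  # [('connection established to {}', 0), ('failed to ...', 1)]
```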
- DapStep: Deep Assignee Prediction for Stack Trace Error rePresentation (arXiv, 2022-01-14)
We propose new deep learning models to solve the bug triage problem.
The models are based on a bidirectional recurrent neural network with attention and on a convolutional neural network.
To improve the quality of ranking, we propose using additional information from version control system annotations.
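The version-control signal is easy to illustrate without the neural models. A toy sketch that ranks candidate assignees by how often blame annotations attribute the crash's stack frames to them; DapStep replaces this hand-written scoring with learned encoders:

```python
# Toy bug-triage ranking from version-control annotations: score each
# developer by how many of the crash's stack frames they last touched.
# A stand-in for the paper's neural ranking models, not its method.
from collections import Counter

def rank_assignees(stack_frames: list[str], blame: dict[str, str]) -> list[str]:
    """`blame` maps a source file to the developer who last touched it."""
    votes = Counter(blame[f] for f in stack_frames if f in blame)
    return [dev for dev, _ in votes.most_common()]

blame = {"parser.py": "alice", "lexer.py": "bob", "ast.py": "alice"}
print(rank_assignees(["parser.py", "ast.py", "io.py"], blame))  # ['alice']
```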
- Discovering and Validating AI Errors With Crowdsourced Failure Reports (arXiv, 2021-09-23)
We introduce crowdsourced failure reports, end-user descriptions of how or why a model failed, and show how developers can use them to detect AI errors.
We also design and implement Deblinder, a visual analytics system for synthesizing failure reports.
In semi-structured interviews and think-aloud studies with 10 AI practitioners, we explore the affordances of the Deblinder system and the applicability of failure reports in real-world settings.
- S3M: Siamese Stack (Trace) Similarity Measure (arXiv, 2021-03-18)
We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It is based on a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
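A hedged PyTorch sketch of a Siamese biLSTM pair scorer along the lines the summary describes; the layer sizes, mean pooling, and feature combination are assumptions, not S3M's exact architecture:

```python
# Siamese biLSTM pair scorer for stack traces, with assumed sizes and
# pooling; a sketch in the spirit of the summary, not S3M itself.
import torch
import torch.nn as nn

class SiameseStackEncoder(nn.Module):
    def __init__(self, vocab: int = 10_000, dim: int = 64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.lstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.clf = nn.Sequential(nn.Linear(4 * dim, dim), nn.ReLU(),
                                 nn.Linear(dim, 1))

    def encode(self, frames: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(self.emb(frames))  # frames: (batch, seq) ids
        return out.mean(dim=1)                # (batch, 2 * dim)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # similarity logit for a pair of traces
        return self.clf(torch.cat([self.encode(a), self.encode(b)], dim=-1))

model = SiameseStackEncoder()
a = torch.randint(0, 10_000, (2, 12))  # two traces, 12 frame ids each
print(model(a, a).shape)               # torch.Size([2, 1])
```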
- Robust and Transferable Anomaly Detection in Log Data using Pre-Trained Language Models (arXiv, 2021-02-23)
Anomalies or failures in large computer systems, such as the cloud, have an impact on a large number of users.
We propose a framework for anomaly detection in log data, a major source of troubleshooting information about system behavior.
- Self-Supervised Log Parsing (arXiv, 2020-03-17)
Large-scale software systems generate massive volumes of semi-structured log records.
Existing approaches rely on log-specifics or manual rule extraction.
We propose NuLog that utilizes a self-supervised learning model and formulates the parsing task as masked language modeling.
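The masked-modeling framing can be shown with a toy "model". Below, a positional frequency table stands in for NuLog's transformer: tokens reconstructed with high confidence form the constant template, and the rest become parameters.

```python
# Toy illustration of masked-modeling-style log parsing: mask each token
# in turn and keep those that are predicted confidently as the template.
# A frequency table stands in for NuLog's transformer; thresholds and
# tokenization are assumptions for illustration.
from collections import Counter

lines = [
    "connection from 10.0.0.1 closed",
    "connection from 10.0.0.7 closed",
    "connection from 10.0.0.9 closed",
]

# toy "MLM": how often does each token appear at each position?
position_counts = [Counter() for _ in lines[0].split()]
for line in lines:
    for i, tok in enumerate(line.split()):
        position_counts[i][tok] += 1

def template(line: str, threshold: float = 0.9) -> str:
    out = []
    for i, tok in enumerate(line.split()):
        conf = position_counts[i][tok] / sum(position_counts[i].values())
        out.append(tok if conf >= threshold else "<*>")
    return " ".join(out)

print(template(lines[0]))  # connection from <*> closed
```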
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.