Related papers: On the Diagnosis of Flaky Job Failures: Understanding and Prioritizing Failure Categories

URL: http://arxiv.org/abs/2501.04976v1
Date: Thu, 09 Jan 2025 05:15:55 GMT
Title: On the Diagnosis of Flaky Job Failures: Understanding and Prioritizing Failure Categories
Authors: Henri Aïdasso, Francis Bordeleau, Ali Tizghadam
Abstract summary: Flaky job failures are one of the main issues hindering Continuous Deployment (CD). This study examines 4,511 flaky job failures at TELUS to identify the different categories of flaky failures that we prioritize based on Recency, Frequency, and Monetary (RFM) measures.
Score: 2.8402080392117757
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The continuous delivery of modern software requires the execution of many automated pipeline jobs. These jobs ensure the frequent release of new software versions while detecting code problems at an early stage. For TELUS, our industrial partner in the telecommunications field, reliable job execution is crucial to minimize wasted time and streamline Continuous Deployment (CD). In this context, flaky job failures are one of the main issues hindering CD. Prior studies proposed techniques based on machine learning to automate the detection of flaky jobs. While valuable, these solutions are insufficient to address the waste associated with the diagnosis of flaky failures, which remain largely unexplored due to the wide range of underlying causes. This study examines 4,511 flaky job failures at TELUS to identify the different categories of flaky failures that we prioritize based on Recency, Frequency, and Monetary (RFM) measures. We identified 46 flaky failure categories that we analyzed using clustering and RFM measures to determine 14 priority categories for future automated diagnosis and repair research. Our findings also provide valuable insights into the evolution and impact of these categories. The identification and prioritization of flaky failure categories using RFM analysis introduce a novel approach that can be used in other contexts.
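The abstract names Recency, Frequency, and Monetary (RFM) measures but does not define them, so the following is only a minimal sketch of how such measures could be computed per failure category. Everything in it is an assumption for illustration: the category names, the dates, the use of wasted minutes as the monetary value, the reference date, and the simple sort order. It does not reproduce the paper's definitions, data, or clustering step.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records: (failure category, failure date, cost in wasted minutes).
# None of these values come from the paper.
failures = [
    ("network_timeout", date(2024, 11, 28), 12.0),
    ("network_timeout", date(2024, 12, 1), 8.0),
    ("runner_out_of_memory", date(2024, 9, 15), 30.0),
    ("network_timeout", date(2024, 12, 10), 10.0),
    ("registry_unavailable", date(2024, 12, 5), 5.0),
]


def rfm_by_category(records, reference_date):
    """Return {category: (recency_days, frequency, monetary)}.

    recency_days: days between the latest failure and reference_date
    frequency:    number of failures in the category
    monetary:     summed cost of the failures in the category
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for category, day, cost in records:
        if category not in last_seen or day > last_seen[category]:
            last_seen[category] = day
        frequency[category] += 1
        monetary[category] += cost
    return {
        c: ((reference_date - last_seen[c]).days, frequency[c], monetary[c])
        for c in last_seen
    }


rfm = rfm_by_category(failures, reference_date=date(2024, 12, 31))

# Illustrative ordering only: most recent first, then most frequent, then most costly.
ranked = sorted(rfm, key=lambda c: (rfm[c][0], -rfm[c][1], -rfm[c][2]))
for category in ranked:
    print(category, rfm[category])
```

A category that failed recently, fails often, and wastes the most time sorts first under this ordering. A real analysis would likely normalize the three measures and cluster the categories, as the abstract describes, rather than sort lexicographically.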

Related papers

Efficient Detection of Intermittent Job Failures Using Few-Shot Learning [2.8402080392117757]
We introduce a novel approach to intermittent job failure detection using few-shot learning. Our approach achieves 70-88% F1-score with only 12 shots in all projects, outperforming the state-of-the-art (SOTA) approach.
arXiv Detail & Related papers (2025-07-05T22:04:01Z)
Foundation Models for Anomaly Detection: Vision and Challenges [19.2255593926904]
Foundation models (FMs) have emerged as a powerful tool for advancing anomaly detection. This survey presents the first comprehensive review of recent advancements in FM-based anomaly detection.
arXiv Detail & Related papers (2025-02-10T05:01:08Z)
A Transfer Learning Framework for Anomaly Detection in Multivariate IoT Traffic Data [6.229535970620059]
We propose a transfer learning-based model for anomaly detection in time-series datasets. Unlike conventional methods, our approach does not require labeled data in either the source or target domains. Empirical evaluations on novel intrusion detection datasets demonstrate that our model outperforms existing techniques.
arXiv Detail & Related papers (2025-01-26T02:03:49Z)
See it, Think it, Sorted: Large Multimodal Models are Few-shot Time Series Anomaly Analyzers [23.701716999879636]
Time series anomaly detection (TSAD) is becoming increasingly vital due to the rapid growth of time series data. We introduce a pioneering framework called the Time Series Anomaly Multimodal Analyzer (TAMA) to enhance both the detection and interpretation of anomalies.
arXiv Detail & Related papers (2024-11-04T10:28:41Z)
Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation [49.53202761595912]
Continual Test-Time Adaptation involves adapting a pre-trained source model to continually changing unsupervised target domains. We analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting. We propose an uncertainty-aware buffering approach to identify and aggregate significant samples with high certainty from the unsupervised, single-pass data stream.
arXiv Detail & Related papers (2024-07-12T15:48:40Z)
MELODY: Robust Semi-Supervised Hybrid Model for Entity-Level Online Anomaly Detection with Multivariate Time Series [11.754433499581879]
A faulty code change may degrade the target service's performance and cause cascading outages in downstream services. In this paper, we study the problem of anomaly detection for deployments. We propose a novel framework, semi-supervised hybrid Model for Entity-Level Online Detection of anomalY (MELODY).
arXiv Detail & Related papers (2024-01-18T19:02:41Z)
Progressing from Anomaly Detection to Automated Log Labeling and Pioneering Root Cause Analysis [53.24804865821692]
This study introduces a taxonomy for log anomalies and explores automated data labeling to mitigate labeling challenges. The study envisions a future where root cause analysis follows anomaly detection, unraveling the underlying triggers of anomalies.
arXiv Detail & Related papers (2023-12-22T15:04:20Z)
MFL Data Preprocessing and CNN-based Oil Pipeline Defects Detection [0.0]
Application of computer vision for anomaly detection has been under attention in several industrial fields. This work focuses on the research of the Magnetic Flux Leakage data and the preprocessing techniques. In doing so, we exploited the recent convolutional neural network structures and proposed robust approaches.
arXiv Detail & Related papers (2023-09-30T10:37:12Z)
Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism. Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors. To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z)
Shortcomings of Question Answering Based Factuality Frameworks for Error Localization [51.01957350348377]
We show that question answering (QA)-based factuality metrics fail to correctly identify error spans in generated summaries. Our analysis reveals a major reason for such poor localization: questions generated by the QG module often inherit errors from non-factual summaries which are then propagated further into downstream modules. Our experiments conclusively show that there exist fundamental issues with localization using the QA framework which cannot be fixed solely by stronger QA and QG models.
arXiv Detail & Related papers (2022-10-13T05:23:38Z)
A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services. Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary. We develop A2Log, which is an unsupervised anomaly detection method consisting of two steps: Anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z)
Anomaly Detection With Conditional Variational Autoencoders [1.3541554606406663]
We exploit the deep conditional variational autoencoder (CVAE) and we define an original loss function together with a metric that targets hierarchically structured data. Our motivating application is a real world problem: monitoring the trigger system which is a basic component of many particle physics experiments at the CERN Large Hadron Collider.
arXiv Detail & Related papers (2020-10-12T08:39:37Z)
TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks [73.01104041298031]
TadGAN is an unsupervised anomaly detection approach built on Generative Adversarial Networks (GANs). To capture the temporal correlations of time series, we use LSTM Recurrent Neural Networks as base models for Generators and Critics. To demonstrate the performance and generalizability of our approach, we test several anomaly scoring techniques and report the best-suited one.
arXiv Detail & Related papers (2020-09-16T15:52:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.