Related papers: Feature Engineering for Scalable Application-Level Post-Silicon Debugging

URL: http://arxiv.org/abs/2102.04554v1
Date: Mon, 8 Feb 2021 22:11:59 GMT
Title: Feature Engineering for Scalable Application-Level Post-Silicon Debugging
Authors: Debjit Pal, Shobha Vasudevan
Abstract summary: We present solutions for both observability enhancement and root-cause diagnosis in post-silicon System-on-Chip (SoC) validation. We model the specification of interacting flows in typical applications for message selection. We define the diagnosis problem as identifying buggy traces as outliers and bug-free traces as inliers (normal behaviors).
Score: 0.456877715768796
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We present systematic and efficient solutions for both observability enhancement and root-cause diagnosis in post-silicon System-on-Chip (SoC) validation with diverse usage scenarios. We model the specification of interacting flows in typical applications for message selection. Our method for message selection optimizes flow specification coverage and trace buffer utilization. We define the diagnosis problem as identifying buggy traces as outliers and bug-free traces as inliers (normal behaviors), for which we use unsupervised learning algorithms for outlier detection. Instead of applying machine learning algorithms directly to trace data with the signals as raw features, we use feature engineering to transform raw features into more sophisticated features using domain-specific operations. The engineered features are highly relevant to the diagnosis task and generic enough to be applied across hardware designs. We present debugging and root-cause analysis of subtle post-silicon bugs in the industry-scale OpenSPARC T2 SoC. We achieve a trace buffer utilization of 98.96% with a flow specification coverage of 94.3% (average). Our diagnosis method diagnosed up to 66.7% more bugs and took up to 847× less diagnosis time than manual debugging, with a diagnosis precision of 0.769.
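Illustration: the abstract frames diagnosis as outlier detection over engineered trace features. The sketch below shows that general idea only and is not the authors' method. The message names ("req", "ack"), the two features, and the threshold are hypothetical, and a median-absolute-deviation score stands in for whichever unsupervised detectors the paper actually uses.

```python
from statistics import median

def engineer_features(trace):
    """Turn a raw trace (list of message names) into two hypothetical features:
    the trace length and the number of requests left without an acknowledgement."""
    return [len(trace), trace.count("req") - trace.count("ack")]

def robust_scores(rows):
    """Score each row by its largest per-feature deviation from the median,
    scaled by the median absolute deviation (MAD)."""
    scores = [0.0] * len(rows)
    for col in zip(*rows):
        med = median(col)
        mad = median(abs(x - med) for x in col)
        # Assumption: fall back to the raw deviation when MAD is zero,
        # which happens when most traces share the same feature value.
        scale = 1.4826 * mad if mad > 0 else 1.0
        for i, x in enumerate(col):
            scores[i] = max(scores[i], abs(x - med) / scale)
    return scores

def find_outliers(traces, threshold=3.5):
    """Return the indices of traces whose score exceeds the (assumed) threshold."""
    rows = [engineer_features(t) for t in traces]
    return [i for i, s in enumerate(robust_scores(rows)) if s > threshold]

# Four well-formed traces and one with unanswered requests.
traces = [
    ["req", "ack", "req", "ack"],
    ["req", "ack", "req", "ack"],
    ["req", "ack", "req", "ack"],
    ["req", "ack", "req", "ack"],
    ["req", "req", "req", "req", "req", "ack"],
]
print(find_outliers(traces))
```

In this toy data, only the last trace is flagged, because its count of unanswered requests deviates from the median of the other traces.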

Related papers

Explainability for Fault Detection System in Chemical Processes [0.0]
We apply and compare two state-of-the-art eXplainable Artificial Intelligence (XAI) methods that explain the fault diagnosis decisions of a highly accurate Long Short-Term Memory (LSTM) model. It is highlighted how XAI methods can help identify the subsystem of the process where the fault occurred. The proposed approach is not limited to the specific process and can also be used in similar problems.
arXiv Detail & Related papers (2026-02-18T10:26:12Z)
Predicting Intermittent Job Failure Categories for Diagnosis Using Few-Shot Fine-Tuned Language Models [1.2744523252873348]
FlaXifyer is a few-shot learning approach for predicting intermittent job failure categories using pre-trained language models. LogSift is an interpretability technique that identifies influential log statements in under one second. Evaluation on 2,458 job failures from TELUS demonstrates that FlaXifyer and LogSift enable effective automated triage, accelerate failure diagnosis, and pave the way towards the automated resolution of intermittent job failures.
arXiv Detail & Related papers (2026-01-29T19:34:34Z)
Diagnosing Violations of State-based Specifications in iCFTL [1.1059590443280727]
We propose a diagnostic approach based on backward data-flow analysis to determine the relevant statements contributing to a specification violation. We implement our approach in a prototype tool, iCFTL-Diagnostics, and evaluate it on 112 specifications across 10 software projects. Our tool achieves 90% precision in identifying relevant statements for 100 of the 112 specifications.
arXiv Detail & Related papers (2025-09-22T13:40:02Z)
Fault detection and diagnosis for the engine electrical system of a space launcher based on a temporal convolutional autoencoder and calibrated classifiers [0.0]
This paper outlines a first step toward developing an onboard fault detection and diagnostic capability for the next generation of reusable space launchers. Unlike existing approaches in the literature, our solution is designed to meet a broader range of key requirements. The proposed solution is based on a temporal convolutional autoencoder to automatically extract low-dimensional features from raw sensor data.
arXiv Detail & Related papers (2025-07-17T11:50:29Z)
A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy. We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods. By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z)
PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows. Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
Deep Scattering Spectrum germaneness to Fault Detection and Diagnosis for Component-level Prognostics and Health Management (PHM) [0.0]
This work studies the relevance of the Deep Scattering Spectrum (DSS) to fault detection and diagnosis for mechanical components of industrial robots. We used multiple industrial robots and distinct mechanical faults to build an approach for classifying the faults. The presented approach was implemented on practical test benches and demonstrated satisfactory performance in fault detection and diagnosis.
arXiv Detail & Related papers (2022-10-18T13:25:02Z)
SensorSCAN: Self-Supervised Learning and Deep Clustering for Fault Diagnosis in Chemical Processes [2.398451252047814]
We propose SensorSCAN, a novel method for unsupervised fault detection and diagnosis. We demonstrate our model's performance on two publicly available datasets of the Tennessee Eastman Process with various faults. Our method is suitable for real-world applications where the number of faults is not known in advance.
arXiv Detail & Related papers (2022-08-17T10:24:37Z)
Leveraging Log Instructions in Log-based Anomaly Detection [0.5949779668853554]
We propose a method for reliable and practical anomaly detection from system logs. It overcomes the common disadvantage of related works by building an anomaly detection model with log instructions from the source code of 1000+ GitHub projects. The proposed method, named ADLILog, combines the log instructions and the data from the system of interest (target system) to learn a deep neural network model.
arXiv Detail & Related papers (2022-07-07T10:22:10Z)
Sintel: A Machine Learning Framework to Extract Insights from Signals [13.04826679898367]
We introduce Sintel, a machine learning framework for end-to-end time series tasks such as anomaly detection. Sintel logs the entire anomaly detection journey, providing detailed documentation of anomalies over time. It enables users to analyze signals, compare methods, and investigate anomalies through an interactive visualization tool.
arXiv Detail & Related papers (2022-04-19T19:38:27Z)
A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services. Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary. We develop A2Log, which is an unsupervised anomaly detection method consisting of two steps: Anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z)
TELESTO: A Graph Neural Network Model for Anomaly Classification in Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance. One direction aims at the recognition of recurring anomaly types to enable remediation automation. We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
Anytime Diagnosis for Reconfiguration [52.77024349608834]
We introduce and analyze FlexDiag which is an anytime direct diagnosis approach. We evaluate the algorithm with regard to performance and diagnosis quality using a configuration benchmark from the domain of feature models and an industrial configuration knowledge base from the automotive domain.
arXiv Detail & Related papers (2021-02-19T11:45:52Z)
Self-Attentive Classification-Based Anomaly Detection in Unstructured Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations. We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support. Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space. It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)
An Intelligent and Time-Efficient DDoS Identification Framework for Real-Time Enterprise Networks SAD-F: Spark Based Anomaly Detection Framework [0.5811502603310248]
We explore security analytics techniques for DDoS anomaly detection using different machine learning techniques. In this paper, we propose a novel approach that takes real traffic as input to the system. We study and compare the performance of our proposed framework on three different testbeds.
arXiv Detail & Related papers (2020-01-21T06:05:48Z)
