Feature Engineering for Scalable Application-Level Post-Silicon
Debugging
- URL: http://arxiv.org/abs/2102.04554v1
- Date: Mon, 8 Feb 2021 22:11:59 GMT
- Title: Feature Engineering for Scalable Application-Level Post-Silicon
Debugging
- Authors: Debjit Pal, Shobha Vasudevan
- Abstract summary: We present solutions for both observability enhancement and root-cause diagnosis in post-silicon validation of Systems-on-Chip (SoCs).
We model the specification of interacting flows in typical applications for message selection.
We define the diagnosis problem as identifying buggy traces as outliers and bug-free traces as inliers (normal behavior).
- Score: 0.456877715768796
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present systematic and efficient solutions for both observability
enhancement and root-cause diagnosis in post-silicon validation of
Systems-on-Chip (SoCs) with diverse usage scenarios. We model the specification
of interacting flows in typical applications for message selection. Our method for message
selection optimizes flow specification coverage and trace buffer utilization.
We define the diagnosis problem as identifying buggy traces as outliers and
bug-free traces as inliers/normal behaviors, for which we use unsupervised
learning algorithms for outlier detection. Instead of direct application of
machine learning algorithms over trace data using the signals as raw features,
we use feature engineering to transform raw features into more sophisticated
features using domain-specific operations. The engineered features are highly
relevant to the diagnosis task and generic enough to be applied across
hardware designs. We present debugging and root-cause analysis of subtle
post-silicon bugs in the industry-scale OpenSPARC T2 SoC. We achieve a trace buffer
utilization of 98.96% with a flow specification coverage of 94.3% (average).
Our diagnosis method diagnosed up to 66.7% more bugs and took up to
847× less diagnosis time than manual debugging, with a
diagnosis precision of 0.769.
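- Illustration: The abstract frames diagnosis as outlier detection over engineered features rather than raw signals. The sketch below shows that idea in miniature and is not the authors' method: the trace layout (lists of `(cycle, message)` pairs), the two features in `engineer_features`, and the MAD-based score with a 3.5 threshold are all assumptions made for illustration, since the abstract does not name the features or the unsupervised algorithms used.

```python
from statistics import median


def engineer_features(trace):
    """Map a raw trace to numeric features.

    A trace is assumed to be a list of (cycle, message) pairs. Both
    features (message count, largest gap between messages) are
    hypothetical examples, not the paper's feature set.
    """
    cycles = [cycle for cycle, _ in trace]
    gaps = [b - a for a, b in zip(cycles, cycles[1:])]
    return [len(trace), max(gaps) if gaps else 0]


def mad_scores(values):
    """Modified z-score built on the median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: only values that differ from the median stand out.
        return [0.0 if v == med else float("inf") for v in values]
    return [0.6745 * abs(v - med) / mad for v in values]


def find_outliers(traces, threshold=3.5):
    """Return indices of traces whose score exceeds the threshold on any feature."""
    features = [engineer_features(t) for t in traces]
    flagged = set()
    for column in zip(*features):
        for index, score in enumerate(mad_scores(list(column))):
            if score > threshold:
                flagged.add(index)
    return sorted(flagged)


# Toy traces: the last one has an unusually long wait before "data".
traces = [
    [(0, "req"), (2, "ack"), (5, "data"), (7, "done")],
    [(0, "req"), (2, "ack"), (6, "data"), (8, "done")],
    [(0, "req"), (2, "ack"), (7, "data"), (9, "done")],
    [(0, "req"), (3, "ack"), (7, "data"), (9, "done")],
    [(0, "req"), (2, "ack"), (40, "data"), (42, "done")],
]
outliers = find_outliers(traces)
print(outliers)
```

On these toy traces only the last trace is flagged, because its largest inter-message gap is far from the median of the set while the message count is identical across all traces.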
Related papers
- PULL: Reactive Log Anomaly Detection Based On Iterative PU Learning [58.85063149619348]
We propose PULL, an iterative log analysis method for reactive anomaly detection based on estimated failure time windows.
Our evaluation shows that PULL consistently outperforms ten benchmark baselines across three different datasets.
arXiv Detail & Related papers (2023-01-25T16:34:43Z)
- Deep Scattering Spectrum germaneness to Fault Detection and Diagnosis
for Component-level Prognostics and Health Management (PHM) [0.0]
This work studies the relevance of the Deep Scattering Spectrum (DSS) to fault detection and diagnosis for mechanical components of industrial robots.
We used multiple industrial robots and distinct mechanical faults to build an approach for classifying the faults.
The presented approach was implemented on the practical test benches and demonstrated satisfactory performance in fault detection and diagnosis.
arXiv Detail & Related papers (2022-10-18T13:25:02Z)
- SensorSCAN: Self-Supervised Learning and Deep Clustering for Fault
Diagnosis in Chemical Processes [2.398451252047814]
We propose SensorSCAN, a novel method for unsupervised fault detection and diagnosis.
We demonstrate our model's performance on two publicly available datasets of the Tennessee Eastman Process with various faults.
Our method is suitable for real-world applications where the number of faults is not known in advance.
arXiv Detail & Related papers (2022-08-17T10:24:37Z)
- Leveraging Log Instructions in Log-based Anomaly Detection [0.5949779668853554]
We propose a method for reliable and practical anomaly detection from system logs.
It overcomes the common disadvantage of related works by building an anomaly detection model with log instructions from the source code of 1000+ GitHub projects.
The proposed method, named ADLILog, combines the log instructions and the data from the system of interest (target system) to learn a deep neural network model.
arXiv Detail & Related papers (2022-07-07T10:22:10Z)
- Sintel: A Machine Learning Framework to Extract Insights from Signals [13.04826679898367]
We introduce Sintel, a machine learning framework for end-to-end time series tasks such as anomaly detection.
Sintel logs the entire anomaly detection journey, providing detailed documentation of anomalies over time.
It enables users to analyze signals, compare methods, and investigate anomalies through an interactive visualization tool.
arXiv Detail & Related papers (2022-04-19T19:38:27Z)
- A2Log: Attentive Augmented Log Anomaly Detection [53.06341151551106]
Anomaly detection becomes increasingly important for the dependability and serviceability of IT services.
Existing unsupervised methods need anomaly examples to obtain a suitable decision boundary.
We develop A2Log, an unsupervised anomaly detection method consisting of two steps: anomaly scoring and anomaly decision.
arXiv Detail & Related papers (2021-09-20T13:40:21Z)
- TELESTO: A Graph Neural Network Model for Anomaly Classification in
Cloud Services [77.454688257702]
Machine learning (ML) and artificial intelligence (AI) are applied to IT system operation and maintenance.
One direction aims at the recognition of re-occurring anomaly types to enable remediation automation.
We propose a method that is invariant to dimensionality changes of given data.
arXiv Detail & Related papers (2021-02-25T14:24:49Z)
- Anytime Diagnosis for Reconfiguration [52.77024349608834]
We introduce and analyze FlexDiag, an anytime direct diagnosis approach.
We evaluate the algorithm with regard to performance and diagnosis quality using a configuration benchmark from the domain of feature models and an industrial configuration knowledge base from the automotive domain.
arXiv Detail & Related papers (2021-02-19T11:45:52Z)
- Self-Attentive Classification-Based Anomaly Detection in Unstructured
Logs [59.04636530383049]
We propose Logsy, a classification-based method to learn log representations.
We show an average improvement of 0.25 in the F1 score, compared to the previous methods.
arXiv Detail & Related papers (2020-08-21T07:26:55Z)
- PyODDS: An End-to-end Outlier Detection System with Automated Machine
Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support.
Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space.
It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)
- An Intelligent and Time-Efficient DDoS Identification Framework for
Real-Time Enterprise Networks SAD-F: Spark Based Anomaly Detection Framework [0.5811502603310248]
We explore security analytics techniques for DDoS anomaly detection using different machine learning techniques.
In this paper, we propose a novel approach that takes real traffic as input to the system.
We study and compare the performance factor of our proposed framework on three different testbeds.
arXiv Detail & Related papers (2020-01-21T06:05:48Z)