Debugging Machine Learning Pipelines
- URL: http://arxiv.org/abs/2002.04640v1
- Date: Tue, 11 Feb 2020 19:13:12 GMT
- Title: Debugging Machine Learning Pipelines
- Authors: Raoni Lourenço and Juliana Freire and Dennis Shasha
- Abstract summary: Inferring the root cause of failures and unexpected behavior is challenging, usually requiring much human thought.
We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning tasks entail the use of complex computational pipelines to
reach quantitative and qualitative conclusions. If some of the activities in a
pipeline produce erroneous or uninformative outputs, the pipeline may fail or
produce incorrect results. Inferring the root cause of failures and unexpected
behavior is challenging, usually requiring much human thought, and is both
time-consuming and error-prone. We propose a new approach that makes use of
iteration and provenance to automatically infer the root causes and derive
succinct explanations of failures. Through a detailed experimental evaluation,
we assess the cost, precision, and recall of our approach compared to the state
of the art. Our source code and experimental data will be available for
reproducibility and enhancement.
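The abstract describes inferring root causes from the provenance of pipeline runs without giving details. As a rough illustration of the general idea (not the paper's actual algorithm, which iteratively re-executes pipeline variants), a minimal sketch might record each run's parameter choices and outcome, then look for minimal sets of choices present in every failing run but absent from all passing runs. The data layout and function name here are hypothetical:

```python
from itertools import combinations

def infer_root_causes(runs, max_size=2):
    """Find minimal sets of (parameter, value) pairs that occur in every
    failing run and in no passing run -- a simple provenance-based
    root-cause heuristic over recorded pipeline runs."""
    failing = [frozenset(r["config"].items()) for r in runs if not r["ok"]]
    passing = [frozenset(r["config"].items()) for r in runs if r["ok"]]
    # Only properties shared by all failing runs can explain every failure.
    common = frozenset.intersection(*failing) if failing else frozenset()
    causes = []
    for size in range(1, max_size + 1):
        for cand in combinations(sorted(common), size):
            cset = set(cand)
            # Skip supersets of an already-found smaller cause (keep minimal).
            if any(c <= cset for c in causes):
                continue
            # A cause must never appear in a successful run.
            if not any(cset <= p for p in passing):
                causes.append(cset)
    return causes

# Hypothetical run provenance: parameter choices plus a pass/fail outcome.
runs = [
    {"config": {"imputer": "mean", "model": "svm"}, "ok": True},
    {"config": {"imputer": "drop", "model": "svm"}, "ok": False},
    {"config": {"imputer": "drop", "model": "tree"}, "ok": False},
]
print(infer_root_causes(runs))  # implicates the "drop" imputer choice
```

The exhaustive candidate enumeration is exponential in `max_size`; the paper's contribution is precisely avoiding such brute force via iteration and provenance, so this sketch only conveys the problem shape.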
Related papers
- Generalization Error in Quantum Machine Learning in the Presence of Sampling Noise (2024-10-18)
Eigentask Learning is a framework for learning with infinite input training data in the presence of output sampling noise.
We calculate the training and generalization errors of a generic quantum machine learning system when the input training dataset and output measurement sampling shots are both finite.
- Quantum Internet: Resource Estimation for Entanglement Routing (2024-10-14)
We consider the problem of estimating the physical resources required for routing entanglement in a quantum network.
We propose a novel way of accounting for experimental errors in the purification process.
We show that the approximation works reasonably well over a wide range of errors.
- When in Doubt, Cascade: Towards Building Efficient and Capable Guardrails (2024-07-08)
We develop a synthetic pipeline to generate targeted and labeled data.
We show that our method achieves competitive performance at a fraction of the compute cost.
- Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE (2024-06-20)
We propose QuEE, a more general dynamic network that combines both quantization and early exiting.
Our algorithm can be seen as a form of soft early exiting or input-dependent compression.
The crucial factor in our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
- DeepFunction: Deep Metric Learning-based Imbalanced Classification for Diagnosing Threaded Pipe Connection Defects using Functional Data (2024-04-04)
In modern manufacturing, most products are conforming; a few are nonconforming, with different defect types.
Identifying defect types helps further root cause diagnosis of production lines.
We propose an innovative classification framework based on deep metric learning using functional data (DeepFunction).
- Multi-modal Causal Structure Learning and Root Cause Analysis (2024-02-04)
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
- R-Tuning: Instructing Large Language Models to Say `I Don't Know' (2023-11-16)
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face challenges.
Previous instruction tuning methods force the model to complete a sentence whether or not it knows the answer.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning).
Experimental results demonstrate that R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown ones.
- Doubly Robust Proximal Causal Learning for Continuous Treatments (2023-09-22)
We propose a kernel-based doubly robust causal learning estimator for continuous treatments.
We show that its oracle form is a consistent approximation of the influence function.
We then provide a comprehensive convergence analysis in terms of the mean square error.
- Task-specific experimental design for treatment effect estimation (2023-06-08)
Randomised controlled trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
- Deep Learning based pipeline for anomaly detection and quality enhancement in industrial binder jetting processes (2022-09-21)
Anomaly detection describes methods of finding abnormal states, instances, or data points that differ from a normal value space.
This paper contributes to a data-centric way of approaching artificial intelligence in industrial production.
- A Survey on Extraction of Causal Relations from Natural Language Text (2021-01-16)
Cause-effect relations appear frequently in text, and curating them helps in building causal networks for predictive tasks.
Existing causality extraction techniques include knowledge-based, statistical machine learning (ML)-based, and deep learning-based approaches.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.