Unicorn: Reasoning about Configurable System Performance through the
lens of Causality
- URL: http://arxiv.org/abs/2201.08413v1
- Date: Thu, 20 Jan 2022 19:16:50 GMT
- Title: Unicorn: Reasoning about Configurable System Performance through the
lens of Causality
- Authors: Md Shahriar Iqbal, Rahul Krishna, Mohammad Ali Javidian, Baishakhi
Ray, Pooyan Jamshidi
- Abstract summary: We propose a new method, called Unicorn, which captures intricate interactions between configuration options across the software-hardware stack.
Experiments indicate that Unicorn outperforms state-of-the-art performance optimization and debugging methods.
Unlike the existing methods, the learned causal performance models reliably predict performance for new environments.
- Score: 12.877523121932114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern computer systems are highly configurable, with the variability space
sometimes larger than the number of atoms in the universe. Due to this vast
variability space, understanding and reasoning about the performance behavior
of highly configurable systems is challenging. State-of-the-art methods for
performance modeling and analyses rely on predictive machine learning models;
therefore, they (i) become unreliable in unseen environments (e.g., different
hardware, workloads) and (ii) produce incorrect explanations. To this end, we
propose a new method, called Unicorn, which (a) captures intricate interactions
between configuration options across the software-hardware stack and (b)
describes how such interactions impact performance variations via causal
inference. We evaluated Unicorn on six highly configurable systems, including
three on-device machine learning systems, a video encoder, a database
management system, and a data analytics pipeline. The experimental results
indicate that Unicorn outperforms state-of-the-art performance optimization and
debugging methods. Furthermore, unlike the existing methods, the learned causal
performance models reliably predict performance for new environments.
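The core idea, estimating causal rather than merely correlational effects of configuration options on performance, can be illustrated with a minimal sketch. All variables and data below are synthetic and hypothetical; this is not Unicorn's implementation, only an illustration of backdoor adjustment over a tiny causal graph:

```python
# Causal graph (hypothetical): workload -> option, workload -> latency,
# option -> latency. The naive correlation of option with latency is
# confounded by workload; stratifying on workload (backdoor adjustment)
# recovers the true causal effect of the option.
import random

random.seed(0)
TRUE_EFFECT = 5.0  # latency (ms) added when the option is enabled

samples = []
for _ in range(20000):
    workload = random.choice([0, 1])  # light / heavy workload
    # Heavy workloads tend to enable the option (confounding).
    option = 1 if random.random() < (0.8 if workload else 0.2) else 0
    latency = 10.0 + 20.0 * workload + TRUE_EFFECT * option + random.gauss(0, 1)
    samples.append((workload, option, latency))

def mean(xs):
    return sum(xs) / len(xs)

# Naive (confounded) estimate: E[latency | option=1] - E[latency | option=0]
naive = mean([l for w, o, l in samples if o == 1]) - \
        mean([l for w, o, l in samples if o == 0])

# Backdoor adjustment: average per-workload effects, weighted by P(workload)
adjusted = 0.0
for w in (0, 1):
    stratum = [(o, l) for w2, o, l in samples if w2 == w]
    p_w = len(stratum) / len(samples)
    eff = mean([l for o, l in stratum if o == 1]) - \
          mean([l for o, l in stratum if o == 0])
    adjusted += p_w * eff

print(f"naive estimate:    {naive:.2f} ms")   # inflated by confounding
print(f"adjusted estimate: {adjusted:.2f} ms")  # close to the true 5.0 ms
```

The adjusted estimate lands near the true effect while the naive difference of means is badly inflated, which is the kind of incorrect explanation the abstract attributes to purely predictive models.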
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
- Third-Party Language Model Performance Prediction from Instruction [59.574169249307054]
Language model-based instruction-following systems have lately shown increasing performance on many benchmark tasks.
A user may easily prompt a model with an instruction without any idea of whether the responses should be expected to be accurate.
We propose a third party performance prediction framework, where a separate model is trained to predict the metric resulting from evaluating an instruction-following system on a task.
arXiv Detail & Related papers (2024-03-19T03:53:47Z)
- Multi-Objective Optimization of Performance and Interpretability of Tabular Supervised Machine Learning Models [0.9023847175654603]
Interpretability is quantified via three measures: feature sparsity, interaction sparsity of features, and sparsity of non-monotone feature effects.
We show that our framework is capable of finding diverse models that are highly competitive or outperform state-of-the-art XGBoost or Explainable Boosting Machine models.
arXiv Detail & Related papers (2023-07-17T00:07:52Z)
- CAMEO: A Causal Transfer Learning Approach for Performance Optimization of Configurable Computer Systems [16.75106122540052]
We propose CAMEO, a method that identifies invariant causal predictors under environmental changes.
We demonstrate significant performance improvements over state-of-the-art optimization methods in MLperf deep learning systems, a video analytics pipeline, and a database system.
arXiv Detail & Related papers (2023-06-13T16:28:37Z)
- HINNPerf: Hierarchical Interaction Neural Network for Performance Prediction of Configurable Systems [22.380061796355616]
HINNPerf is a novel hierarchical interaction neural network for performance prediction.
HINNPerf employs the embedding method and hierarchic network blocks to model the complicated interplay between configuration options.
Empirical results on 10 real-world systems show that our method statistically significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-04-08T08:52:23Z)
- An Empirical Analysis of Backward Compatibility in Machine Learning Systems [47.04803977692586]
We consider how updates, intended to improve ML models, can introduce new errors that can significantly affect downstream systems and users.
For example, updates in models used in cloud-based classification services, such as image recognition, can cause unexpected erroneous behavior.
arXiv Detail & Related papers (2020-08-11T08:10:58Z)
- Learning to Simulate Complex Physics with Graph Networks [68.43901833812448]
We present a machine learning framework and model implementation that can learn to simulate a wide variety of challenging physical domains.
Our framework, which we term "Graph Network-based Simulators" (GNS), represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message-passing.
Our results show that our model can generalize from single-timestep predictions with thousands of particles during training, to different initial conditions, thousands of timesteps, and at least an order of magnitude more particles at test time.
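The particles-as-nodes idea can be sketched in a toy 1-D example. The edge and node update functions below are hand-written placeholders standing in for the learned functions GNS uses; the particle positions, radius, and spring-like message are all hypothetical:

```python
# Toy sketch of message-passing dynamics over particles (hypothetical;
# not the GNS implementation). Particles are graph nodes, edges connect
# particles within a connectivity radius, and each node updates its
# velocity from aggregated incoming messages.
positions = [0.0, 0.1, 0.25, 1.0]   # 1-D particle positions
velocities = [0.0, 0.0, 0.0, 0.0]
RADIUS = 0.2                          # connectivity radius
DT = 0.01                             # integration timestep

# Build directed edges between particles within the connectivity radius.
edges = [(s, r) for s in range(len(positions))
                for r in range(len(positions))
         if s != r and abs(positions[s] - positions[r]) <= RADIUS]

# One message-passing step: each edge carries a message (here a simple
# unit repulsion away from the sender, a placeholder for a learned
# edge function).
messages = {}
for sender, receiver in edges:
    d = positions[receiver] - positions[sender]
    messages[(sender, receiver)] = d / (abs(d) + 1e-6)

# Each node aggregates incoming messages (placeholder for a learned
# node function) and integrates its state forward.
for i in range(len(positions)):
    incoming = [m for (s, r), m in messages.items() if r == i]
    accel = sum(incoming)
    velocities[i] += DT * accel
    positions[i] += DT * velocities[i]
```

The isolated particle at 1.0 receives no messages and stays put, while the clustered particles repel each other slightly; in GNS the analogous update functions are learned from data rather than hand-written.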
arXiv Detail & Related papers (2020-02-21T16:44:28Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
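The iteration the summary describes, a random subset of agents running local updates that the server then averages, can be sketched on a one-dimensional toy problem. The quadratic losses, agent count, and learning rate are all hypothetical choices for illustration, not the paper's setup:

```python
# Minimal sketch of federated averaging with partial participation
# (hypothetical 1-D quadratic losses; not the paper's analysis).
# Agent a has local loss (w - m_a)^2; the aggregate optimum is the
# mean of the local minimizers m_a.
import random

random.seed(1)
NUM_AGENTS = 10
SUBSET = 5        # random subset of available agents per iteration
LOCAL_STEPS = 5
LR = 0.1

local_minimizers = [random.uniform(-1, 1) for _ in range(NUM_AGENTS)]
w_global = 5.0    # initial model, far from the optimum

for _ in range(200):
    participants = random.sample(range(NUM_AGENTS), SUBSET)
    local_models = []
    for a in participants:
        w = w_global
        for _ in range(LOCAL_STEPS):   # local gradient steps on (w - m_a)^2
            w -= LR * 2 * (w - local_minimizers[a])
        local_models.append(w)
    w_global = sum(local_models) / len(local_models)  # server averaging

target = sum(local_minimizers) / NUM_AGENTS
print(f"final model {w_global:.3f}, aggregate optimum ~ {target:.3f}")
```

Because only a random subset participates each round, the iterate fluctuates around the aggregate optimum rather than settling on it exactly, which is the data-variability and model-variability trade-off the summary's tracking result formalizes.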
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.