An Analysis of Distributed Systems Syllabi With a Focus on
Performance-Related Topics
- URL: http://arxiv.org/abs/2103.01858v1
- Date: Tue, 2 Mar 2021 16:49:09 GMT
- Title: An Analysis of Distributed Systems Syllabi With a Focus on
Performance-Related Topics
- Authors: Cristina L. Abad and Alexandru Iosup and Edwin F. Boza and Eduardo
Ortiz-Holguin
- Abstract summary: We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi from top Computer Science programs.
We study the scale of the infrastructure mentioned in DS courses, from small client-server systems to cloud-scale, peer-to-peer, global-scale systems.
- Score: 65.86247008403002
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyze a dataset of 51 current (2019-2020) Distributed Systems syllabi
from top Computer Science programs, focusing on finding the prevalence and
context in which topics related to performance are being taught in these
courses. We also study the scale of the infrastructure mentioned in DS courses,
from small client-server systems to cloud-scale, peer-to-peer, global-scale
systems. We make eight main findings, covering goals such as performance, and
scalability and its variant elasticity; activities such as performance
benchmarking and monitoring; eight selected performance-enhancing techniques
(replication, caching, sharding, load balancing, scheduling, streaming,
migrating, and offloading); and control issues such as trade-offs that include
performance and performance variability.
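Of the eight performance-enhancing techniques the survey tracks, sharding lends itself to a compact illustration. The sketch below is a minimal, hypothetical Python implementation of consistent hashing, one common way to shard keys across nodes so that adding or removing a node remaps only a small fraction of keys; it is illustrative only, and all class and node names are assumptions, not taken from the paper.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable 32-bit hash of a string key (md5 keeps results deterministic).
    return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

class ConsistentHashRing:
    """Maps keys to nodes so that removing a node only remaps
    the keys that node owned -- the core idea behind sharding
    with consistent hashing."""

    def __init__(self, nodes=(), replicas: int = 100):
        self.replicas = replicas   # virtual nodes per physical node
        self._ring = []            # sorted hashes of all virtual nodes
        self._owner = {}           # virtual-node hash -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Place `replicas` virtual nodes on the ring for this node.
        for i in range(self.replicas):
            h = _hash(f"{node}#{i}")
            bisect.insort(self._ring, h)
            self._owner[h] = node

    def remove_node(self, node: str) -> None:
        # Delete this node's virtual nodes; other keys keep their owners.
        for i in range(self.replicas):
            h = _hash(f"{node}#{i}")
            self._ring.remove(h)
            del self._owner[h]

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash.
        h = _hash(key)
        idx = bisect.bisect(self._ring, h) % len(self._ring)
        return self._owner[self._ring[idx]]
```

When a node is removed, only keys it owned move to a neighbor; the rest stay put, which is what makes this scheme attractive for elastic, cloud-scale systems.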
Related papers
- Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference [2.2231908139555734]
We propose a general performance modeling methodology and workload analysis of distributed LLM training and inference.
We validate our performance predictions with published data from the literature and relevant industry vendors (e.g., NVIDIA).
arXiv Detail & Related papers (2024-07-19T19:49:05Z)
- Rethinking Resource Management in Edge Learning: A Joint Pre-training and Fine-tuning Design Paradigm [87.47506806135746]
In some applications, edge learning is experiencing a shift in focus from conventional learning from scratch to a new two-stage learning paradigm.
This paper considers the problem of joint communication and computation resource management in a two-stage edge learning system.
It is shown that the proposed joint resource management over the pre-training and fine-tuning stages well balances the system performance trade-off.
arXiv Detail & Related papers (2024-04-01T00:21:11Z)
- MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation [8.727456619750983]
The strategic integration of a visual prior into the training dataset emerges as a potential solution to enhance congruity with the testing data distribution.
Our empirical evaluations underscore the efficacy of MISS, demonstrating commendable performance in scenarios characterized by limited data availability and memory constraints.
arXiv Detail & Related papers (2024-03-18T08:52:23Z)
- Deep Configuration Performance Learning: A Systematic Survey and Taxonomy [3.077531983369872]
We conduct a comprehensive review on the topic of deep learning for performance learning of software, covering 1,206 searched papers spanning six indexing services.
Our results outline key statistics, taxonomy, strengths, weaknesses, and optimal usage scenarios for techniques related to the preparation of configuration data.
We also identify the good practices and potentially problematic phenomena from the studies surveyed, together with a comprehensive summary of actionable suggestions and insights into future opportunities within the field.
arXiv Detail & Related papers (2024-03-05T21:05:16Z)
- On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training [72.8087629914444]
We study the impact of the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset.
With the size of the pre-training dataset fixed, the best downstream performance comes with a balance on the intra-/inter-class diversity.
arXiv Detail & Related papers (2023-05-20T16:23:50Z)
- Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today.
The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Existing approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- MLPerf™ HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems [32.621917787044396]
We introduce MLPerf™ HPC, a benchmark suite of scientific machine learning training applications driven by the MLCommons™ Association.
We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance.
We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behavior.
arXiv Detail & Related papers (2021-10-21T20:30:12Z) - An Extensible Benchmark Suite for Learning to Simulate Physical Systems [60.249111272844374]
We introduce a set of benchmark problems to take a step towards unified benchmarks and evaluation protocols.
We propose four representative physical systems, as well as a collection of both widely used classical time-based and representative data-driven methods.
arXiv Detail & Related papers (2021-08-09T17:39:09Z) - Machine Learning-based Orchestration of Containers: A Taxonomy and
Future Directions [25.763692543206773]
Existing mainstream cloud service providers have prevalently adopted container technologies in their distributed system infrastructures for automated application management.
To handle the automation of deployment, maintenance, autoscaling, and networking of containerized applications, container orchestration has emerged as an essential research problem.
In this paper, we present a comprehensive review of existing machine learning-based container orchestration approaches.
arXiv Detail & Related papers (2021-06-24T02:55:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.