Reproducible and Portable Big Data Analytics in the Cloud
- URL: http://arxiv.org/abs/2112.09762v1
- Date: Fri, 17 Dec 2021 20:52:03 GMT
- Title: Reproducible and Portable Big Data Analytics in the Cloud
- Authors: Xin Wang, Pei Guo, Xingyan Li, Jianwu Wang, Aryya Gangopadhyay, Carl E. Busart, Jade Freeman
- Abstract summary: There are two main difficulties in reproducing big data applications in the cloud.
The first is how to automate end-to-end execution of big data analytics in the cloud.
The second is that an application developed for one cloud, such as AWS or Azure, is difficult to reproduce in another cloud.
- Score: 4.948702463455218
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cloud computing has become a major approach to enable reproducible
computational experiments because of its support of on-demand hardware and
software resource provisioning. Yet there are still two main difficulties in
reproducing big data applications in the cloud. The first is how to automate
end-to-end execution of big data analytics in the cloud including virtual
distributed environment provisioning, network and security group setup, and big
data analytics pipeline description and execution. The second is that an application
developed for one cloud, such as AWS or Azure, is difficult to reproduce in
another cloud, a.k.a. the vendor lock-in problem. To tackle these problems, we
leverage serverless computing and containerization techniques for automatic
scalable big data application execution and reproducibility, and utilize the
adapter design pattern to enable application portability and reproducibility
across different clouds. Based on the approach, we propose and develop an
open-source toolkit that supports 1) on-demand distributed hardware and
software environment provisioning, 2) automatic data and configuration storage
for each execution, 3) flexible client modes based on user preferences, 4)
execution history query, and 5) simple reproducibility of existing executions
in the same environment or a different environment. We did extensive
experiments on both AWS and Azure using three big data analytics applications
that run on a virtual CPU/GPU cluster. Three main behaviors of our toolkit were
benchmarked: i) execution overhead ratio for reproducibility support, ii)
differences of reproducing the same application on AWS and Azure in terms of
execution time, budgetary cost and cost-performance ratio, iii) differences
between the scale-out and scale-up approaches for the same application on AWS and
Azure.
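The adapter design pattern mentioned in the abstract can be illustrated with a minimal sketch. This is not the toolkit's actual code: the names `ClusterSpec`, `CloudAdapter`, `AwsAdapter`, `AzureAdapter`, and `build_request`, as well as the dictionary keys, are hypothetical and chosen only for illustration. The sketch shows how one cloud-neutral description of an execution can be translated into provider-specific requests without touching the application logic, and it makes no real cloud API calls.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ClusterSpec:
    """Cloud-neutral description of the requested environment (hypothetical)."""
    node_count: int
    use_gpu: bool


class CloudAdapter(ABC):
    """Common interface that the application code depends on."""

    @abstractmethod
    def provision(self, spec: ClusterSpec) -> dict:
        """Translate the neutral spec into a provider-specific request."""


class AwsAdapter(CloudAdapter):
    def provision(self, spec: ClusterSpec) -> dict:
        # Key names are illustrative, not real AWS API fields.
        return {"provider": "aws",
                "instances": spec.node_count,
                "accelerated": spec.use_gpu}


class AzureAdapter(CloudAdapter):
    def provision(self, spec: ClusterSpec) -> dict:
        # Key names are illustrative, not real Azure API fields.
        return {"provider": "azure",
                "vm_count": spec.node_count,
                "gpu_enabled": spec.use_gpu}


# Registry mapping a cloud name to its adapter class.
ADAPTERS = {"aws": AwsAdapter, "azure": AzureAdapter}


def build_request(cloud: str, spec: ClusterSpec) -> dict:
    """Select the adapter by name; the caller never sees provider details."""
    return ADAPTERS[cloud]().provision(spec)


if __name__ == "__main__":
    spec = ClusterSpec(node_count=4, use_gpu=True)
    for cloud in ("aws", "azure"):
        print(build_request(cloud, spec))
```

Under these assumptions, reproducing an execution on a different cloud amounts to passing a different cloud name with the same stored `ClusterSpec`.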
Related papers
- SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose the first serverless workflow benchmarking suite SeBS-Flow.
SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns.
We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
- Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporates 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z)
- Green AI: A Preliminary Empirical Study on Energy Consumption in DL Models Across Different Runtime Infrastructures [56.200335252600354]
It is common practice to deploy pre-trained models on environments distinct from their native development settings.
This led to the introduction of interchange formats such as ONNX, which includes its own infrastructure and works as a standard format.
arXiv Detail & Related papers (2024-02-21T09:18:44Z)
- Prism: Revealing Hidden Functional Clusters from Massive Instances in Cloud Systems [32.18320298895805]
We propose to infer functional clusters of instances, i.e., groups of instances having similar functionalities.
We first conduct a pilot study on a large-scale cloud system, Huawei Cloud, demonstrating that instances having similar functionalities share similar communication and resource usage patterns.
Motivated by these findings, we formulate the identification of functional clusters as a clustering problem and propose a non-intrusive solution called Prism.
arXiv Detail & Related papers (2023-08-15T08:34:54Z)
- A Unified Cloud-Enabled Discrete Event Parallel and Distributed Simulation Architecture [0.7949705607963994]
We present a unified parallel and distributed M&S architecture with enough flexibility to deploy simulations in the Cloud.
Our framework is based on the Discrete Event System Specification (DEVS) formalism.
The performance of the parallel and distributed framework is tested using the xDEVS M&S tool and the DEVStone benchmark with up to eight computing nodes.
arXiv Detail & Related papers (2023-02-22T09:47:09Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices.
We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time.
Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z)
- Reproducible Performance Optimization of Complex Applications on the Edge-to-Cloud Continuum [55.6313942302582]
We propose a methodology to support the optimization of real-life applications on the Edge-to-Cloud Continuum.
Our approach relies on a rigorous analysis of possible configurations in a controlled testbed environment to understand their behaviour.
Our methodology can be generalized to other applications in the Edge-to-Cloud Continuum.
arXiv Detail & Related papers (2021-08-04T07:35:14Z)
- Dynamic Scheduling for Stochastic Edge-Cloud Computing Environments using A3C learning and Residual Recurrent Neural Networks [30.61220416710614]
Asynchronous-Advantage-Actor-Critic (A3C) learning is known to quickly adapt to dynamic scenarios with less data and Residual Recurrent Neural Network (R2N2) to quickly update model parameters.
We use the R2N2 architecture to capture a large number of host and task parameters together with temporal patterns to provide efficient scheduling decisions.
Experiments conducted on a real-world data set show a significant improvement in terms of energy consumption, response time, Service Level Agreement, and running cost by 14.4%, 7.74%, 31.9%, and 4.64%, respectively.
arXiv Detail & Related papers (2020-09-01T13:36:34Z)
- AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments [0.0]
Serverless computing has emerged as a compelling new paradigm of cloud computing models in recent years.
A common approach among both commercial and open source serverless computing platforms is workload-based auto-scaling.
In this paper we investigate the applicability of a reinforcement learning approach to request-based auto-scaling in a serverless framework.
arXiv Detail & Related papers (2020-05-29T06:18:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.