ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases
- URL: http://arxiv.org/abs/2003.08011v1
- Date: Wed, 18 Mar 2020 01:51:42 GMT
- Title: ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases
- Authors: Guang Chao Wang, Kenny Gross, and Akshay Subramaniam
- Abstract summary: OracleLabs has developed an automated framework that uses nested-loop Monte Carlo simulation to autonomously scale customer ML use cases of any size.
OracleLabs and NVIDIA authors have collaborated on an ML benchmark study which analyzes the compute cost and GPU acceleration of any ML prognostic algorithm.
- Score: 0.2752817022620644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deploying big-data Machine Learning (ML) services in a cloud environment
presents a challenge to the cloud vendor with respect to the cloud container
configuration sizing for any given customer use case. OracleLabs has developed
an automated framework that uses nested-loop Monte Carlo simulation to
autonomously scale any size customer ML use cases across the range of cloud
CPU-GPU "Shapes" (configurations of CPUs and/or GPUs in Cloud containers
available to end customers). Moreover, the OracleLabs and NVIDIA authors have
collaborated on an ML benchmark study which analyzes the compute cost and GPU
acceleration of any ML prognostic algorithm and assesses the reduction of
compute cost in a cloud container comprising conventional CPUs and NVIDIA GPUs.
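
A minimal sketch of what a nested-loop Monte Carlo scoping sweep could look like, assuming a stand-in workload rather than the authors' actual prognostic algorithms. The names `standin_workload` and `scope_compute_cost`, the swept parameters (number of signals and observations), and the pairwise dot-product workload are all hypothetical and are not taken from the paper; the sketch only illustrates the general idea of sweeping use-case sizes and recording compute cost for each combination.

```python
import random
import statistics
import time


def standin_workload(n_signals, n_observations, rng):
    """Hypothetical stand-in for an ML training step (not the paper's
    algorithm): builds synthetic signals and sums all pairwise dot
    products, so cost grows roughly as n_signals^2 * n_observations."""
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_observations)]
            for _ in range(n_signals)]
    total = 0.0
    for i in range(n_signals):
        for j in range(i, n_signals):
            total += sum(a * b for a, b in zip(data[i], data[j]))
    return total


def scope_compute_cost(signal_counts, observation_counts, trials=3, seed=0):
    """Nested-loop sweep: for every (signals, observations) pair, run
    several randomized trials and record the median wall-clock time
    (data generation included)."""
    rng = random.Random(seed)
    cost = {}
    for n_signals in signal_counts:           # outer loop over signals
        for n_obs in observation_counts:      # inner loop over observations
            timings = []
            for _ in range(trials):           # Monte Carlo repetitions
                start = time.perf_counter()
                standin_workload(n_signals, n_obs, rng)
                timings.append(time.perf_counter() - start)
            cost[(n_signals, n_obs)] = statistics.median(timings)
    return cost


if __name__ == "__main__":
    surface = scope_compute_cost([4, 8, 16], [100, 200])
    for (n_signals, n_obs), seconds in sorted(surface.items()):
        print(f"signals={n_signals:3d} observations={n_obs:4d} "
              f"median={seconds:.6f}s")
```

Running the same sweep on each candidate CPU-GPU shape would yield one cost table per shape, which could then be compared to pick a container size for a given use case; how the framework itself does this comparison is not described in the abstract.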
Related papers
- CloudHeatMap: Heatmap-Based Monitoring for Large-Scale Cloud Systems [1.1199585259018456]
This paper presents CloudHeatMap, a novel heatmap-based visualization tool for near-real-time monitoring of large-scale cloud system (LCS) health.
It offers intuitive visualizations of key metrics such as call volumes, response times, and HTTP response codes, enabling operators to quickly identify performance issues.
arXiv Detail & Related papers (2024-10-28T14:57:10Z)
- PVContext: Hybrid Context Model for Point Cloud Compression [61.24130634750288]
We propose PVContext, a hybrid context model for effective octree-based point cloud compression.
PVContext comprises two components with distinct modalities: the Voxel Context, which accurately represents local geometric information using voxels, and the Point Context, which efficiently preserves global shape information from point clouds.
arXiv Detail & Related papers (2024-09-19T12:47:35Z)
- Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment [56.44025052765861]
Large language models (LLMs) have revolutionized Natural Language Processing (NLP), but their size creates computational bottlenecks.
We introduce a novel approach to create accurate, sparse foundational versions of performant LLMs.
We show a total speedup on CPUs for sparse-quantized LLaMA models of up to 8.6x.
arXiv Detail & Related papers (2024-05-06T16:03:32Z)
- PointMamba: A Simple State Space Model for Point Cloud Analysis [65.59944745840866]
We propose PointMamba, transferring the success of Mamba, a recent representative state space model (SSM), from NLP to point cloud analysis tasks.
Unlike traditional Transformers, PointMamba employs a linear complexity algorithm, presenting global modeling capacity while significantly reducing computational costs.
arXiv Detail & Related papers (2024-02-16T14:56:13Z)
- FusionAI: Decentralized Training and Deploying LLMs with Massive Consumer-Level GPUs [57.12856172329322]
We envision a decentralized system that unlocks the vast untapped potential of consumer-level GPUs.
This system faces critical challenges, including limited CPU and GPU memory, low network bandwidth, peer variability, and device heterogeneity.
arXiv Detail & Related papers (2023-09-03T13:27:56Z)
- In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks.
This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z)
- CWD: A Machine Learning based Approach to Detect Unknown Cloud Workloads [3.523208537466129]
We develop a machine learning based technique to characterize, profile and predict workloads running in the cloud environment.
We also develop techniques to analyze the performance of the model in a standalone manner.
arXiv Detail & Related papers (2022-11-28T19:41:56Z)
- Deployment of ML Models using Kubeflow on Different Cloud Providers [0.17205106391379021]
We create end-to-end Machine Learning models on Kubeflow in the form of pipelines.
We analyze various points including the ease of setup, deployment models, performance, limitations and features of the tool.
arXiv Detail & Related papers (2022-06-27T22:46:11Z)
- Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning [40.09527159285327]
We build the first end-to-end and general-purpose system, called Walle, for device-cloud collaborative machine learning (ML).
Walle consists of a deployment platform, distributing ML tasks to billion-scale devices in time; a data pipeline, efficiently preparing task input; and a compute container, providing a cross-platform and high-performance execution environment.
We evaluate Walle in practical e-commerce application scenarios to demonstrate its effectiveness, efficiency, and scalability.
arXiv Detail & Related papers (2022-05-30T03:43:35Z)
- Auto-Split: A General Framework of Collaborative Edge-Cloud AI [49.750972428032355]
This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud.
To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.
arXiv Detail & Related papers (2021-08-30T08:03:29Z)
- Machine Learning Algorithms for Active Monitoring of High Performance Computing as a Service (HPCaaS) Cloud Environments [0.0]
This paper explores the viability of engineering applications running on a cloud infrastructure configured as an HPC platform.
The engineering applications considered in this work include MCNP6, a radiation transport code developed by Los Alamos National Laboratory; OpenFOAM, an open source computational fluid dynamics code; and CADO-NFS, a numerical implementation of the general number field sieve algorithm used for prime number factorization.
arXiv Detail & Related papers (2020-09-26T01:29:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.