Chakra: Advancing Performance Benchmarking and Co-design using
Standardized Execution Traces
- URL: http://arxiv.org/abs/2305.14516v2
- Date: Fri, 26 May 2023 16:22:27 GMT
- Title: Chakra: Advancing Performance Benchmarking and Co-design using
Standardized Execution Traces
- Authors: Srinivas Sridharan, Taekyung Heo, Louis Feng, Zhaodong Wang, Matt
Bergeron, Wenyin Fu, Shengbao Zheng, Brian Coutinho, Saeed Rashidi, Changhai
Man, Tushar Krishna
- Abstract summary: We propose Chakra, an open graph schema for standardizing workload specification, capturing key operations and dependencies, also known as an Execution Trace (ET).
For instance, we use generative AI models to learn latent statistical properties across thousands of Chakra ETs and use these models to synthesize Chakra ETs.
Our end-goal is to build a vibrant industry-wide ecosystem of agile benchmarks and tools to drive future AI system co-design.
- Score: 5.692357167709513
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Benchmarking and co-design are essential for driving optimizations and
innovation around ML models, ML software, and next-generation hardware. Full
workload benchmarks, e.g., MLPerf, play an essential role in enabling fair
comparison across different software and hardware stacks, especially once
systems are fully designed and deployed. However, the pace of AI innovation
demands a more agile methodology for benchmark creation and usage by simulators
and emulators for future system co-design. We propose Chakra, an open graph
schema for standardizing workload specification, capturing key operations and
dependencies, also known as an Execution Trace (ET). In addition, we propose a
complementary set of tools/capabilities to enable collection, generation, and
adoption of Chakra ETs by a wide range of simulators, emulators, and
benchmarks. For instance, we use generative AI models to learn latent
statistical properties across thousands of Chakra ETs and use these models to
synthesize Chakra ETs. These synthetic ETs can obfuscate key proprietary
information and also target future what-if scenarios. As an example, we
demonstrate an end-to-end proof-of-concept that converts PyTorch ETs to Chakra
ETs and uses this to drive an open-source training system simulator
(ASTRA-sim). Our end-goal is to build a vibrant industry-wide ecosystem of
agile benchmarks and tools to drive future AI system co-design.
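The abstract describes a Chakra ET as a graph of operations with dependency edges that downstream tools such as the ASTRA-sim simulator can replay. The actual Chakra schema is defined separately as a graph specification with richer node attributes; the snippet below is only a hypothetical minimal sketch of what such a trace graph and a dependency-ordered replay might look like. The field names (`duration_us`, `deps`) and the `topological_replay` helper are illustrative assumptions, not the real schema.

```python
# Hypothetical minimal sketch of an execution-trace (ET) graph:
# nodes are operations (compute or communication) with dependency edges.
# The real Chakra schema carries far more metadata (tensor sizes, node types).
from dataclasses import dataclass, field
from collections import deque

@dataclass
class ETNode:
    node_id: int
    name: str                     # e.g. "matmul", "all_reduce"
    duration_us: float            # measured or synthesized runtime
    deps: list = field(default_factory=list)  # ids of prerequisite nodes

def topological_replay(nodes):
    """Visit nodes in dependency order, as a simulator scheduler might."""
    by_id = {n.node_id: n for n in nodes}
    indegree = {n.node_id: len(n.deps) for n in nodes}
    children = {n.node_id: [] for n in nodes}
    for n in nodes:
        for d in n.deps:
            children[d].append(n.node_id)
    ready = deque(nid for nid, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        nid = ready.popleft()
        order.append(by_id[nid].name)
        for c in children[nid]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return order

trace = [
    ETNode(0, "embedding_lookup", 12.0),
    ETNode(1, "matmul", 85.0, deps=[0]),
    ETNode(2, "all_reduce", 40.0, deps=[1]),
]
print(topological_replay(trace))  # → ['embedding_lookup', 'matmul', 'all_reduce']
```

A simulator would additionally attach costs to each node (compute time from a roofline model, communication time from a network model) while walking the graph in this order.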
Related papers
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- CARLOS: An Open, Modular, and Scalable Simulation Framework for the Development and Testing of Software for C-ITS [0.0]
We propose CARLOS - an open, modular, and scalable simulation framework for the development and testing of software in C-ITS.
We provide core building blocks for this framework and explain how it can be used and extended by the community.
In our paper, we motivate the architecture by describing important design principles and showcasing three major use cases.
arXiv Detail & Related papers (2024-04-02T10:48:36Z)
- DEAP: Design Space Exploration for DNN Accelerator Parallelism [0.0]
Large Language Models (LLMs) are becoming increasingly complex and powerful to train and serve.
This paper showcases how hardware and software co-design can come together and allow us to create customized hardware systems.
arXiv Detail & Related papers (2023-12-24T02:43:01Z)
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization that enables maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks [2.0315147707806283]
Mystique is an accurate and scalable framework for production AI benchmark generation.
Mystique is scalable due to its lightweight data collection, in terms of both runtime overhead and instrumentation effort.
We evaluate our methodology on several production AI models, and show that benchmarks generated with Mystique closely resemble original AI models.
arXiv Detail & Related papers (2022-12-16T18:46:37Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- A Model-Driven Engineering Approach to Machine Learning and Software Modeling [0.5156484100374059]
Models are used in both the Software Engineering (SE) and the Artificial Intelligence (AI) communities.
The main focus is on the Internet of Things (IoT) and smart Cyber-Physical Systems (CPS) use cases, where both ML and model-driven SE play a key role.
arXiv Detail & Related papers (2021-07-06T15:50:50Z)
- RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems [35.302081092634985]
RecSim NG is a probabilistic platform for the simulation of recommender systems.
It offers tools for inference and latent-variable model learning.
It can be used to create transparent, end-to-end models of a recommender ecosystem.
arXiv Detail & Related papers (2021-03-14T22:37:42Z)
- A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for studying various algorithms that aim to transfer models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.