Chakra: Advancing Performance Benchmarking and Co-design using
Standardized Execution Traces
- URL: http://arxiv.org/abs/2305.14516v2
- Date: Fri, 26 May 2023 16:22:27 GMT
- Title: Chakra: Advancing Performance Benchmarking and Co-design using
Standardized Execution Traces
- Authors: Srinivas Sridharan, Taekyung Heo, Louis Feng, Zhaodong Wang, Matt
Bergeron, Wenyin Fu, Shengbao Zheng, Brian Coutinho, Saeed Rashidi, Changhai
Man, Tushar Krishna
- Abstract summary: We propose Chakra, an open graph schema for standardizing workload specification, capturing key operations and dependencies, also known as an Execution Trace (ET).
For instance, we use generative AI models to learn latent statistical properties across thousands of Chakra ETs and use these models to synthesize Chakra ETs.
Our end-goal is to build a vibrant industry-wide ecosystem of agile benchmarks and tools to drive future AI system co-design.
- Score: 5.692357167709513
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Benchmarking and co-design are essential for driving optimizations and
innovation around ML models, ML software, and next-generation hardware. Full
workload benchmarks, e.g., MLPerf, play an essential role in enabling fair
comparison across different software and hardware stacks, especially once
systems are fully designed and deployed. However, the pace of AI innovation
demands a more agile methodology for benchmark creation and usage by simulators
and emulators for future system co-design. We propose Chakra, an open graph
schema for standardizing workload specification, capturing key operations and
dependencies, also known as an Execution Trace (ET). In addition, we propose a
complementary set of tools/capabilities to enable collection, generation, and
adoption of Chakra ETs by a wide range of simulators, emulators, and
benchmarks. For instance, we use generative AI models to learn latent
statistical properties across thousands of Chakra ETs and use these models to
synthesize Chakra ETs. These synthetic ETs can obfuscate key proprietary
information and also target future what-if scenarios. As an example, we
demonstrate an end-to-end proof-of-concept that converts PyTorch ETs to Chakra
ETs and uses this to drive an open-source training system simulator
(ASTRA-sim). Our end-goal is to build a vibrant industry-wide ecosystem of
agile benchmarks and tools to drive future AI system co-design.
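The abstract describes a Chakra ET as a graph of operations with dependency edges that downstream tools such as the ASTRA-sim simulator can replay. The actual Chakra schema is defined separately as a graph specification with richer node attributes; the snippet below is only a hypothetical minimal sketch of what such a trace graph and a dependency-ordered replay might look like. The field names (`duration_us`, `deps`) and the `topological_replay` helper are illustrative assumptions, not the real schema.

```python
# Hypothetical minimal sketch of an execution-trace (ET) graph:
# nodes are operations (compute or communication) with dependency edges.
# The real Chakra schema carries far more metadata (tensor sizes, node types).
from dataclasses import dataclass, field
from collections import deque

@dataclass
class ETNode:
    node_id: int
    name: str                     # e.g. "matmul", "all_reduce"
    duration_us: float            # measured or synthesized runtime
    deps: list = field(default_factory=list)  # ids of prerequisite nodes

def topological_replay(nodes):
    """Visit nodes in dependency order, as a simulator scheduler might."""
    by_id = {n.node_id: n for n in nodes}
    indegree = {n.node_id: len(n.deps) for n in nodes}
    children = {n.node_id: [] for n in nodes}
    for n in nodes:
        for d in n.deps:
            children[d].append(n.node_id)
    ready = deque(nid for nid, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        nid = ready.popleft()
        order.append(by_id[nid].name)
        for c in children[nid]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    return order

trace = [
    ETNode(0, "embedding_lookup", 12.0),
    ETNode(1, "matmul", 85.0, deps=[0]),
    ETNode(2, "all_reduce", 40.0, deps=[1]),
]
print(topological_replay(trace))  # → ['embedding_lookup', 'matmul', 'all_reduce']
```

A simulator would additionally attach costs to each node (compute time from a roofline model, communication time from a network model) while walking the graph in this order.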
Related papers
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- CARLOS: An Open, Modular, and Scalable Simulation Framework for the Development and Testing of Software for C-ITS [0.0]
We propose CARLOS - an open, modular, and scalable simulation framework for the development and testing of software in C-ITS.
We provide core building blocks for this framework and explain how it can be used and extended by the community.
In our paper, we motivate the architecture by describing important design principles and showcasing three major use cases.
arXiv Detail & Related papers (2024-04-02T10:48:36Z)
- DEAP: Design Space Exploration for DNN Accelerator Parallelism [0.0]
Large Language Models (LLMs) are becoming increasingly complex and powerful to train and serve.
This paper showcases how hardware and software co-design can come together and allow us to create customized hardware systems.
arXiv Detail & Related papers (2023-12-24T02:43:01Z)
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization that enables maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Mystique: Enabling Accurate and Scalable Generation of Production AI Benchmarks [2.0315147707806283]
Mystique is an accurate and scalable framework for production AI benchmark generation.
Mystique is scalable due to its lightweight data collection, in terms of both runtime overhead and instrumentation effort.
We evaluate our methodology on several production AI models, and show that benchmarks generated with Mystique closely resemble original AI models.
arXiv Detail & Related papers (2022-12-16T18:46:37Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation [98.51313127382937]
We focus on the use of labels in the synthetic domain alone.
Our approach introduces both a way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
We showcase our approach on the bird's-eye-view vehicle segmentation task with multi-sensor data.
arXiv Detail & Related papers (2021-11-15T18:37:43Z)
- A Model-Driven Engineering Approach to Machine Learning and Software Modeling [0.5156484100374059]
Models are used in both the Software Engineering (SE) and the Artificial Intelligence (AI) communities.
The main focus is on the Internet of Things (IoT) and smart Cyber-Physical Systems (CPS) use cases, where both ML and model-driven SE play a key role.
arXiv Detail & Related papers (2021-07-06T15:50:50Z)
- RecSim NG: Toward Principled Uncertainty Modeling for Recommender Ecosystems [35.302081092634985]
RecSim NG is a probabilistic platform for the simulation of recommender systems.
It offers tools for inference and latent-variable model learning.
It can be used to create transparent, end-to-end models of a recommender ecosystem.
arXiv Detail & Related papers (2021-03-14T22:37:42Z)
- A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for studying various algorithms that aim to transfer models and policies learnt in simulation to the real world.
We conduct experiments on a wide range of well known simulated environments to characterize and offer insights into the performance of different algorithms.
Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.