Related papers: Full-stack evaluation of Machine Learning inference workloads for RISC-V systems

Full-stack evaluation of Machine Learning inference workloads for RISC-V systems

URL: http://arxiv.org/abs/2405.15380v1
Date: Fri, 24 May 2024 09:24:46 GMT
Title: Full-stack evaluation of Machine Learning inference workloads for RISC-V systems
Authors: Debjyoti Bhattacharjee, Anmol, Tommaso Marinelli, Karan Pathak, Peter Kourzanov,
Abstract summary: This study evaluates the performance of a wide array of machine learning workloads on RISC-V architectures using gem5, an open-source architectural simulator. Leveraging an open-source compilation toolchain based on Multi-Level Intermediate Representation (MLIR), the research presents benchmarking results specifically focused on deep learning inference workloads.
Score: 0.2621434923709917
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Architectural simulators hold a vital role in RISC-V research, providing a crucial platform for workload evaluation without the need for costly physical prototypes. They serve as a dynamic environment for exploring innovative architectural concepts, enabling swift iteration and thorough analysis of performance metrics. As deep learning algorithms become increasingly pervasive, it is essential to benchmark new architectures with machine learning workloads. The diverse computational kernels used in deep learning algorithms highlight the necessity for a comprehensive compilation toolchain to map to target hardware platforms. This study evaluates the performance of a wide array of machine learning workloads on RISC-V architectures using gem5, an open-source architectural simulator. Leveraging an open-source compilation toolchain based on Multi-Level Intermediate Representation (MLIR), the research presents benchmarking results specifically focused on deep learning inference workloads. Additionally, the study sheds light on current limitations of gem5 when simulating RISC-V architectures, offering insights for future development and refinement.

Related papers

Meta knowledge assisted Evolutionary Neural Architecture Search [38.55611683982936]
This paper introduces an efficient EC-based NAS method to solve problems via an innovative meta-learning framework. An adaptive surrogate model is designed through an adaptive threshold to select the potential architectures. Experiments on CIFAR-10, CIFAR-100, and ImageNet1K datasets demonstrate that the proposed method achieves high performance comparable to that of many state-of-the-art peer methods.
arXiv Detail & Related papers (2025-04-30T11:43:07Z)
Leveraging Neural Graph Compilers in Machine Learning Research for Edge-Cloud Systems [5.241450170761232]
This work presents a comprehensive evaluation of neural network graph compilers across heterogeneous hardware platforms. Our systematic analysis reveals that graph compilers exhibit performance patterns highly dependent on both neural architecture and batch sizes. We introduce novel metrics to quantify a compiler's ability to mitigate performance friction as batch size increases.
arXiv Detail & Related papers (2025-04-28T19:02:16Z)
Small Vision-Language Models: A Survey on Compact Architectures and Techniques [0.28087862620958753]
The emergence of small vision-language models (sVLMs) marks a critical advancement in multimodal AI. This survey offers a taxonomy of architectures that highlight innovations in compact design and computational efficiency.
arXiv Detail & Related papers (2025-03-09T16:14:46Z)
Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications [17.571124565519263]
Book covers state-of-the-art advancements in machine learning and deep learning. Focuses on convolutional neural networks (CNNs), YOLO architectures, and transformer-based approaches like DETR. Book also delves into the integration of artificial intelligence (AI) techniques and large language models for enhanced object detection.
arXiv Detail & Related papers (2024-10-21T02:10:49Z)
Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives. We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis. We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
Comparison of Static Analysis Architecture Recovery Tools for Microservice Applications [43.358953895199264]
We will identify static analysis architecture recovery tools for microservice applications via a multi-vocal literature review. We will then execute them on a common dataset and compare the measured effectiveness in architecture recovery.
arXiv Detail & Related papers (2024-03-11T17:26:51Z)
Accelerating Computer Architecture Simulation through Machine Learning [0.07252027234425332]
This paper presents our approach to accelerate computer architecture simulation by leveraging machine learning techniques. Our proposed model utilizes a combination of application features and micro-architectural features to predict the performance of an application. We demonstrate the effectiveness of our approach by building and evaluating a machine learning model that offers significant speedup in architectural exploration.
arXiv Detail & Related papers (2024-02-28T23:00:57Z)
A Survey of Serverless Machine Learning Model Inference [0.0]
Generative AI, Computer Vision, and Natural Language Processing have led to an increased integration of AI models into various products. This survey aims to summarize and categorize the emerging challenges and optimization opportunities for large-scale deep learning serving systems.
arXiv Detail & Related papers (2023-11-22T18:46:05Z)
POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image and Time Series Classification [8.190723030003804]
This article presents the third version of a sequential model-based NAS algorithm targeting different hardware environments and multiple classification tasks. Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks. The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
arXiv Detail & Related papers (2022-12-13T17:14:14Z)
Remote Sensing Image Classification using Transfer Learning and Attention Based Deep Neural Network [59.86658316440461]
We propose a deep learning based framework for RSISC, which makes use of the transfer learning technique and multihead attention scheme. The proposed deep learning framework is evaluated on the benchmark NWPU-RESISC45 dataset and achieves the best classification accuracy of 94.7%.
arXiv Detail & Related papers (2022-06-20T10:05:38Z)
Retrieval-Enhanced Machine Learning [110.5237983180089]
We describe a generic retrieval-enhanced machine learning framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
arXiv Detail & Related papers (2022-05-02T21:42:45Z)
Distributed intelligence on the Edge-to-Cloud Continuum: A systematic literature review [62.997667081978825]
This review aims at providing a comprehensive vision of the main state-of-the-art libraries and frameworks for machine learning and data analytics available today. The main simulation, emulation, deployment systems, and testbeds for experimental research on the Edge-to-Cloud Continuum available today are also surveyed.
arXiv Detail & Related papers (2022-04-29T08:06:05Z)
Flashlight: Enabling Innovation in Tools for Machine Learning [50.63188263773778]
We introduce Flashlight, an open-source library built to spur innovation in machine learning tools and systems. We see Flashlight as a tool enabling research that can benefit widely used libraries downstream and bring machine learning and systems researchers closer together.
arXiv Detail & Related papers (2022-01-29T01:03:29Z)
A User's Guide to Calibrating Robotics Simulators [54.85241102329546]
This paper proposes a set of benchmarks and a framework for the study of various algorithms aimed to transfer models and policies learnt in simulation to the real world. We conduct experiments on a wide range of well known simulated environments to characterize and offer insights into the performance of different algorithms. Our analysis can be useful for practitioners working in this area and can help make informed choices about the behavior and main properties of sim-to-real algorithms.
arXiv Detail & Related papers (2020-11-17T22:24:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.