Machine Learning aided Computer Architecture Design for CNN Inferencing Systems
- URL: http://arxiv.org/abs/2308.05364v1
- Date: Thu, 10 Aug 2023 06:17:46 GMT
- Title: Machine Learning aided Computer Architecture Design for CNN Inferencing Systems
- Authors: Christopher A. Metz
- Abstract summary: We develop a technique for forecasting the power and performance of CNNs during inference, with a MAPE of 5.03% and 5.94%, respectively.
Our approach empowers computer architects to estimate power and performance in the early stages of development, reducing the necessity for numerous prototypes.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Efficient and timely calculations of Machine Learning (ML) algorithms are
essential for emerging technologies like autonomous driving, the Internet of
Things (IoT), and edge computing. One of the primary ML algorithms used in such
systems is Convolutional Neural Networks (CNNs), which demand high
computational resources. This requirement has led to the use of ML accelerators
like GPGPUs to meet design constraints. However, selecting the most suitable
accelerator involves Design Space Exploration (DSE), a process that is usually
time-consuming and requires significant manual effort. Our work presents
approaches to expedite the DSE process by identifying the most appropriate
GPGPU for CNN inferencing systems. We have developed a quick and precise
technique for forecasting the power and performance of CNNs during inference,
with a MAPE of 5.03% and 5.94%, respectively. Our approach empowers computer
architects to estimate power and performance in the early stages of
development, reducing the necessity for numerous prototypes. This saves time
and money while also improving the time-to-market period.
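The reported forecasting accuracy is given as MAPE (Mean Absolute Percentage Error). As a quick reference, here is how that metric is computed; the measurements below are illustrative values, not data from the paper:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error: mean of |a - p| / |a|, as a percentage."""
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Illustrative (made-up) power measurements in watts vs. model predictions.
measured  = [120.0, 95.0, 150.0, 80.0]
predicted = [114.0, 99.0, 145.0, 84.0]
print(round(mape(measured, predicted), 2))  # → 4.39
```

A MAPE of 5.03% thus means the power predictions deviate from the measured values by about 5% on average, which is what makes the estimates usable for early-stage accelerator selection.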
Related papers
- Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
- Performance and Energy Consumption of Parallel Machine Learning Algorithms
Machine learning models have achieved remarkable success in various real-world applications.
Model training in machine learning requires large-scale data sets and multiple iterations before it can work properly.
Parallelization of training algorithms is a common strategy to speed up the process of training.
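One common parallelization strategy is data parallelism, where each worker computes a gradient on its own data shard and the gradients are averaged before the parameter update. A minimal single-process sketch of that idea (illustrative only, not the specific setup studied in the paper):

```python
def mse_grad(w, shard):
    # Gradient of mean((w[0] - t)^2) over the shard, for a 1-parameter model.
    n = len(shard)
    return [2.0 * sum(w[0] - t for t in shard) / n]

def sgd_step_data_parallel(w, shards, grad_fn, lr=0.1):
    """One SGD step in the data-parallel pattern: per-shard gradients
    (computed concurrently in a real system) are averaged, then applied."""
    grads = [grad_fn(w, shard) for shard in shards]
    avg = [sum(g[i] for g in grads) / len(grads) for i in range(len(w))]
    return [wi - lr * gi for wi, gi in zip(w, avg)]

# Two "workers", each holding its own shard of targets.
w = sgd_step_data_parallel([0.0], [[1.0, 1.0], [3.0, 3.0]], mse_grad)
print(w)  # → [0.4]
```

In a real training system the per-shard gradients run on separate devices, so the wall-clock cost per step approaches that of a single shard plus the communication needed for the average.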
arXiv Detail & Related papers (2023-05-01T13:04:39Z)
- Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks
Large-scale machine learning models are bringing advances to a broad range of fields.
Many of these models are too large to be trained on a single machine, and must be distributed across multiple devices.
We show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate.
arXiv Detail & Related papers (2023-01-31T17:41:07Z)
- Accelerating Machine Learning Training Time for Limit Order Book Prediction
Financial firms are interested in simulation to discover whether a given algorithm involving financial machine learning will operate profitably.
For this task, hardware acceleration is expected to reduce the time the financial machine learning researcher needs to obtain results.
Our subject is a published Limit Order Book algorithm for predicting stock market direction, whose machine learning training process can be time-intensive.
In the studied configuration, hardware acceleration leads to significantly faster training, allowing more efficient and extensive model development.
arXiv Detail & Related papers (2022-06-17T22:52:56Z)
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI).
However, their superior performance comes at the considerable cost of computational complexity.
This paper provides an overview of efficient deep learning methods, systems and applications.
arXiv Detail & Related papers (2022-04-25T16:52:48Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
This paper introduces the key properties of brain-inspired models such as Deep Neural Networks (DNNs), and then analyzes techniques to produce efficient and high-performance designs.
A single inference of a DL model may require billions of multiply-and-accumulate operations, making DL extremely compute- and energy-hungry.
arXiv Detail & Related papers (2020-12-21T10:27:48Z)
- Scheduling Real-time Deep Learning Services as Imprecise Computations
The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services.
These services perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision.
We show that deep neural networks can be cast as imprecise computations, each with a mandatory part and several optional parts.
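The imprecise-computation view can be sketched as anytime inference: a mandatory stage always runs, and optional refinement stages run only while the deadline budget allows. This is a minimal sketch of the general pattern, not the paper's actual scheduling algorithm:

```python
import time

def anytime_inference(mandatory, optional_stages, deadline_s):
    """Imprecise-computation pattern: the mandatory part always completes;
    optional stages refine the result only while time remains, trading
    accuracy for timeliness under a real-time deadline."""
    start = time.monotonic()
    result = mandatory()                       # mandatory part: must run
    for stage in optional_stages:
        if time.monotonic() - start >= deadline_s:
            break                              # deadline hit: return coarse result
        result = stage(result)                 # optional part: refines result
    return result

# Toy usage: each "stage" stands in for an extra layer block or refinement pass.
out = anytime_inference(lambda: 0, [lambda x: x + 1, lambda x: x + 1],
                        deadline_s=10.0)
print(out)  # → 2 (both optional stages fit in the budget)
```

With a zero budget only the mandatory part runs, which is exactly the degraded-but-on-time behavior the imprecise-computation model is designed to guarantee.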
arXiv Detail & Related papers (2020-11-02T16:43:04Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey
Spiking Neural Networks are cognitive algorithms mimicking neuron and synapse operational principles.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.