Machine Learning aided Computer Architecture Design for CNN Inferencing Systems
- URL: http://arxiv.org/abs/2308.05364v1
- Date: Thu, 10 Aug 2023 06:17:46 GMT
- Title: Machine Learning aided Computer Architecture Design for CNN Inferencing Systems
- Authors: Christopher A. Metz
- Abstract summary: We develop a technique for forecasting the power and performance of CNNs during inference, with a MAPE of 5.03% and 5.94%, respectively.
Our approach empowers computer architects to estimate power and performance in the early stages of development, reducing the necessity for numerous prototypes.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Efficient and timely calculations of Machine Learning (ML) algorithms are
essential for emerging technologies like autonomous driving, the Internet of
Things (IoT), and edge computing. One of the primary ML algorithms used in such
systems is Convolutional Neural Networks (CNNs), which demand high
computational resources. This requirement has led to the use of ML accelerators
like GPGPUs to meet design constraints. However, selecting the most suitable
accelerator involves Design Space Exploration (DSE), a process that is usually
time-consuming and requires significant manual effort. Our work presents
approaches to expedite the DSE process by identifying the most appropriate
GPGPU for CNN inferencing systems. We have developed a quick and precise
technique for forecasting the power and performance of CNNs during inference,
with a MAPE of 5.03% and 5.94%, respectively. Our approach empowers computer
architects to estimate power and performance in the early stages of
development, reducing the necessity for numerous prototypes. This saves time
and money while also improving the time-to-market period.
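The reported forecasting accuracy is given as MAPE (Mean Absolute Percentage Error). As a quick reference, here is how that metric is computed; the measurements below are illustrative values, not data from the paper:

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error: mean of |a - p| / |a|, as a percentage."""
    assert len(actual) == len(predicted) and all(a != 0 for a in actual)
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

# Illustrative (made-up) power measurements in watts vs. model predictions.
measured  = [120.0, 95.0, 150.0, 80.0]
predicted = [114.0, 99.0, 145.0, 84.0]
print(round(mape(measured, predicted), 2))  # → 4.39
```

A MAPE of 5.03% thus means the power predictions deviate from the measured values by about 5% on average, which is what makes the estimates usable for early-stage accelerator selection.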
Related papers
- Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
- Performance and Energy Consumption of Parallel Machine Learning Algorithms
Machine learning models have achieved remarkable success in various real-world applications.
Model training in machine learning requires large-scale data sets and multiple iterations before it can work properly.
Parallelization of training algorithms is a common strategy to speed up the process of training.
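One common parallelization strategy is data parallelism, where each worker computes a gradient on its own data shard and the gradients are averaged before the parameter update. A minimal single-process sketch of that idea (illustrative only, not the specific setup studied in the paper):

```python
def mse_grad(w, shard):
    # Gradient of mean((w[0] - t)^2) over the shard, for a 1-parameter model.
    n = len(shard)
    return [2.0 * sum(w[0] - t for t in shard) / n]

def sgd_step_data_parallel(w, shards, grad_fn, lr=0.1):
    """One SGD step in the data-parallel pattern: per-shard gradients
    (computed concurrently in a real system) are averaged, then applied."""
    grads = [grad_fn(w, shard) for shard in shards]
    avg = [sum(g[i] for g in grads) / len(grads) for i in range(len(w))]
    return [wi - lr * gi for wi, gi in zip(w, avg)]

# Two "workers", each holding its own shard of targets.
w = sgd_step_data_parallel([0.0], [[1.0, 1.0], [3.0, 3.0]], mse_grad)
print(w)  # → [0.4]
```

In a real training system the per-shard gradients run on separate devices, so the wall-clock cost per step approaches that of a single shard plus the communication needed for the average.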
arXiv Detail & Related papers (2023-05-01T13:04:39Z)
- Partitioning Distributed Compute Jobs with Reinforcement Learning and Graph Neural Networks
Large-scale machine learning models are bringing advances to a broad range of fields.
Many of these models are too large to be trained on a single machine, and must be distributed across multiple devices.
We show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate.
arXiv Detail & Related papers (2023-01-31T17:41:07Z)
- Accelerating Machine Learning Training Time for Limit Order Book Prediction
Financial firms are interested in simulation to discover whether a given algorithm involving financial machine learning will operate profitably.
For this task, hardware acceleration is expected to reduce the time the financial machine learning researcher needs to obtain results.
Our subject is a published Limit Order Book algorithm for predicting stock market direction, whose machine learning training process can be time-intensive.
In the studied configuration, hardware acceleration leads to significantly faster training, allowing more efficient and extensive model development.
arXiv Detail & Related papers (2022-06-17T22:52:56Z)
- Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI).
However, their superior performance comes at the considerable cost of computational complexity.
This paper provides an overview of efficient deep learning methods, systems and applications.
arXiv Detail & Related papers (2022-04-25T16:52:48Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
This paper introduces the key properties of brain-inspired models such as Deep Neural Networks (DNNs), and then analyzes techniques to produce efficient and high-performance designs.
A single inference of a DL model may require billions of multiply-and-accumulate operations, making DL extremely compute- and energy-hungry.
arXiv Detail & Related papers (2020-12-21T10:27:48Z)
- Scheduling Real-time Deep Learning Services as Imprecise Computations
The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services.
These services perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision.
We show that deep neural networks can be cast as imprecise computations, each with a mandatory part and several optional parts.
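The imprecise-computation view can be sketched as anytime inference: a mandatory stage always runs, and optional refinement stages run only while the deadline budget allows. This is a minimal sketch of the general pattern, not the paper's actual scheduling algorithm:

```python
import time

def anytime_inference(mandatory, optional_stages, deadline_s):
    """Imprecise-computation pattern: the mandatory part always completes;
    optional stages refine the result only while time remains, trading
    accuracy for timeliness under a real-time deadline."""
    start = time.monotonic()
    result = mandatory()                       # mandatory part: must run
    for stage in optional_stages:
        if time.monotonic() - start >= deadline_s:
            break                              # deadline hit: return coarse result
        result = stage(result)                 # optional part: refines result
    return result

# Toy usage: each "stage" stands in for an extra layer block or refinement pass.
out = anytime_inference(lambda: 0, [lambda x: x + 1, lambda x: x + 1],
                        deadline_s=10.0)
print(out)  # → 2 (both optional stages fit in the budget)
```

With a zero budget only the mandatory part runs, which is exactly the degraded-but-on-time behavior the imprecise-computation model is designed to guarantee.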
arXiv Detail & Related papers (2020-11-02T16:43:04Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey
Spiking Neural Networks are cognitive algorithms mimicking neuron and synapse operational principles.
We present the state of the art of hardware implementations of spiking neural networks.
We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.