A Survey on Design Methodologies for Accelerating Deep Learning on
Heterogeneous Architectures
- URL: http://arxiv.org/abs/2311.17815v1
- Date: Wed, 29 Nov 2023 17:10:16 GMT
- Title: A Survey on Design Methodologies for Accelerating Deep Learning on
Heterogeneous Architectures
- Authors: Fabrizio Ferrandi, Serena Curzel, Leandro Fiorin, Daniele Ielmini,
Cristina Silvano, Francesco Conti, Alessio Burrello, Francesco Barchi, Luca
Benini, Luciano Lavagno, Teodoro Urso, Enrico Calore, Sebastiano Fabio
Schifano, Cristian Zambelli, Maurizio Palesi, Giuseppe Ascia, Enrico Russo,
Nicola Petra, Davide De Caro, Gennaro Di Meo, Valeria Cardellini, Salvatore
Filippone, Francesco Lo Presti, Francesco Silvestri, Paolo Palazzari and
Stefania Perri
- Abstract summary: The need for efficient hardware accelerators to build heterogeneous HPC platforms has become increasingly pressing.
Several methodologies and tools have been proposed to design accelerators for Deep Learning.
This survey provides a holistic review of the most influential design methodologies and EDA tools proposed in recent years to implement Deep Learning accelerators.
- Score: 9.982620766142345
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, the field of Deep Learning has seen many disruptive and
impactful advancements. Given the increasing complexity of deep neural
networks, the need for efficient hardware accelerators for heterogeneous HPC
platforms has become increasingly pressing. The design of Deep Learning
accelerators requires a multidisciplinary approach, combining expertise from
several areas, spanning from computer architecture to approximate computing,
computational models, and machine learning algorithms. Several methodologies
and tools have been proposed to design accelerators for Deep Learning,
including hardware-software co-design approaches, high-level synthesis methods,
specific customized compilers, and methodologies for design space exploration,
modeling, and simulation. These methodologies aim to maximize the exploitable
parallelism and minimize data movement to achieve high performance and energy
efficiency. This survey provides a holistic review of the most influential
design methodologies and EDA tools proposed in recent years to implement Deep
Learning accelerators, offering the reader a wide perspective in this rapidly
evolving field. In particular, this work complements the previous survey
proposed by the same authors in [203], which focuses on Deep Learning hardware
accelerators for heterogeneous HPC platforms.
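To make the parallelism and data-movement point concrete, the sketch below shows loop tiling, one of the transformations such methodologies routinely apply. It is an illustrative example rather than code from the survey; the function name tiled_matmul and the tile size T are hypothetical, with T standing in for the capacity of an accelerator's on-chip buffer.

```python
import numpy as np

def tiled_matmul(A, B, T=64):
    """Blocked matrix multiply: each T x T tile is loaded once per k-step
    and reused T times, cutting traffic to slow off-chip memory; the
    independent (i0, j0) output tiles expose coarse-grained parallelism."""
    n, m, p = A.shape[0], A.shape[1], B.shape[1]
    C = np.zeros((n, p), dtype=A.dtype)
    for i0 in range(0, n, T):
        for j0 in range(0, p, T):
            for k0 in range(0, m, T):
                # In hardware, these slices would sit in local scratchpads.
                C[i0:i0+T, j0:j0+T] += A[i0:i0+T, k0:k0+T] @ B[k0:k0+T, j0:j0+T]
    return C
```

Accelerator dataflows such as weight-stationary or output-stationary schedules exploit the same reuse idea in hardware, applied to convolution and attention kernels rather than plain matrix multiplication.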
Related papers
- Deep Learning and Machine Learning -- Object Detection and Semantic Segmentation: From Theory to Applications [17.571124565519263]
The book covers state-of-the-art advancements in machine learning and deep learning.
It focuses on convolutional neural networks (CNNs), YOLO architectures, and transformer-based approaches such as DETR.
It also delves into the integration of artificial intelligence (AI) techniques and large language models for enhanced object detection.
arXiv Detail & Related papers (2024-10-21T02:10:49Z)
- Inference Optimization of Foundation Models on AI Accelerators [68.24450520773688]
Powerful foundation models, including large language models (LLMs), with Transformer architectures have ushered in a new era of Generative AI.
As the number of model parameters reaches hundreds of billions, their deployment incurs prohibitive inference costs and high latency in real-world scenarios.
This tutorial offers a comprehensive discussion on complementary inference optimization techniques using AI accelerators.
arXiv Detail & Related papers (2024-07-12T09:24:34Z)
- Design Space Exploration of Approximate Computing Techniques with a Reinforcement Learning Approach [49.42371633618761]
We propose an RL-based strategy to find approximate versions of an application that balance accuracy degradation against reductions in power consumption and computation time.
Our experimental results show a good trade-off between accuracy degradation and reduced power and computation time for some benchmarks.
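The paper's exact agent and reward are not reproduced above, so the following is a hedged Python sketch of what such an exploration loop might look like. The action space LEVELS, the evaluate() stub, and the reward weights are all hypothetical placeholders, and the search shown is a simple epsilon-greedy loop rather than the authors' full RL formulation.

```python
import random

LEVELS = [32, 16, 8, 4]  # hypothetical per-operator precision choices

def evaluate(config):
    """Placeholder: run the approximated application and return
    (accuracy_drop, power_saving, time_saving), each in [0, 1]."""
    raise NotImplementedError

def reward(config, alpha=1.0, beta=0.5, gamma=0.5):
    # Illustrative scalarization of the accuracy/power/time trade-off.
    acc_drop, p_save, t_save = evaluate(config)
    return -alpha * acc_drop + beta * p_save + gamma * t_save

def mutate(config):
    cfg = list(config)
    cfg[random.randrange(len(cfg))] = random.choice(LEVELS)
    return tuple(cfg)

def explore(n_ops, episodes=200, eps=0.2):
    best = tuple(random.choice(LEVELS) for _ in range(n_ops))
    best_r = reward(best)
    for _ in range(episodes):
        if random.random() < eps:  # explore a random configuration
            cand = tuple(random.choice(LEVELS) for _ in range(n_ops))
        else:                      # exploit: perturb the incumbent
            cand = mutate(best)
        r = reward(cand)
        if r > best_r:
            best, best_r = cand, r
    return best, best_r
```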
arXiv Detail & Related papers (2023-12-29T09:10:40Z)
- Computation-efficient Deep Learning for Computer Vision: A Survey [121.84121397440337]
Deep learning models have reached or even exceeded human-level performance in a range of visual perception tasks.
Deep learning models usually demand significant computational resources, leading to impractical power consumption, latency, or carbon emissions in real-world scenarios.
A new research focus is computationally efficient deep learning, which strives to achieve satisfactory performance while minimizing computational cost during inference.
arXiv Detail & Related papers (2023-08-27T03:55:28Z)
- A Survey on Deep Learning Hardware Accelerators for Heterogeneous HPC Platforms [9.036774656254375]
This survey summarizes and classifies the most recent advances in designing deep learning accelerators.
It highlights the most advanced approaches to supporting deep learning acceleration, including not only GPU- and TPU-based accelerators but also design-specific hardware accelerators.
The survey also describes accelerators based on emerging memory technologies and computing paradigms.
arXiv Detail & Related papers (2023-06-27T15:24:24Z)
- On Efficient Training of Large-Scale Deep Learning Models: A Literature Review [90.87691246153612]
The field of deep learning has witnessed significant progress, particularly in computer vision (CV), natural language processing (NLP), and speech.
The use of large-scale models trained on vast amounts of data holds immense promise for practical applications.
Given the increasing demands on computational capacity, a comprehensive summary of techniques for accelerating the training of deep learning models is still much anticipated.
arXiv Detail & Related papers (2023-04-07T11:13:23Z)
- Hybrid Supervised and Reinforcement Learning for the Design and Optimization of Nanophotonic Structures [8.677532138573984]
This paper presents a hybrid supervised and reinforcement learning approach to the inverse design of nanophotonic structures.
We show this approach can reduce training data dependence, improve the generalizability of model predictions, and shorten exploratory training times by orders of magnitude.
arXiv Detail & Related papers (2022-09-08T22:43:40Z)
- Dynamically Grown Generative Adversarial Networks [111.43128389995341]
We propose a method to dynamically grow a GAN during training, automatically optimizing the network architecture and its parameters together.
The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator.
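As a rough illustration of the interleaving idea, the PyTorch sketch below grows a generator by appending blocks at fixed intervals. The module shapes, the GROW_EVERY schedule, and the optimizer reset are hypothetical simplifications; the paper's actual architecture search over growing strategies is not reproduced.

```python
import torch
import torch.nn as nn

class GrowableGenerator(nn.Module):
    """Toy generator whose depth can increase during training."""
    def __init__(self, z_dim=64, width=128, out_dim=784):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(z_dim, width), nn.ReLU())]
        )
        self.head = nn.Linear(width, out_dim)
        self.width = width

    def grow(self):
        # Interleaved growth step: append one hidden block.
        self.blocks.append(
            nn.Sequential(nn.Linear(self.width, self.width), nn.ReLU())
        )

    def forward(self, z):
        for blk in self.blocks:
            z = blk(z)
        return torch.tanh(self.head(z))

G = GrowableGenerator()
opt = torch.optim.Adam(G.parameters(), lr=2e-4)
GROW_EVERY = 1000  # hypothetical growth schedule
for step in range(1, 5001):
    # ... the usual generator/discriminator updates would go here ...
    if step % GROW_EVERY == 0:
        G.grow()  # architecture change between gradient phases
        opt = torch.optim.Adam(G.parameters(), lr=2e-4)  # pick up new params
```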
arXiv Detail & Related papers (2021-06-16T01:25:51Z)
- Integrating Deep Learning in Domain Sciences at Exascale [2.241545093375334]
We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently.
We propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems.
We present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications with AI.
arXiv Detail & Related papers (2020-11-23T03:09:58Z)
- Knowledge Distillation: A Survey [87.51063304509067]
Deep neural networks have been successful in both industry and academia, especially for computer vision tasks.
It is a challenge to deploy these cumbersome deep models on devices with limited resources.
Knowledge distillation effectively learns a small student model from a large teacher model.
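For reference, a minimal sketch of the classic softened-softmax distillation objective (Hinton et al., 2015) is shown below; the temperature T and mixing weight alpha are typical hyperparameters, not values taken from this survey.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend the KL term between temperature-softened teacher and student
    distributions with the ordinary cross-entropy on hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the unsoftened case
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```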
arXiv Detail & Related papers (2020-06-09T21:47:17Z)
- Deep Learning and Knowledge-Based Methods for Computer Aided Molecular Design -- Toward a Unified Approach: State-of-the-Art and Future Directions [0.0]
The optimal design of compounds through manipulating properties at the molecular level is often the key to scientific advances and improved process systems performance.
This paper highlights key trends, challenges, and opportunities underpinning the Computer-Aided Molecular Design problems.
arXiv Detail & Related papers (2020-05-18T14:17:51Z)