STONNE: A Detailed Architectural Simulator for Flexible Neural Network
Accelerators
- URL: http://arxiv.org/abs/2006.07137v1
- Date: Wed, 10 Jun 2020 19:20:52 GMT
- Title: STONNE: A Detailed Architectural Simulator for Flexible Neural Network
Accelerators
- Authors: Francisco Muñoz-Martínez, José L. Abellán, Manuel E. Acacio, Tushar Krishna
- Abstract summary: STONNE is a cycle-accurate, highly-modular and highly-extensible simulation framework.
We show how it can closely approach the performance results of the publicly available BSV-coded MAERI implementation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The design of specialized architectures for accelerating the inference
procedure of Deep Neural Networks (DNNs) is a booming area of research
nowadays. First-generation rigid proposals have been rapidly replaced by more
advanced flexible accelerator architectures able to efficiently support a
variety of layer types and dimensions. As the complexity of these designs
grows, it becomes increasingly important for researchers to have
cycle-accurate simulation tools at their disposal that allow for fast and
accurate design-space exploration and for rapid quantification of the
efficacy of architectural enhancements during the early stages of a design.
To this end, we present
STONNE (Simulation TOol of Neural Network Engines), a cycle-accurate,
highly-modular and highly-extensible simulation framework that enables
end-to-end evaluation of flexible accelerator architectures running complete
contemporary DNN models. We use STONNE to model the recently proposed MAERI
architecture and show how it can closely approach the performance results of
the publicly available BSV-coded MAERI implementation. Then, we conduct a
comprehensive evaluation and demonstrate that the folding strategy implemented
for MAERI results in very low compute-unit utilization (25% on average across 5
DNN models), which ultimately translates into poor performance.
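To make the utilization argument concrete, here is a minimal back-of-the-envelope sketch in Python of how folding can depress compute-unit utilization. It is not STONNE code; the packing model and the example sizes are simplifying assumptions made for illustration.

```python
import math

def utilization_with_folding(num_mults, dot_len):
    """Estimate multiplier utilization for a MAERI-like array.

    A layer is mapped as dot products of length `dot_len` onto
    `num_mults` physical multipliers. If a dot product does not fit,
    it is folded: computed over several passes that accumulate partial
    sums. Simplifying assumptions: whole dot products only, and no
    pipeline overlap between passes.
    """
    if dot_len > num_mults:
        passes = math.ceil(dot_len / num_mults)
        return dot_len / (passes * num_mults)  # useful work / offered slots
    packed = num_mults // dot_len              # whole dot products per pass
    return (packed * dot_len) / num_mults

# Hypothetical examples: a length-144 reduction (e.g. 3x3x16) on a
# 64-multiplier array needs 3 passes -> 144/192 = 75% utilization,
# while a length-65 reduction drops to 65/128, roughly 51%.
print(utilization_with_folding(64, 144))
print(utilization_with_folding(64, 65))
```

Under such a model, utilization hinges on how evenly the reduction size divides the physical array; the actual sources of the 25% average utilization in MAERI's folding strategy are quantified in the paper itself.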
Related papers
- A Realistic Simulation Framework for Analog/Digital Neuromorphic Architectures [73.65190161312555]
ARCANA is a spiking neural network simulator designed to account for the properties of mixed-signal neuromorphic circuits.
We show how the results obtained provide a reliable estimate of the behavior of a spiking neural network trained in software.
arXiv Detail & Related papers (2024-09-23T11:16:46Z)
- Automatic Generation of Fast and Accurate Performance Models for Deep Neural Network Accelerators [33.18173790144853]
We present an automated generation approach for fast performance models to accurately estimate the latency of Deep Neural Networks (DNNs).
We modeled representative DNN accelerators such as Gemmini, UltraTrail, Plasticine-derived, and a parameterizable systolic array.
We evaluate only 154 loop kernel iterations to estimate the performance of 4.19 billion instructions, achieving a significant speedup (the extrapolation idea is sketched after this entry).
arXiv Detail & Related papers (2024-09-13T07:27:55Z)
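The extrapolation idea behind such generated performance models can be illustrated with a short sketch. The kernel names, cycle counts, and trip counts below are invented for illustration; the paper's models are generated automatically rather than written by hand.

```python
# Hypothetical measurements: cycles for one iteration of each
# representative loop kernel, and how often that kernel iterates
# in the full workload.
measured_kernels = {
    "conv3x3_inner": (120, 1_200_000),
    "pointwise_mac": (35, 9_800_000),
    "pool_reduce":   (18, 450_000),
}

# Instead of simulating every instruction, evaluate each kernel body
# once and scale by its trip count to estimate the whole run.
total = sum(cycles * trips for cycles, trips in measured_kernels.values())
print(f"estimated total: {total:,} cycles from "
      f"{len(measured_kernels)} evaluated kernels")
```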
- GPU-RANC: A CUDA Accelerated Simulation Framework for Neuromorphic Architectures [1.3401966602181168]
We introduce a GPU-based implementation of the Reconfigurable Architecture for Neuromorphic Computing (RANC).
We demonstrate up to a 780x speedup over the serial version of the RANC simulator, based on a 512-neuromorphic-core MNIST inference application.
arXiv Detail & Related papers (2024-04-24T21:08:21Z)
- Mechanistic Design and Scaling of Hybrid Architectures [114.3129802943915]
We identify and test new hybrid architectures constructed from a variety of computational primitives.
We experimentally validate the resulting architectures via an extensive compute-optimal and a new state-optimal scaling law analysis.
We find MAD synthetics to correlate with compute-optimal perplexity, enabling accurate evaluation of new architectures.
arXiv Detail & Related papers (2024-03-26T16:33:12Z)
- Neural Architecture Codesign for Fast Bragg Peak Analysis [1.7081438846690533]
We develop an automated pipeline to streamline neural architecture codesign for fast, real-time Bragg peak analysis in microscopy.
Our method employs neural architecture search and AutoML, with hardware costs included in the search objective, to enhance these models, leading to the discovery of more hardware-efficient neural architectures.
arXiv Detail & Related papers (2023-12-10T19:42:18Z)
- Disentangling Structured Components: Towards Adaptive, Interpretable and Scalable Time Series Forecasting [52.47493322446537]
We develop an adaptive, interpretable and scalable forecasting framework that seeks to individually model each component of the spatial-temporal patterns.
SCNN works with a pre-defined generative process of the MTS that arithmetically characterizes the latent structure of the spatial-temporal patterns (a generic decomposition of this kind is sketched after this entry).
Extensive experiments are conducted to demonstrate that SCNN can achieve superior performance over state-of-the-art models on three real-world datasets.
arXiv Detail & Related papers (2023-05-22T13:39:44Z)
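As a loose illustration of the structured-component idea, the sketch below builds a toy series from separately modeled components and forecasts by extrapolating each component individually. The generative process here is a generic additive decomposition, not SCNN's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(200)

# Toy generative process: the observed series is a sum of structured
# components, each with its own simple model.
trend = 0.05 * t                              # long-term component
seasonal = 2.0 * np.sin(2 * np.pi * t / 24)   # periodic component
noise = rng.normal(0.0, 0.3, size=t.shape)    # local stochastic part
series = trend + seasonal + noise

# Forecast by extrapolating each structured component separately and
# re-composing; the stochastic component is left at its mean (zero).
h = np.arange(200, 224)
forecast = 0.05 * h + 2.0 * np.sin(2 * np.pi * h / 24)
print(series[:4], forecast[:4])
```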
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
- Dynamically Grown Generative Adversarial Networks [111.43128389995341]
We propose a method to dynamically grow a GAN during training, automatically optimizing the network architecture and its parameters together.
The method embeds architecture search techniques as an interleaving step with gradient-based training to periodically seek the optimal architecture-growing strategy for the generator and discriminator (the interleaving loop is sketched after this entry).
arXiv Detail & Related papers (2021-06-16T01:25:51Z)
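The interleaving of growth and training can be sketched as a control loop. Everything below is a toy stand-in (the score function and candidate actions are invented); it only illustrates alternating training phases with a periodic search over growing actions.

```python
def val_score(width, steps):
    """Toy stand-in for 'train for `steps` updates and measure validation
    quality': improves with capacity and training, then saturates."""
    return 1.0 - 1.0 / (width + 0.01 * steps)

def candidate_widths(width):
    # Candidate growing actions for this phase: keep, widen, widen more.
    return [width, width + 4, width + 8]

width = 4
for phase in range(5):
    score = val_score(width, steps=100)  # gradient-based training phase
    # Periodic architecture step: pick the growing action whose briefly
    # trained candidate scores best, then continue training with it.
    width = max(candidate_widths(width), key=lambda w: val_score(w, steps=10))
    print(f"phase {phase}: trained score {score:.3f}, grew to width {width}")
```

In the paper both the generator and the discriminator are grown in this interleaved fashion; the toy loop above tracks only a single width.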
- NeuroXplorer 1.0: An Extensible Framework for Architectural Exploration with Spiking Neural Networks [3.9121275263540087]
We present NeuroXplorer, a framework that is based on a generalized template for modeling a neuromorphic architecture.
NeuroXplorer can perform both low-level cycle-accurate architectural simulations and high-level analysis with data-flow abstractions.
We demonstrate the architectural exploration capabilities of NeuroXplorer through case studies with many state-of-the-art machine learning models.
arXiv Detail & Related papers (2021-05-04T23:31:11Z)
- A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures (a toy version of such a predictor is sketched after this entry).
arXiv Detail & Related papers (2020-05-14T09:02:33Z)
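A toy version of such a predictor is sketched below: architectures are encoded as small operation graphs, and a single graph-convolution layer plus a linear readout produces a score. The adjacency encoding, dimensions, and random weights are illustrative assumptions, not the paper's trained model (which additionally uses an auto-encoder and semi-supervised training).

```python
import numpy as np

def gcn_layer(adj, feats, weights):
    """One graph-convolution step: add self-loops, row-normalize the
    adjacency, average neighbor features, apply a linear map and ReLU."""
    a_hat = adj + np.eye(adj.shape[0])
    a_norm = a_hat / a_hat.sum(axis=1, keepdims=True)
    return np.maximum(a_norm @ feats @ weights, 0.0)

# A toy 4-operation cell: nodes are operations, edges are data flow.
adj = np.array([[0, 1, 1, 0],
                [0, 0, 0, 1],
                [0, 0, 0, 1],
                [0, 0, 0, 0]], dtype=float)
op_feats = np.eye(4)          # one-hot operation types as node features

rng = np.random.default_rng(1)
w = rng.normal(size=(4, 8))   # untrained weights, for shape illustration
readout = rng.normal(size=(8,))

node_emb = gcn_layer(adj, op_feats, w)
predicted = node_emb.mean(axis=0) @ readout   # graph-level score
print(predicted)
```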