StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs,
GPUs and FPGAs
- URL: http://arxiv.org/abs/2106.05373v1
- Date: Wed, 9 Jun 2021 20:28:18 GMT
- Title: StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs,
GPUs and FPGAs
- Authors: Artur Podobas, Martin Svedin, Steven W. D. Chien, Ivy B. Peng, Naresh
Balaji Ravichandran, Pawel Herman, Anders Lansner, Stefano Markidis
- Abstract summary: StreamBrain is a framework that allows neural networks based on BCPNN to be practically deployed in High-Performance Computing systems.
We empirically demonstrate that StreamBrain can train on the well-known MNIST ML benchmark dataset within seconds.
We are the first to demonstrate BCPNN on STL-10-sized networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The modern deep learning method based on backpropagation has surged in
popularity and has been used in multiple domains and application areas. At the
same time, there are other -- lesser-known -- machine learning algorithms with a
mature and solid theoretical foundation whose performance remains unexplored.
One such example is the brain-like Bayesian Confidence Propagation Neural
Network (BCPNN). In this paper, we introduce StreamBrain -- a framework that
allows neural networks based on BCPNN to be practically deployed in
High-Performance Computing systems. StreamBrain is a domain-specific language
(DSL), similar in concept to existing machine learning (ML) frameworks, and
supports backends for CPUs, GPUs, and even FPGAs. We empirically demonstrate
that StreamBrain can train on the well-known MNIST ML benchmark dataset within
seconds, and we are the first to demonstrate BCPNN on STL-10-sized networks. We
also show how StreamBrain can be used to train with custom floating-point
formats and illustrate the impact of using different bfloat variations on BCPNN
using FPGAs.
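The BCPNN learning rule behind these results is compact: units keep exponentially decaying probability traces, and weights are the log-odds of co-activation. The sketch below is a minimal NumPy illustration of that Hebbian-Bayesian update and of bfloat-style mantissa truncation, based on the published BCPNN formulation and a standard emulation trick rather than StreamBrain's actual API; all names are our own.
```python
import numpy as np

def bcpnn_update(pi, pj, pij, x_pre, x_post, tau=1000.0, eps=1e-6):
    """One Hebbian-Bayesian trace update (dt = 1 step), per the BCPNN
    literature: weights are log-odds of co-activation probabilities."""
    k = 1.0 / tau                                   # trace learning rate
    pi += k * (x_pre - pi)                          # exponential moving averages
    pj += k * (x_post - pj)
    pij += k * (np.outer(x_pre, x_post) - pij)
    w = np.log((pij + eps) / (np.outer(pi, pj) + eps))  # w_ij = log P_ij/(P_i P_j)
    b = np.log(pj + eps)                                # bias = log prior
    return pi, pj, pij, w, b

def to_bfloat_like(x, mantissa_bits=7):
    """Emulate a bfloat-style format by truncating float32 mantissas
    (bfloat16 keeps 7 of float32's 23 mantissa bits)."""
    u = x.astype(np.float32).view(np.uint32)
    drop = 23 - mantissa_bits
    return (u >> drop << drop).view(np.float32)
```
The HPC work lies in making the dense outer-product trace updates fast on each backend; the truncation helper shows why bfloat variants are attractive on FPGAs, where mantissa width directly sets multiplier cost.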
Related papers
- LookupFFN: Making Transformers Compute-lite for CPU inference [23.61144705380663]
GPU clusters are the de facto choice for training large deep neural network (DNN) models today.
Several reasons, including ease of workflow, security, and cost, have led to efforts investigating whether CPUs may be viable for routine inference in many sectors of industry.
We study a workhorse module of modern architectures, the GEMM-based Feed-Forward Network (FFN), and assess the extent to which it can be made compute- (or FLOP-) lite.
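For reference, the FFN being made FLOP-lite is just two dense GEMMs around a nonlinearity; a minimal NumPy version of that standard baseline block (our own illustration, not LookupFFN's replacement):
```python
import numpy as np

def gemm_ffn(x, w1, b1, w2, b2):
    """Standard transformer FFN: two GEMMs around a tanh-approximated GELU."""
    h = x @ w1 + b1                    # GEMM 1: (n, d) @ (d, d_ff)
    h = 0.5 * h * (1.0 + np.tanh(0.7978845608 * (h + 0.044715 * h**3)))
    return h @ w2 + b2                 # GEMM 2: (n, d_ff) @ (d_ff, d)
```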
arXiv Detail & Related papers (2024-03-12T00:26:16Z)
- Basic Binary Convolution Unit for Binarized Image Restoration Network [146.0988597062618]
In this study, we reconsider the components of binary convolution, such as the residual connection, BatchNorm, the activation function, and the overall structure, for image restoration tasks.
Based on our findings and analyses, we design a simple yet efficient basic binary convolution unit (BBCU).
Our BBCU significantly outperforms other BNNs and lightweight models, showing that it can serve as a basic unit for binarized image restoration (IR) networks.
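As a rough picture of what such a unit computes, a generic binarized block (our own simplification, not BBCU's exact design) reduces weights and activations to their signs and lets a full-precision residual carry the information lost to binarization:
```python
import torch
import torch.nn as nn

class BinaryConvBlock(nn.Module):
    """Generic binarized conv block: sign-binarized weights and activations,
    BatchNorm, and a full-precision residual (illustration only)."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        xb = torch.sign(x)                          # binarize activations
        wb = torch.sign(self.conv.weight)           # binarize weights
        out = nn.functional.conv2d(xb, wb, padding=1)
        return self.bn(out) + x                     # residual keeps FP detail
```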
arXiv Detail & Related papers (2022-10-02T01:54:40Z)
- CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Networks widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a new variant of the deep convolutional neural network architecture to improve the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z)
- Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate BNNs.
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
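The storage intuition: a full binary 3x3 kernel costs 9 bits, but if kernels are drawn from a small learned subset, each kernel only needs an index. A toy NumPy sketch of that sub-bit idea (our own simplification, not the paper's kernel-aware optimizer):
```python
import numpy as np

k = 5                                   # index width: k/9 bits per weight
rng = np.random.default_rng(0)
codebook = rng.choice([-1.0, 1.0], size=(2**k, 3, 3))  # 2**k binary kernels

def expand_kernels(indices):
    """Reconstruct full binary 3x3 kernels from their sub-bit indices."""
    return codebook[indices]            # shape: (*indices.shape, 3, 3)

idx = rng.integers(0, 2**k, size=(64, 32))   # out_ch x in_ch kernel ids
kernels = expand_kernels(idx)                # (64, 32, 3, 3), ~0.56 bits/weight
```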
arXiv Detail & Related papers (2021-10-18T11:30:29Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
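The core trick can be sketched in NumPy (our own reconstruction under assumed names): an M-bit unsigned weight tensor is a weighted sum of its {-1, +1} bit planes, so one quantized matmul becomes M cheap binary matmuls plus a constant branch.
```python
import numpy as np

def quantized_matmul(w_int, x, bits):
    """Evaluate w_int @ x via {-1, +1} branches:
    w = sum_m 2^(m-1) * B_m + (2^bits - 1)/2, with B_m in {-1, +1}."""
    y = 0.0
    for m in range(bits):
        b_m = 2.0 * ((w_int >> m) & 1) - 1.0    # m-th bit plane in {-1, +1}
        y = y + 2.0**(m - 1) * (b_m @ x)        # binary-branch matmul
    return y + (2**bits - 1) / 2.0 * x.sum(axis=0)  # constant branch

rng = np.random.default_rng(0)
w = rng.integers(0, 16, size=(4, 8))            # 4-bit quantized weights
x = rng.standard_normal((8, 3))
assert np.allclose(quantized_matmul(w, x, 4), w @ x)
```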
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Learning on Hardware: A Tutorial on Neural Network Accelerators and Co-Processors [0.0]
Deep neural networks (DNNs) have the advantage that they can take a large number of parameters into account, which enables them to solve complex tasks.
In computer vision and speech recognition, they achieve better accuracy than conventional algorithms, and in some tasks they even surpass human experts.
With the progress of DNNs in recent years, many other fields of application, such as disease diagnosis and autonomous driving, are taking advantage of them.
arXiv Detail & Related papers (2021-04-19T12:50:27Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that, through careful design of the models and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
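One such strategy, stripped to its essentials (a generic sketch, not the paper's exact models): binarize node features and weights with sign before the usual neighborhood aggregation.
```python
import numpy as np

def binary_gnn_layer(a_hat, h, w):
    """Message passing with binarized features and weights:
    H' = ReLU(A_hat @ sign(H) @ sign(W)) -- a generic illustration."""
    return np.maximum(a_hat @ np.sign(h) @ np.sign(w), 0.0)
```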
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Overview of FPGA deep learning acceleration based on convolutional neural network [0.76146285961466]
In recent years, deep learning has matured considerably, and convolutional neural networks, one of its most commonly used algorithms, have been widely applied to various visual tasks.
This review article introduces the theory and algorithms behind convolution.
It summarizes the application scenarios of several existing FPGA technologies based on convolutional neural networks, focusing on the application of accelerators.
arXiv Detail & Related papers (2020-12-23T12:44:24Z)
- Compiling ONNX Neural Network Models Using MLIR [51.903932262028235]
We present a preliminary report on our onnx-mlir compiler, which generates code for the inference of deep neural network models.
Onnx-mlir relies on the Multi-Level Intermediate Representation (MLIR) infrastructure recently integrated into the LLVM project.
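The compiler's input is a standard ONNX graph; a minimal way to produce one from PyTorch (a generic export snippet using the documented torch.onnx API, independent of onnx-mlir itself):
```python
import torch
import torch.nn as nn

# Toy model exported to the ONNX format that onnx-mlir consumes, after
# which the compiler lowers it through MLIR dialects to native code.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
torch.onnx.export(model, torch.randn(1, 16), "model.onnx",
                  input_names=["x"], output_names=["y"])
```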
arXiv Detail & Related papers (2020-08-19T05:28:08Z)
- Exposing Hardware Building Blocks to Machine Learning Frameworks [4.56877715768796]
We focus on how to design topologies that complement a view of neurons as unique functions.
We develop a library that supports training a neural network with custom sparsity and quantization.
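A common way to express custom sparsity in such a library is a fixed binary mask applied to the weights on every forward pass; a minimal PyTorch sketch (our own illustration, not the paper's library):
```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Linear layer trained under a fixed, user-supplied sparsity mask."""
    def __init__(self, in_features, out_features, mask):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.register_buffer("mask", mask)          # 0/1 pattern, not trained

    def forward(self, x):
        return nn.functional.linear(
            x, self.linear.weight * self.mask, self.linear.bias)

layer = MaskedLinear(8, 4, torch.ones(4, 8))        # dense mask as a sanity check
```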
arXiv Detail & Related papers (2020-04-10T14:26:00Z) - The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural
Language Understanding [97.85957811603251]
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.
Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks.
A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm.
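Architecturally, this means one shared encoder with a lightweight head per task; a skeletal PyTorch version of the pattern (a generic sketch, not MT-DNN's actual classes):
```python
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Shared encoder with one classification head per NLU task."""
    def __init__(self, encoder, hidden, task_sizes):
        super().__init__()
        self.encoder = encoder                      # e.g. a Transformer encoder
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n) for task, n in task_sizes.items()})

    def forward(self, x, task):
        return self.heads[task](self.encoder(x))   # route to the task head
```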
arXiv Detail & Related papers (2020-02-19T03:05:28Z)