NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix
Operations for Efficient Inference
- URL: http://arxiv.org/abs/2305.14405v3
- Date: Thu, 8 Feb 2024 10:11:27 GMT
- Title: NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix
Operations for Efficient Inference
- Authors: Ruiqi Sun, Siwei Ye, Jie Zhao, Xin He, Yiran Li, An Zou
- Abstract summary: We present NeuralMatrix, a framework that transforms the computation of entire Deep Neural Network (DNN) models into linear matrix operations.
Our approach preserves network accuracy while providing both generality and application-specific levels of computation efficiency.
- Score: 20.53515208166353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The inherent diversity of computation types within individual Deep Neural
Network (DNN) models imposes a corresponding need for a varied set of
computation units within hardware processors. This diversity poses a
significant constraint on computation efficiency during the execution of
different neural networks. In this study, we present NeuralMatrix, a framework
that transforms the computation of entire DNNs into linear matrix operations.
This transformation seamlessly enables the execution of various DNN models
using a single General Matrix Multiplication (GEMM) accelerator.
Extensive experimental results spanning different DNN models demonstrate that
our approach preserves network accuracy while providing both generality and
application-specific levels of computation efficiency. This allows a broad
spectrum of DNN models to be executed using a single GEMM accelerator,
eliminating the need for additional special function units.
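The central idea, replacing nonlinear computation with linear matrix operations so a GEMM datapath can run everything, can be illustrated with piecewise-linear approximation: each activation evaluation becomes a table lookup plus one multiply-add, the same primitive a GEMM accelerator's MAC units already execute. This is a minimal sketch under illustrative assumptions (sigmoid as the nonlinearity, the [-8, 8] range, 64 segments), not the paper's actual construction.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_piecewise_linear(fn, lo, hi, n_segments):
    """Precompute per-segment slopes and intercepts of fn over [lo, hi]."""
    edges = np.linspace(lo, hi, n_segments + 1)
    x0, x1 = edges[:-1], edges[1:]
    y0, y1 = fn(x0), fn(x1)
    slopes = (y1 - y0) / (x1 - x0)
    intercepts = y0 - slopes * x0
    return edges, slopes, intercepts

def piecewise_apply(x, edges, slopes, intercepts):
    """Evaluate the approximation: a segment lookup, then one multiply-add."""
    idx = np.clip(np.searchsorted(edges, x) - 1, 0, len(slopes) - 1)
    return slopes[idx] * x + intercepts[idx]

# One "layer": a GEMM followed by the linearized activation.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(4, 4)), rng.normal(size=4)
x = rng.normal(size=4)
edges, slopes, intercepts = fit_piecewise_linear(sigmoid, -8.0, 8.0, 64)
h = piecewise_apply(W @ x + b, edges, slopes, intercepts)
```

Finer segmentation trades table size for accuracy; with 64 segments the approximation already tracks the exact sigmoid closely over the fitted range.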
Related papers
- EvSegSNN: Neuromorphic Semantic Segmentation for Event Data [0.6138671548064356]
We introduce an end-to-end, biologically inspired semantic segmentation approach by combining Spiking Neural Networks with event cameras.
EvSegSNN is a biologically plausible encoder-decoder U-shaped architecture relying on Parametric Leaky Integrate-and-Fire neurons.
Experiments conducted on DDD17 demonstrate that EvSegSNN outperforms the closest state-of-the-art model in terms of MIoU.
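The Leaky Integrate-and-Fire dynamics behind this entry can be sketched in a few lines. This is a toy simulation, not EvSegSNN's implementation; in the parametric variant the leak constant is learned, whereas here `tau`, the threshold `v_th`, and the hard reset are illustrative assumptions.

```python
def plif_forward(inputs, tau=2.0, v_th=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over an input sequence,
    emitting a binary spike train."""
    v, spikes = v_reset, []
    for i in inputs:
        # Leaky integration: membrane decays toward v_reset, accumulates input.
        v = v + (i - (v - v_reset)) / tau
        if v >= v_th:
            spikes.append(1)
            v = v_reset          # hard reset after a spike
        else:
            spikes.append(0)
    return spikes
```

Strong sustained input produces a spike every step, while sub-threshold input leaks away without ever firing.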
arXiv Detail & Related papers (2024-06-20T10:36:24Z)
- Feed-Forward Neural Networks as a Mixed-Integer Program [0.0]
The research focuses on training and evaluating proposed approaches through experiments on handwritten digit classification models.
The study assesses the performance of trained ReLU NNs, shedding light on the effectiveness of MIP formulations in enhancing training processes for NNs.
arXiv Detail & Related papers (2024-02-09T02:23:37Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
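The shared-backbone, multiple-heads structure described above can be sketched as follows. This is purely illustrative numpy; the layer sizes, ReLU backbone, and mean-ensemble combination rule are assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Shared backbone: one linear layer with ReLU, standing in for the
# paper's backbone (hypothetical sizes: 4 inputs, 8 features).
W_shared = rng.normal(size=(8, 4))

def backbone(x):
    return np.maximum(W_shared @ x, 0.0)

# Three prediction heads operating on the same shared features.
heads = [rng.normal(size=(2, 8)) for _ in range(3)]

def memtl_predict(x):
    feats = backbone(x)                  # backbone computed once
    outs = [H @ feats for H in heads]    # one cheap pass per head
    return np.mean(outs, axis=0)         # ensemble by averaging

y = memtl_predict(rng.normal(size=4))
```

The design point is that the expensive feature extraction is amortized across heads, so the ensemble adds little inference cost.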
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers [67.688697838109]
This paper presents a novel method to train quantized RNNLMs from scratch using alternating direction methods of multipliers (ADMM).
Experiments on two tasks suggest the proposed ADMM quantization achieved a model size compression factor of up to 31 times over the full precision baseline RNNLMs.
arXiv Detail & Related papers (2021-11-29T09:30:06Z)
- Exploiting Heterogeneity in Operational Neural Networks by Synaptic Plasticity [87.32169414230822]
The recently proposed Operational Neural Networks (ONNs) generalize the conventional Convolutional Neural Networks (CNNs).
This study focuses on searching for the best possible operator set(s) for the hidden neurons of the network, based on the Synaptic Plasticity paradigm that underpins learning in biological neurons.
Experimental results on highly challenging problems demonstrate that elite ONNs, even with few neurons and layers, can achieve superior learning performance compared to GIS-based ONNs.
arXiv Detail & Related papers (2020-08-21T19:03:23Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Self-Organized Operational Neural Networks with Generative Neurons [87.32169414230822]
ONNs are heterogeneous networks with a generalized neuron model that can encapsulate any set of non-linear operators.
We propose Self-organized ONNs (Self-ONNs) with generative neurons that have the ability to adapt (optimize) the nodal operator of each connection.
arXiv Detail & Related papers (2020-04-24T14:37:56Z)
- Res-CR-Net, a residual network with a novel architecture optimized for the semantic segmentation of microscopy images [0.5363346028859919]
Res-CR-Net is a type of Deep Neural Network (DNN) that features residual blocks with either a bundle of separable atrous convolutions with different dilation rates or a convolutional LSTM.
The number of filters in each residual block and the number of blocks are the only hyperparameters that need to be modified to optimize network training for a variety of microscopy images.
arXiv Detail & Related papers (2020-04-14T21:21:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.