Nimble: Efficiently Compiling Dynamic Neural Networks for Model
Inference
- URL: http://arxiv.org/abs/2006.03031v2
- Date: Fri, 12 Mar 2021 07:20:49 GMT
- Title: Nimble: Efficiently Compiling Dynamic Neural Networks for Model
Inference
- Authors: Haichen Shen, Jared Roesch, Zhi Chen, Wei Chen, Yong Wu, Mu Li, Vin
Sharma, Zachary Tatlock, Yida Wang
- Abstract summary: This paper proposes Nimble, a high-performance and flexible system to optimize, compile, and execute dynamic neural networks on multiple platforms.
Our evaluation demonstrates that Nimble outperforms state-of-the-art deep learning frameworks and runtime systems for dynamic neural networks by up to 20x on hardware platforms.
- Score: 22.267489467486467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern deep neural networks increasingly make use of features such as dynamic
control flow, data structures and dynamic tensor shapes. Existing deep learning
systems focus on optimizing and executing static neural networks which assume a
pre-determined model architecture and input data shapes--assumptions which are
violated by dynamic neural networks. Therefore, executing dynamic models with
deep learning systems is currently both inflexible and sub-optimal, if not
impossible. Optimizing dynamic neural networks is more challenging than static
neural networks; optimizations must consider all possible execution paths and
tensor shapes. This paper proposes Nimble, a high-performance and flexible
system to optimize, compile, and execute dynamic neural networks on multiple
platforms. Nimble handles model dynamism by introducing a dynamic type system,
a set of dynamism-oriented optimizations, and a light-weight virtual machine
runtime. Our evaluation demonstrates that Nimble outperforms state-of-the-art
deep learning frameworks and runtime systems for dynamic neural networks by up
to 20x on hardware platforms including Intel CPUs, ARM CPUs, and Nvidia GPUs.
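The abstract names three ingredients (a dynamic type system, dynamism-oriented optimizations, and a lightweight virtual machine runtime) without spelling out their mechanics. The sketch below, in plain Python with hypothetical names (Any, TensorType, dense_shape_func, Instr, run_vm), is not Nimble's actual API; it only illustrates how a type system can mark a dimension as unknown, how a per-operator shape function resolves it at run time, and how a small VM loop can allocate outputs and dispatch kernels once concrete shapes are known.

    # Minimal, illustrative sketch of the three ideas named in the abstract.
    # All names here (Any, TensorType, dense_shape_func, Instr, run_vm) are
    # hypothetical, not Nimble's real API.
    import numpy as np

    Any = object()  # placeholder for a dimension unknown at compile time

    class TensorType:
        """Tensor type that may contain dynamic (Any) dimensions."""
        def __init__(self, shape, dtype="float32"):
            self.shape, self.dtype = tuple(shape), dtype
        def is_dynamic(self):
            return any(d is Any for d in self.shape)

    # Per-operator "shape functions" compute concrete output shapes at run time.
    def dense_shape_func(x_shape, w_shape):
        assert x_shape[1] == w_shape[1]
        return (x_shape[0], w_shape[0])

    def dense_kernel(x, w, out):
        np.matmul(x, w.T, out=out)

    class Instr:
        """One VM instruction: run shape_func, allocate output, invoke kernel."""
        def __init__(self, kernel, shape_func, in_regs, out_reg):
            self.kernel, self.shape_func = kernel, shape_func
            self.in_regs, self.out_reg = in_regs, out_reg

    def run_vm(program, registers):
        for instr in program:
            args = [registers[r] for r in instr.in_regs]
            out_shape = instr.shape_func(*[a.shape for a in args])  # resolved at run time
            registers[instr.out_reg] = np.empty(out_shape, dtype=args[0].dtype)
            instr.kernel(*args, registers[instr.out_reg])
        return registers

    # Usage: the batch dimension is Any at compile time, concrete at run time.
    x_type = TensorType((Any, 128))
    assert x_type.is_dynamic()               # batch size unknown until run time
    prog = [Instr(dense_kernel, dense_shape_func, in_regs=[0, 1], out_reg=2)]
    regs = {0: np.random.rand(7, 128).astype("float32"),
            1: np.random.rand(64, 128).astype("float32")}
    result = run_vm(prog, regs)[2]           # shape (7, 64), allocated dynamically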
Related papers
- Systematic construction of continuous-time neural networks for linear dynamical systems [0.0]
We discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems.
We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as the solution of a first-order or second-order Ordinary Differential Equation (ODE).
Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute sparse architecture and network parameters directly from the given LTI system.
arXiv Detail & Related papers (2024-03-24T16:16:41Z)
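The summary above describes neurons whose outputs evolve as solutions of first- or second-order ODEs driven by a given LTI system; the paper's gradient-free construction is not reproduced here. A minimal sketch of the first-order case, assuming a simple forward-Euler integrator and made-up matrices (A, B, W are illustrative, not derived from the paper):

    # Hedged sketch: a layer of first-order ODE "neurons" driven by an LTI system
    #   dx/dt = A x + B u,   each neuron: tau * dh/dt = -h + W x
    # integrated with forward Euler. Names and integrator choice are illustrative only.
    import numpy as np

    def simulate(A, B, W, u, x0, h0, tau=0.1, dt=0.01, steps=500):
        x, h = x0.copy(), h0.copy()
        for _ in range(steps):
            x = x + dt * (A @ x + B @ u)          # LTI state update
            h = h + dt * (-h + W @ x) / tau       # continuous-time neuron dynamics
        return x, h

    A = np.array([[0.0, 1.0], [-2.0, -0.5]])      # example stable LTI system
    B = np.array([[0.0], [1.0]])
    W = np.random.randn(4, 2) * 0.1               # 4 ODE neurons reading the state
    x_final, h_final = simulate(A, B, W, u=np.array([1.0]),
                                x0=np.zeros(2), h0=np.zeros(4))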
- Spyx: A Library for Just-In-Time Compiled Optimization of Spiking Neural Networks [0.08965418284317034]
Spiking Neural Networks (SNNs) offer to enhance energy efficiency through a reduced and low-power hardware footprint.
This paper introduces Spyx, a new and lightweight SNN simulation and optimization library designed in JAX.
arXiv Detail & Related papers (2024-02-29T09:46:44Z)
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction [82.81767856234956]
This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of DNN-based dynamics modeling.
We show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics.
arXiv Detail & Related papers (2023-02-11T21:07:30Z)
- HADAS: Hardware-Aware Dynamic Neural Architecture Search for Edge Performance Scaling [8.29394286023338]
Dynamic neural networks (DyNNs) have become viable techniques to enable intelligence on resource-constrained edge devices.
In many cases, the implementation of DyNNs can be sub-optimal because their underlying backbone architectures are fixed at the design stage.
We present HADAS, a novel Hardware-Aware Dynamic Neural Architecture Search framework that realizes DyNN architectures whose backbone, early exiting features, and DVFS settings have been jointly optimized.
arXiv Detail & Related papers (2022-12-06T22:27:00Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Decomposed Linear Dynamical Systems (dLDS) for learning the latent components of neural dynamics [6.829711787905569]
We propose a new decomposed dynamical system model that represents complex non-stationary and nonlinear dynamics of time series data.
Our model is trained through a dictionary learning procedure, where we leverage recent results in tracking sparse vectors over time.
In both continuous-time and discrete-time instructional examples, we demonstrate that our model approximates the original system well.
arXiv Detail & Related papers (2022-06-07T02:25:38Z)
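The decomposed form described above can be pictured as dynamics that are, at each time step, a sparse time-varying combination of a small dictionary of linear operators; the dictionary-learning and sparse-tracking procedures the paper uses are not reproduced. A minimal sketch, with a hand-picked dictionary and coefficient schedule standing in for learned ones:

    # Hedged sketch of the "decomposed" idea: the dynamics at each step are a
    # sparse, time-varying combination of a small dictionary of linear operators,
    #   x_{t+1} = (sum_i c_i(t) * F_i) x_t.
    # Dictionary and coefficients here are made up, not learned as in the paper.
    import numpy as np

    def rotation(theta):
        return np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])

    F = [rotation(0.05), 0.98 * np.eye(2)]            # dictionary of dynamics operators

    def coeffs(t):
        # sparse, time-varying mixing coefficients (hand-picked here, learned in the paper)
        return np.array([1.0, 0.0]) if t < 50 else np.array([0.0, 1.0])

    x = np.array([1.0, 0.0])
    trajectory = [x]
    for t in range(100):
        A_t = sum(c * Fi for c, Fi in zip(coeffs(t), F))   # time-varying operator
        x = A_t @ x
        trajectory.append(x)
    trajectory = np.stack(trajectory)                  # (101, 2): rotate, then decay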
- Gradient-Based Trajectory Optimization With Learned Dynamics [80.41791191022139]
We use machine learning techniques to learn a differentiable dynamics model of the system from data.
We show that a neural network can model highly nonlinear behaviors accurately for large time horizons.
In our hardware experiments, we demonstrate that our learned model can represent complex dynamics for both the Spot and Radio-controlled (RC) car.
arXiv Detail & Related papers (2022-04-09T22:07:34Z)
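The entry above combines a learned, differentiable dynamics model with gradient-based optimization of the action sequence. A minimal sketch of that loop, in which a stand-in function plays the role of the trained model and a finite-difference gradient replaces the automatic differentiation a real system would use:

    # Hedged sketch: once a dynamics model x_{t+1} = f(x_t, u_t) has been learned,
    # a trajectory can be optimized by gradient descent on the action sequence.
    # The "learned" model below is a stand-in, and the finite-difference gradient
    # is a placeholder for automatic differentiation.
    import numpy as np

    def learned_dynamics(x, u):                   # stand-in for a trained network
        return x + 0.1 * np.tanh(u)

    def rollout_cost(actions, x0, goal):
        x = x0
        for u in actions:
            x = learned_dynamics(x, u)
        return float(np.sum((x - goal) ** 2))     # terminal cost only, for brevity

    def optimize(actions, x0, goal, lr=0.5, iters=200, eps=1e-4):
        for _ in range(iters):
            grad = np.zeros_like(actions)
            for i in range(actions.size):         # finite-difference gradient
                d = np.zeros_like(actions)
                d.flat[i] = eps
                grad.flat[i] = (rollout_cost(actions + d, x0, goal)
                                - rollout_cost(actions - d, x0, goal)) / (2 * eps)
            actions = actions - lr * grad
        return actions

    x0, goal = np.zeros(2), np.array([0.5, -0.3])
    actions = optimize(np.zeros((10, 2)), x0, goal)   # optimized 10-step action sequence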
- A Survey on Dynamic Neural Networks for Natural Language Processing [13.949219077548687]
Dynamic neural networks are capable of scaling up neural networks with sub-linear increases in computation and time.
In this survey, we summarize progress of three types of dynamic neural networks in NLP: skimming, mixture of experts, and early exit.
arXiv Detail & Related papers (2022-02-15T00:13:05Z)
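Of the three mechanism families the survey names (skimming, mixture of experts, early exit), early exit is the simplest to sketch: a classifier head after each layer, with inference stopping once a head is confident enough. The layer shapes, threshold, and random weights below are arbitrary illustrations, not a model from the survey:

    # Hedged sketch of early exit: attach a classifier head after every layer and
    # stop as soon as one head is confident enough. Shapes and weights are toy values.
    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    layers = [rng.normal(size=(16, 16)) * 0.3 for _ in range(4)]   # toy layer weights
    heads = [rng.normal(size=(16, 3)) * 0.3 for _ in range(4)]     # one classifier per layer

    def early_exit_forward(x, threshold=0.9):
        for depth, (W, H) in enumerate(zip(layers, heads)):
            x = np.tanh(x @ W)                    # one "layer" of computation
            probs = softmax(x @ H)
            if probs.max() >= threshold:          # confident enough: stop early
                return probs, depth + 1
        return probs, len(layers)                 # fell through all exits

    probs, layers_used = early_exit_forward(rng.normal(size=16))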
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
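The binarization idea can be sketched generically: quantize weights (and here also node features) to {-1, +1} with a sign function plus a scaling factor, then run an otherwise ordinary message-passing layer. The specific binarization strategies and training controls evaluated in the paper are not reproduced here:

    # Hedged sketch: weights and neighbor messages are quantized to {-1, +1} with
    # sign() and a per-tensor scale, then aggregated in a simple graph layer.
    # The layer form and scaling are illustrative, not the paper's exact strategies.
    import numpy as np

    def binarize(W):
        scale = np.abs(W).mean()                  # per-tensor scaling factor
        return np.sign(W) * scale

    def binary_gnn_layer(adj, X, W):
        Wb = binarize(W)
        deg = adj.sum(axis=1, keepdims=True).clip(min=1)
        msgs = (adj @ np.sign(X)) / deg           # mean over binarized neighbor features
        return np.tanh(msgs @ Wb)

    adj = np.array([[0, 1, 1],                    # toy 3-node undirected graph
                    [1, 0, 0],
                    [1, 0, 0]], dtype=float)
    X = np.random.randn(3, 8)                     # node features
    W = np.random.randn(8, 4)                     # full-precision weights, binarized inside
    H = binary_gnn_layer(adj, X, W)               # (3, 4) node embeddings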
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z)
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the accuracy of this information and accepts no responsibility for any consequences of its use.