Pairwise Neural Networks (PairNets) with Low Memory for Fast On-Device
Applications
- URL: http://arxiv.org/abs/2002.04458v1
- Date: Mon, 10 Feb 2020 02:12:59 GMT
- Title: Pairwise Neural Networks (PairNets) with Low Memory for Fast On-Device
Applications
- Authors: Luna M. Zhang
- Abstract summary: A traditional artificial neural network (ANN) is normally trained slowly by a gradient descent algorithm, such as the backpropagation algorithm.
We created a novel wide and shallow 4-layer ANN called "Pairwise Neural Network" ("PairNet") with high-speed non-gradient-descent hyperparameter optimization.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A traditional artificial neural network (ANN) is normally trained slowly by a
gradient descent algorithm, such as the backpropagation algorithm, since a
large number of hyperparameters of the ANN need to be fine-tuned with many
training epochs. Since a large number of hyperparameters of a deep neural
network, such as a convolutional neural network, occupy much memory, a
memory-inefficient deep learning model is not ideal for real-time Internet of
Things (IoT) applications on various devices, such as mobile phones. Thus, it
is necessary to develop fast and memory-efficient Artificial Intelligence of
Things (AIoT) systems for real-time on-device applications. We created a novel
wide and shallow 4-layer ANN called "Pairwise Neural Network" ("PairNet") with
high-speed non-gradient-descent hyperparameter optimization. The PairNet is
trained quickly in a single epoch because its hyperparameters are optimized
once by solving a system of linear equations with the multivariate
least-squares fitting method (a minimal sketch of this idea follows the
abstract). In addition, an n-input space is
partitioned into many n-input data subspaces, and a local PairNet is built in a
local n-input subspace. This divide-and-conquer approach can train the local
PairNet using specific local features to improve model performance. Simulation
results indicate that the three PairNets with incremental learning have smaller
average prediction mean squared errors, and achieve much higher speeds than
traditional ANNs. An important future work is to develop better and faster
non-gradient-descent hyperparameter optimization algorithms to generate
effective, fast, and memory-efficient PairNets with incremental learning on
optimal subspaces for real-time AIoT on-device applications.
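To make the closed-form, non-gradient-descent training idea concrete, below is a minimal Python sketch. It is not the paper's PairNet architecture (the abstract does not spell out the pairwise layer structure); it only illustrates fitting the output weights of a wide, shallow network in one pass with multivariate least squares (np.linalg.lstsq), so no gradient descent or multi-epoch training is needed.

```python
# Minimal sketch (NOT the paper's exact PairNet): a wide, shallow network whose
# last-layer weights are fit in closed form with multivariate least squares,
# so "training" is a single pass with no gradient descent.
import numpy as np

rng = np.random.default_rng(0)

def train_shallow_ls(X, y, width=256):
    """Fix a random hidden layer, then solve the output weights by least squares."""
    n_features = X.shape[1]
    W_hidden = rng.normal(size=(n_features, width))   # untrained hidden weights
    b_hidden = rng.normal(size=width)
    H = np.maximum(X @ W_hidden + b_hidden, 0.0)      # hidden activations (ReLU)
    H = np.hstack([H, np.ones((H.shape[0], 1))])      # bias column for the output layer
    # One-shot fit: solve min ||H w - y||^2 (multivariate least squares).
    w_out, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W_hidden, b_hidden, w_out

def predict(model, X):
    W_hidden, b_hidden, w_out = model
    H = np.maximum(X @ W_hidden + b_hidden, 0.0)
    H = np.hstack([H, np.ones((H.shape[0], 1))])
    return H @ w_out

# Toy regression problem: y = sum(sin(x)) + noise.
X = rng.uniform(-3, 3, size=(1000, 4))
y = np.sin(X).sum(axis=1) + 0.01 * rng.normal(size=1000)
model = train_shallow_ls(X, y)
mse = np.mean((predict(model, X) - y) ** 2)
print(f"training MSE after one closed-form fit: {mse:.4f}")
```

The single lstsq call plays the role of the "system of linear equations" step in the abstract; the random hidden layer and ReLU are illustrative stand-ins, not the PairNet's actual layers.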
Related papers
- Scaling Studies for Efficient Parameter Search and Parallelism for Large
Language Model Pre-training [2.875838666718042]
We focus on parallel and distributed machine learning algorithm development, specifically for optimizing the data processing and pre-training of a set of 5 encoder-decoder LLMs.
We performed a fine-grained study to quantify the relationships among three ML parallelization methods, specifically exploring the stages of Microsoft DeepSpeed ZeRO (Zero Redundancy Optimizer).
arXiv Detail & Related papers (2023-10-09T02:22:00Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Hyperparameter Optimization with Neural Network Pruning [6.193231258199234]
We propose a proxy model for a neural network (N_B) to be used for hyperparameter optimization.
The proposed framework can reduce the amount of time by up to 37%.
arXiv Detail & Related papers (2022-05-18T02:51:47Z) - FPGA-based AI Smart NICs for Scalable Distributed AI Training Systems [62.20308752994373]
We propose a new smart network interface card (NIC) for distributed AI training systems using field-programmable gate arrays (FPGAs).
Our proposed FPGA-based AI smart NIC enhances overall training performance by 1.6x at 6 nodes, with an estimated 2.5x performance improvement at 32 nodes, compared to the baseline system using conventional NICs.
arXiv Detail & Related papers (2022-04-22T21:57:00Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) method, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations.
Based on a latency- and accuracy-aware reward design, the computation adapts well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - Online hyperparameter optimization by real-time recurrent learning [57.01871583756586]
Our framework takes advantage of the analogy between hyperparameter optimization and parameter learning in recurrent neural networks (RNNs).
It adapts a well-studied family of online learning algorithms for RNNs to tune hyperparameters and network parameters simultaneously.
This procedure yields systematically better generalization performance compared to standard methods, at a fraction of wallclock time.
arXiv Detail & Related papers (2021-02-15T19:36:18Z) - NL-CNN: A Resources-Constrained Deep Learning Model based on Nonlinear
Convolution [0.0]
A novel convolutional neural network model, abbreviated NL-CNN, is proposed, in which nonlinear convolution is emulated by a cascade of convolution and nonlinearity layers.
Performance evaluation for several widely known datasets is provided, showing several relevant features.
arXiv Detail & Related papers (2021-01-30T13:38:42Z) - Optimizing Memory Placement using Evolutionary Graph Reinforcement
Learning [56.83172249278467]
We introduce Evolutionary Graph Reinforcement Learning (EGRL), a method designed for large search spaces.
We train and validate our approach directly on the Intel NNP-I chip for inference.
We additionally achieve 28-78% speed-up compared to the native NNP-I compiler on all three workloads.
arXiv Detail & Related papers (2020-07-14T18:50:12Z) - TASO: Time and Space Optimization for Memory-Constrained DNN Inference [5.023660118588569]
Convolutional neural networks (CNNs) are used in many embedded applications, from industrial robotics and automation systems to biometric identification on mobile devices.
We propose an approach for ahead-of-time, domain-specific optimization of CNN models, based on integer linear programming (ILP) for selecting primitive operations to implement convolutional layers.
arXiv Detail & Related papers (2020-05-21T15:08:06Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization where the predictive model is a deep neural network.
Our method requires far fewer communication rounds in theory.
Our experiments on several datasets demonstrate the method's effectiveness and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - PairNets: Novel Fast Shallow Artificial Neural Networks on Partitioned
Subspaces [0.0]
We create a novel shallow 4-layer ANN called "Pairwise Neural Network" ("PairNet").
The value range of each input is partitioned into multiple intervals, and then the n-dimensional input space is partitioned into M n-dimensional subspaces.
M local PairNets are built in the M partitioned local n-dimensional subspaces (a minimal sketch of this partitioning follows this entry).
arXiv Detail & Related papers (2020-01-24T05:23:47Z)
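The sketch below illustrates only the divide-and-conquer partitioning described in this entry: each input's range is split into K intervals, samples are routed to the resulting hyper-rectangular subspaces, and one local model is fit per non-empty subspace. The local model here is a plain linear least-squares fit, a hypothetical stand-in for the local PairNet, whose internals the summary does not specify.

```python
# Minimal sketch of the divide-and-conquer partitioning: split each input
# dimension into K intervals (K**n subspaces), route samples to subspaces,
# and fit a small local model per non-empty subspace. The local linear
# least-squares fit is an illustrative stand-in for a local PairNet.
import numpy as np

rng = np.random.default_rng(1)

def subspace_index(X, lo, hi, K):
    """Map each row of X to the flat index of its hyper-rectangular subspace."""
    # Which of the K intervals each coordinate falls into (0..K-1).
    bins = np.clip(((X - lo) / (hi - lo) * K).astype(int), 0, K - 1)
    # Flatten the n per-dimension bin indices into one subspace id.
    return np.ravel_multi_index(bins.T, dims=(K,) * X.shape[1])

def fit_local_models(X, y, K=3):
    lo, hi = X.min(axis=0), X.max(axis=0)
    ids = subspace_index(X, lo, hi, K)
    models = {}
    for s in np.unique(ids):
        mask = ids == s
        A = np.hstack([X[mask], np.ones((mask.sum(), 1))])  # design matrix with bias
        w, *_ = np.linalg.lstsq(A, y[mask], rcond=None)      # one-shot local fit
        models[s] = w
    return models, lo, hi, K

def predict_local(models, lo, hi, K, X):
    ids = subspace_index(X, lo, hi, K)
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    default = np.mean([w for w in models.values()], axis=0)  # fallback for empty cells
    return np.array([A[i] @ models.get(s, default) for i, s in enumerate(ids)])

# Toy data: 2 inputs, K=3 intervals each -> up to 9 local models.
X = rng.uniform(-1, 1, size=(2000, 2))
y = np.sin(3 * X[:, 0]) * np.cos(3 * X[:, 1])
models, lo, hi, K = fit_local_models(X, y, K=3)
pred = predict_local(models, lo, hi, K, X)
print(f"{len(models)} local models, training MSE = {np.mean((pred - y) ** 2):.4f}")
```

Because each local model only sees samples from its own subspace, it can specialize to local features, which is the motivation for the divide-and-conquer design in both PairNet papers.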