Scalable Neural Tangent Kernel of Recurrent Architectures
- URL: http://arxiv.org/abs/2012.04859v1
- Date: Wed, 9 Dec 2020 04:36:34 GMT
- Title: Scalable Neural Tangent Kernel of Recurrent Architectures
- Authors: Sina Alemohammad, Randall Balestriero, Zichao Wang, Richard Baraniuk
- Abstract summary: Kernels derived from deep neural networks (DNNs) in the infinite-width limit provide high performance in a range of machine learning tasks.
We extend the family of kernels associated with recurrent neural networks (RNNs) to more complex architectures, namely bidirectional RNNs and RNNs with average pooling.
- Score: 8.487185704099923
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kernels derived from deep neural networks (DNNs) in the infinite-width
limit provide not only high performance in a range of machine learning tasks but
also new theoretical insights into DNN training dynamics and generalization. In this
paper, we extend the family of kernels associated with recurrent neural
networks (RNNs), which were previously derived only for simple RNNs, to more
complex architectures, namely bidirectional RNNs and RNNs with average
pooling. We also develop a fast GPU implementation to exploit the full
practical potential of these kernels. While RNNs are typically only applied to time-series data,
we demonstrate that classifiers using RNN-based kernels outperform a range of
baseline methods on 90 non-time-series datasets from the UCI data repository.
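As a rough illustration of how such kernel classifiers are typically used on tabular data (a minimal sketch, not the authors' released GPU code; the rntk_gram function below is a hypothetical placeholder for the actual RNTK computation):

```python
# Minimal sketch: classification with a precomputed Gram matrix, as one would
# do with an RNTK-style kernel on non-time-series UCI data.
import numpy as np
from sklearn.svm import SVC

def rntk_gram(X1, X2):
    # Hypothetical placeholder for the RNTK evaluation; an RBF stand-in is
    # used here only so the sketch runs end to end.
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-0.5 * d2)

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(100, 16)), np.arange(100) % 2
X_test = rng.normal(size=(20, 16))

K_train = rntk_gram(X_train, X_train)   # (n_train, n_train) Gram matrix
K_test = rntk_gram(X_test, X_train)     # (n_test, n_train) cross-kernel

clf = SVC(kernel="precomputed", C=1.0).fit(K_train, y_train)
y_pred = clf.predict(K_test)
```

Only the kernel evaluation changes between architectures (simple, bidirectional, or average-pooled RNNs); the downstream classifier stays the same.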
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN), we reduce the computational time and space complexities in the sequence length from cubic and quadratic, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - Investigating Sparsity in Recurrent Neural Networks [0.0]
This thesis investigates the effects of pruning and of Sparse Recurrent Neural Networks on the performance of RNNs.
We first describe the pruning of RNNs, its impact on the performance of RNNs, and the number of training epochs required to regain accuracy after the pruning is performed.
Next, we continue with the creation and training of Sparse Recurrent Neural Networks and identify the relation between the performance and the graph properties of its underlying arbitrary structure.
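For context, the most common pruning baseline is magnitude pruning: the smallest-magnitude weights are zeroed and the network is then fine-tuned to regain accuracy. A minimal generic sketch (not the thesis's exact procedure):

```python
# Magnitude pruning sketch: zero out the smallest-magnitude entries of a
# recurrent weight matrix at a given sparsity level, then fine-tune.
import numpy as np

def magnitude_prune(W, sparsity=0.9):
    # Keep only the largest (1 - sparsity) fraction of weights by |value|.
    threshold = np.quantile(np.abs(W), sparsity)
    mask = np.abs(W) >= threshold
    return W * mask, mask

W_hh = np.random.randn(128, 128)          # hidden-to-hidden RNN weights
W_pruned, mask = magnitude_prune(W_hh, sparsity=0.9)
print(f"remaining weights: {mask.mean():.1%}")
```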
arXiv Detail & Related papers (2024-07-30T07:24:58Z) - Do Not Train It: A Linear Neural Architecture Search of Graph Neural
Networks [15.823247346294089]
We develop a novel NAS-GNNs method, namely neural architecture coding (NAC).
Our approach leads to state-of-the-art performance, which is up to $200\times$ faster and $18.8\%$ more accurate than the strong baselines.
arXiv Detail & Related papers (2023-05-23T13:44:04Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that these neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z) - Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a
Polynomial Net Study [55.12108376616355]
Prior work on the NTK has focused on typical neural network architectures but is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
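For reference, the kernel regression predictor induced by a kernel $\Theta$ on training data $(X, y)$ takes the standard ridgeless form below (notation ours, not the paper's):
$$\hat{f}(x) = \Theta(x, X)\,\Theta(X, X)^{-1}\,y,$$
so the trained network's predictions in this regime can be read off from the Gram matrix alone.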
arXiv Detail & Related papers (2022-09-16T06:36:06Z) - Comparative Analysis of Interval Reachability for Robust Implicit and
Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
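For context, the interval bound propagation baseline mentioned above pushes an axis-aligned box through the network layer by layer; a minimal generic sketch (not the paper's reachability method) for one affine layer followed by ReLU:

```python
# Interval bound propagation through y = W x + b followed by ReLU:
# represent the input set as a box [lower, upper] and propagate its
# center/radius, using |W| to bound how much the box can widen.
import numpy as np

def affine_interval(W, b, lower, upper):
    center = (lower + upper) / 2.0
    radius = (upper - lower) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius
    return new_center - new_radius, new_center + new_radius

def relu_interval(lower, upper):
    # ReLU is monotone, so it maps box bounds to box bounds directly.
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)

W, b = np.random.randn(3, 4), np.zeros(3)
l0, u0 = -0.1 * np.ones(4), 0.1 * np.ones(4)   # small box around the origin
l1, u1 = relu_interval(*affine_interval(W, b, l0, u0))
```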
arXiv Detail & Related papers (2022-04-01T03:31:27Z) - Sub-bit Neural Networks: Learning to Compress and Accelerate Binary
Neural Networks [72.81092567651395]
Sub-bit Neural Networks (SNNs) are a new type of binary quantization design tailored to compress and accelerate binary neural networks (BNNs).
SNNs are trained with a kernel-aware optimization framework, which exploits binary quantization in the fine-grained convolutional kernel space.
Experiments on visual recognition benchmarks and hardware deployment on FPGA validate the great potential of SNNs.
arXiv Detail & Related papers (2021-10-18T11:30:29Z) - Going Deeper With Directly-Trained Larger Spiking Neural Networks [20.40894876501739]
Spiking neural networks (SNNs) are promising for bio-plausible information coding and event-driven signal processing.
However, the unique working mode of SNNs makes them more difficult to train than traditional networks.
We propose a threshold-dependent batch normalization (tdBN) method based on the emerging spatio-temporal backpropagation.
arXiv Detail & Related papers (2020-10-29T07:15:52Z) - The Recurrent Neural Tangent Kernel [11.591070761599328]
We introduce and study the Recurrent Neural Tangent Kernel (RNTK), which provides new insights into the behavior of overparametrized RNNs.
Experiments on synthetic data and 56 real-world datasets demonstrate that the RNTK offers significant performance gains over other kernels.
arXiv Detail & Related papers (2020-06-18T02:59:21Z)