A Quasilinear Algorithm for Computing Higher-Order Derivatives of Deep Feed-Forward Neural Networks
- URL: http://arxiv.org/abs/2412.09752v1
- Date: Thu, 12 Dec 2024 22:57:28 GMT
- Title: A Quasilinear Algorithm for Computing Higher-Order Derivatives of Deep Feed-Forward Neural Networks
- Authors: Kyle R. Chickering
- Abstract summary: $n$-TangentProp computes the exact derivative $d^n/dx^n f(x)$ in quasilinear rather than exponential time.
We demonstrate that our method is particularly beneficial in the context of physics-informed neural networks.
- Abstract: The use of neural networks for solving differential equations is practically difficult due to the exponentially increasing runtime of autodifferentiation when computing high-order derivatives. We propose $n$-TangentProp, the natural extension of the TangentProp formalism \cite{simard1991tangent} to arbitrarily many derivatives. $n$-TangentProp computes the exact derivative $d^n/dx^n f(x)$ in quasilinear, instead of exponential, time for a densely connected, feed-forward neural network $f$ with a smooth, parameter-free activation function. We validate our algorithm empirically across a range of depths, widths, and numbers of derivatives. We demonstrate that our method is particularly beneficial in the context of physics-informed neural networks, where $n$-TangentProp allows for significantly faster training times than previous methods and has favorable scaling with respect to both model size and loss-function complexity as measured by the number of required derivatives. The code for this paper can be found at https://github.com/kyrochi/n_tangentprop.
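The exponential blow-up the abstract refers to comes from nesting reverse-mode autodiff $n$ times; Taylor-mode propagation avoids it by pushing a truncated Taylor series through the network in a single forward pass. The sketch below illustrates that core idea for a scalar-input tanh MLP. It is a hedged illustration only: it runs in polynomial (not the paper's quasilinear) time, and all function and variable names are illustrative rather than taken from the paper's code.

```python
import math
import numpy as np

def tanh_series(h):
    """Taylor coefficients of tanh(h(t)) from those of h(t), via the
    ODE u' = (1 - u^2) h', applied elementwise over the layer width."""
    n = len(h) - 1
    u = [np.tanh(h[0])]
    for k in range(n):
        acc = np.zeros_like(h[0])
        for j in range(k + 1):
            u2_j = sum(u[i] * u[j - i] for i in range(j + 1))  # (u^2)_j
            acc = acc + ((1.0 if j == 0 else 0.0) - u2_j) * (k - j + 1) * h[k - j + 1]
        u.append(acc / (k + 1))  # (k+1) u_{k+1} = [(1 - u^2) h']_k
    return u

def mlp_taylor(weights, biases, x0, n):
    """Propagate a degree-n Taylor expansion of f(x0 + t) through a
    scalar-input tanh MLP; the final layer is linear (no activation)."""
    # input series x0 + t has coefficients [x0, 1, 0, ..., 0]
    coeffs = [np.array([x0]), np.array([1.0])] + [np.array([0.0]) for _ in range(n - 1)]
    for layer, (W, b) in enumerate(zip(weights, biases)):
        coeffs = [W @ c for c in coeffs]   # linear maps act coefficient-wise
        coeffs[0] = coeffs[0] + b          # bias only shifts the 0th coefficient
        if layer < len(weights) - 1:       # hidden layers use tanh
            coeffs = tanh_series(coeffs)
    return coeffs                          # f^(k)(x0) = k! * coeffs[k]

# Sanity check: two unit-weight layers give f(x) = tanh(x), whose
# Maclaurin series is t - t^3/3 + 2 t^5/15 + ...
W = [np.array([[1.0]]), np.array([[1.0]])]
b = [np.array([0.0]), np.array([0.0])]
c = mlp_taylor(W, b, x0=0.0, n=5)
derivs = [math.factorial(k) * ck[0] for k, ck in enumerate(c)]
print(derivs[3], derivs[5])  # ≈ -2.0 and ≈ 16.0 (f'''(0) = -2, f^(5)(0) = 16)
```

Note that every coefficient is computed once per layer, so the cost grows polynomially in $n$ instead of doubling with each nested autodiff call; the paper's algorithm refines this further to quasilinear time.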
Related papers
- Stochastic Taylor Derivative Estimator: Efficient amortization for arbitrary differential operators [29.063441432499776]
We show how to efficiently perform arbitrary contraction of the derivative tensor of arbitrary order for multivariate functions.
When applied to Physics-Informed Neural Networks (PINNs), our method provides >1000$\times$ speed-up and 30$\times$ memory reduction over randomization with first-order AD.
arXiv Detail & Related papers (2024-11-27T09:37:33Z)
- Neural Networks and (Virtual) Extended Formulations [5.762677915745415]
We prove lower bounds on the size of neural networks that optimize over $P$.
We show that $\mathrm{xc}(P)$ is a lower bound on the size of any monotone or input-convex neural network that solves the linear optimization problem over $P$.
arXiv Detail & Related papers (2024-11-05T11:12:11Z)
- Adapting Newton's Method to Neural Networks through a Summary of Higher-Order Derivatives [0.0]
We focus on the exact and explicit computation of projections of the Hessian and higher-order derivatives on well-chosen subspaces.
We propose an optimization method exploiting tensors at order 2 and 3 with several interesting properties.
arXiv Detail & Related papers (2023-12-06T20:24:05Z)
- HOPE: High-order Polynomial Expansion of Black-box Neural Networks [7.156504968033132]
We introduce HOPE (High-order Polynomial Expansion), a method for expanding a network into a high-order Taylor polynomial at a reference input.
Numerical analysis confirms the high accuracy, low computational complexity, and good convergence of the proposed method.
We demonstrate HOPE's wide applications built on deep learning, including function discovery, fast inference, and feature selection.
arXiv Detail & Related papers (2023-07-17T01:46:15Z)
- Scaling Gaussian Processes with Derivative Information Using Variational Inference [17.746842802181256]
We introduce methods to achieve fully scalable Gaussian process regression with derivatives using variational inference.
We demonstrate the full scalability of our approach on a variety of tasks, ranging from a high dimensional stellarator fusion regression task to training graph convolutional neural networks on Pubmed.
arXiv Detail & Related papers (2021-07-08T18:23:59Z)
- Higher-order Derivatives of Weighted Finite-state Machines [68.43084108204741]
This work examines the computation of higher-order derivatives with respect to the normalization constant for weighted finite-state machines.
We provide a general algorithm for evaluating derivatives of all orders, which has not been previously described in the literature.
Our algorithm is significantly faster than prior algorithms.
arXiv Detail & Related papers (2021-06-01T19:51:55Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
- Deep Polynomial Neural Networks [77.70761658507507]
$\Pi$Nets are a new class of function approximators based on polynomial expansions.
$\Pi$Nets produce state-of-the-art results in three challenging tasks: image generation, face verification, and 3D mesh representation learning.
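To make "polynomial expansion" concrete, here is a minimal sketch of a degree-2 polynomial block in the spirit of such networks: a Hadamard product of two linear maps contributes second-order interactions on top of a first-order term. This is an illustration only — the $\Pi$Nets paper's actual CCP/NCP tensor factorizations are more elaborate, and the names `U1`, `U2`, `C` are assumptions, not its API.

```python
import numpy as np

rng = np.random.default_rng(0)

def poly_block(x, U1, U2, C):
    """Illustrative degree-2 polynomial block: the elementwise product
    (U1 x) * (U2 x) carries quadratic interactions, U1 x the linear part;
    C mixes the hidden features down to the output."""
    z = (U1 @ x) * (U2 @ x) + U1 @ x
    return C @ z

d, h = 4, 8  # input dimension, hidden width (arbitrary for the demo)
U1 = rng.normal(size=(h, d))
U2 = rng.normal(size=(h, d))
C = rng.normal(size=(1, h))
x = rng.normal(size=d)

# Because the block is exactly quadratic + linear with no bias, its
# homogeneous parts can be separated by evaluating at x and -x:
fx, fmx = poly_block(x, U1, U2, C), poly_block(-x, U1, U2, C)
quad, lin = (fx + fmx) / 2, (fx - fmx) / 2
# and the polynomial structure predicts f(2x) = 4*quad + 2*lin exactly.
print(np.allclose(poly_block(2 * x, U1, U2, C), 4 * quad + 2 * lin))
```

The scaling check at the end is what distinguishes a genuine polynomial expansion from a generic nonlinearity: each homogeneous degree scales with its own power of the input.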
arXiv Detail & Related papers (2020-06-20T16:23:32Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm that our multipole graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.