Jacobian-Enhanced Neural Networks
- URL: http://arxiv.org/abs/2406.09132v2
- Date: Tue, 18 Jun 2024 02:15:18 GMT
- Title: Jacobian-Enhanced Neural Networks
- Authors: Steven H. Berguin
- Abstract summary: Jacobian-Enhanced Neural Networks (JENN) are densely connected multi-layer perceptrons.
JENN's main benefit is better accuracy with fewer training points compared to standard neural networks.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Jacobian-Enhanced Neural Networks (JENN) are densely connected multi-layer perceptrons, whose training process is modified to predict partial derivatives accurately. Their main benefit is better accuracy with fewer training points compared to standard neural networks. These attributes are particularly desirable in the field of computer-aided design, where there is often the need to replace computationally expensive, physics-based models with fast running approximations, known as surrogate models or meta-models. Since a surrogate emulates the original model accurately in near-real time, it yields a speed benefit that can be used to carry out orders of magnitude more function calls quickly. However, in the special case of gradient-enhanced methods, there is the additional value proposition that partial derivatives are accurate, which is a critical property for one important use-case: surrogate-based optimization. This work derives the complete theory and exemplifies its superiority over standard neural nets for surrogate-based optimization.
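The abstract describes the core idea at a high level: train the network against both the response values and their partial derivatives. Below is a minimal sketch of such a gradient-enhanced training objective (not the author's JENN implementation; the toy data, architecture, and weighting factor are illustrative assumptions): the standard least-squares loss is augmented with a term penalizing the error in the network's predicted partials, obtained by differentiating the network with respect to its inputs.

```python
# Minimal sketch of a Jacobian-enhanced training loss (illustrative only, not
# the paper's implementation). Assumes training data pairing inputs x with
# responses y AND partial derivatives dy/dx, e.g. from an adjoint-capable solver.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for an expensive physics model: y = x^2, dy/dx = 2x.
x = torch.linspace(-1.0, 1.0, 8).reshape(-1, 1)
y = x**2
dydx = 2.0 * x

model = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
beta = 1.0  # weight on the Jacobian term (assumed hyperparameter)

for step in range(2000):
    opt.zero_grad()
    x_req = x.clone().requires_grad_(True)
    y_hat = model(x_req)
    # Predicted partial derivatives via automatic differentiation.
    dydx_hat = torch.autograd.grad(y_hat.sum(), x_req, create_graph=True)[0]
    # Standard least-squares term plus a gradient-enhancement term.
    loss = ((y_hat - y) ** 2).mean() + beta * ((dydx_hat - dydx) ** 2).mean()
    loss.backward()
    opt.step()
```

Because each training point also carries derivative information, the fit is constrained more tightly per sample, which is the source of the claimed accuracy advantage with fewer training points and of the accurate partials needed for surrogate-based optimization.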
Related papers
- Understanding Optimization in Deep Learning with Central Flows [53.66160508990508]
We show that an optimizer's implicit behavior can be explicitly captured by a "central flow": a differential equation.
We show that these flows can empirically predict long-term optimization trajectories of generic neural networks.
arXiv Detail & Related papers (2024-10-31T17:58:13Z) - Guaranteed Approximation Bounds for Mixed-Precision Neural Operators [83.64404557466528]
We build on the intuition that neural operator learning inherently induces an approximation error.
We show that our approach reduces GPU memory usage by up to 50% and improves throughput by 58% with little or no reduction in accuracy.
arXiv Detail & Related papers (2023-07-27T17:42:06Z) - Physics Informed Piecewise Linear Neural Networks for Process Optimization [0.0]
We propose augmenting piecewise-linear neural network models with physics-informed knowledge for optimization problems that embed neural network models.
In all cases, the optima obtained with physics-informed trained neural networks are closer to the global optimum.
arXiv Detail & Related papers (2023-02-02T10:14:54Z) - Training Integer-Only Deep Recurrent Neural Networks [3.1829446824051195]
We present a quantization-aware training method for obtaining a highly accurate integer-only recurrent neural network (iRNN).
Our approach supports layer normalization, attention, and an adaptive piecewise linear (PWL) approximation of activation functions.
The proposed method enables RNN-based language models to run on edge devices with a $2\times$ improvement in runtime.
arXiv Detail & Related papers (2022-12-22T15:22:36Z) - A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
We show that simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z) - Precision Machine Learning [5.15188009671301]
We compare various function approximation methods and study how they scale with increasing parameters and data.
We find that neural networks can often outperform classical approximation methods on high-dimensional examples.
We develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.
arXiv Detail & Related papers (2022-10-24T17:58:30Z) - Towards Theoretically Inspired Neural Initialization Optimization [66.04735385415427]
We propose a differentiable quantity, named GradCosine, with theoretical insights to evaluate the initial state of a neural network.
We show that both the training and test performance of a network can be improved by maximizing GradCosine under a norm constraint.
Generalized from the sample-wise analysis into the real batch setting, NIO is able to automatically look for a better initialization with negligible cost.
arXiv Detail & Related papers (2022-10-12T06:49:16Z) - DEBOSH: Deep Bayesian Shape Optimization [48.80431740983095]
We propose a novel uncertainty-based method tailored to shape optimization.
It enables effective Bayesian optimization (BO) and increases the quality of the resulting shapes beyond that of state-of-the-art approaches.
arXiv Detail & Related papers (2021-09-28T11:01:42Z) - Enhanced data efficiency using deep neural networks and Gaussian processes for aerodynamic design optimization [0.0]
Adjoint-based optimization methods are attractive for aerodynamic shape design.
However, they can become prohibitively expensive when multiple optimization problems must be solved.
We propose a machine learning enabled, surrogate-based framework that replaces the expensive adjoint solver.
arXiv Detail & Related papers (2020-08-15T15:09:21Z) - An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices [58.62801151916888]
We introduce a new sparsity dimension, namely pattern-based sparsity, which comprises pattern and connectivity sparsity and is both highly accurate and hardware friendly.
Our pattern-based sparsity approach naturally fits into compiler optimization for highly efficient DNN execution on mobile platforms.
arXiv Detail & Related papers (2020-01-20T16:17:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.