HiPreNets: High-Precision Neural Networks through Progressive Training
- URL: http://arxiv.org/abs/2506.15064v2
- Date: Tue, 29 Jul 2025 15:25:46 GMT
- Title: HiPreNets: High-Precision Neural Networks through Progressive Training
- Authors: Ethan Mulle, Wei Kang, Qi Gong
- Abstract summary: We present a progressive framework for training and tuning high-precision neural networks (HiPreNets). Our approach refines a previously explored staged training technique for neural networks. We discuss how to take advantage of the structure of the residuals to guide the choice of loss function and the number of parameters to use.
- Score: 1.5429976366871665
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are powerful tools for solving nonlinear problems in science and engineering, but training highly accurate models becomes challenging as problem complexity increases. Non-convex optimization and numerous hyperparameters to tune make performance improvement difficult, and traditional approaches often prioritize minimizing mean squared error (MSE) while overlooking $L^{\infty}$ error, which is the critical focus in many applications. To address these challenges, we present a progressive framework for training and tuning high-precision neural networks (HiPreNets). Our approach refines a previously explored staged training technique for neural networks that improves an existing fully connected neural network by sequentially learning its prediction residuals using additional networks, leading to improved overall accuracy. We discuss how to take advantage of the structure of the residuals to guide the choice of loss function, number of parameters to use, and ways to introduce adaptive data sampling techniques. We validate our framework's effectiveness through several benchmark problems.
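The staged refinement described in the abstract can be summarized as: fit a base fully connected network to the data, fit an additional network to the remaining prediction residuals, add it to the ensemble, and repeat until the residuals are small. The sketch below is a minimal illustration of that idea under simple assumptions (PyTorch, a 1-D sine target, plain MSE at every stage, a fixed number of stages and network sizes); it is not the authors' implementation and omits the paper's residual-guided loss selection and adaptive data sampling.

```python
# Minimal sketch of staged residual training (illustrative; architecture,
# stage count, and training loop are assumptions, not the HiPreNets code).
import torch
import torch.nn as nn

def make_mlp(width=64, depth=3):
    layers, d_in = [], 1
    for _ in range(depth):
        layers += [nn.Linear(d_in, width), nn.Tanh()]
        d_in = width
    layers.append(nn.Linear(d_in, 1))
    return nn.Sequential(*layers)

def fit(net, x, target, epochs=2000, lr=1e-3):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), target)
        loss.backward()
        opt.step()
    return net

# Toy target function and training data.
x = torch.linspace(-1.0, 1.0, 512).unsqueeze(1)
y = torch.sin(4.0 * torch.pi * x)

stages, prediction = [], torch.zeros_like(y)
for k in range(3):                      # number of residual stages (assumed)
    residual = y - prediction           # what the current ensemble still misses
    net = fit(make_mlp(), x, residual)  # new network learns only the residual
    stages.append(net)
    with torch.no_grad():
        prediction = prediction + net(x)
    err_inf = (y - prediction).abs().max().item()
    print(f"stage {k}: L-infinity error = {err_inf:.3e}")
```

In this setup each stage only has to model the (typically smaller and more structured) error left by the previous stages, which is what makes residual-aware choices of loss function and sampling, as discussed in the paper, possible.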
Related papers
- Optimization over Trained (and Sparse) Neural Networks: A Surrogate within a Surrogate [5.718611740982005]
We use network pruning to produce a surrogate of the neural network itself. We show that a pruned network with bad classification performance can still be a good - and more efficient - surrogate.
arXiv Detail & Related papers (2025-05-04T04:49:19Z) - Improving Generalization of Deep Neural Networks by Optimum Shifting [33.092571599896814]
We propose a novel method called optimum shifting, which changes the parameters of a neural network from a sharp minimum to a flatter one.
Our method is based on the observation that when the input and output of a neural network are fixed, the matrix multiplications within the network can be treated as systems of under-determined linear equations.
arXiv Detail & Related papers (2024-05-23T02:31:55Z) - Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings can be easily changed by better training networks in benchmarks.
arXiv Detail & Related papers (2024-02-27T11:52:49Z) - VNN: Verification-Friendly Neural Networks with Hard Robustness Guarantees [3.208888890455612]
We propose a novel framework to generate Verification-Friendly Neural Networks (VNNs).
We present a post-training optimization framework to achieve a balance between preserving prediction performance and verification-friendliness.
arXiv Detail & Related papers (2023-12-15T12:39:27Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Precision Machine Learning [5.15188009671301]
We compare various function approximation methods and study how they scale with increasing parameters and data.
We find that neural networks can often outperform classical approximation methods on high-dimensional examples.
We develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.
arXiv Detail & Related papers (2022-10-24T17:58:30Z) - Imbedding Deep Neural Networks [0.0]
Continuous depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems.
We propose a new approach which explicates the network's depth as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems.
arXiv Detail & Related papers (2022-01-31T22:00:41Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Efficient and Sparse Neural Networks by Pruning Weights in a
Multiobjective Learning Approach [0.0]
We propose a multiobjective perspective on the training of neural networks by treating its prediction accuracy and the network complexity as two individual objective functions.
Preliminary numerical results on exemplary convolutional neural networks confirm that large reductions in the complexity of neural networks with negligible loss of accuracy are possible.
arXiv Detail & Related papers (2020-08-31T13:28:03Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with the nonconvexity of the underlying optimization renders parameter learning susceptible to initialization issues.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)