A Dual-Dimer Method for Training Physics-Constrained Neural Networks
with Minimax Architecture
- URL: http://arxiv.org/abs/2005.00615v2
- Date: Fri, 1 Jan 2021 19:25:43 GMT
- Title: A Dual-Dimer Method for Training Physics-Constrained Neural Networks
with Minimax Architecture
- Authors: Dehao Liu, Yan Wang
- Abstract summary: A physics-constrained neural network with a minimax architecture (PCNN-MM) is proposed, and its training is formulated as a search for high-order saddle points of the objective function.
A novel saddle point search algorithm called the Dual-Dimer method is used to locate these saddle points.
The convergence of PCNN-MMs is faster than that of traditional PCNNs.
- Score: 6.245537312562826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data sparsity is a common issue to train machine learning tools such as
neural networks for engineering and scientific applications, where experiments
and simulations are expensive. Recently, physics-constrained neural networks
(PCNNs) were developed to reduce the required amount of training data. However,
the weights of different losses from data and physical constraints are adjusted
empirically in PCNNs. In this paper, a new physics-constrained neural network
with the minimax architecture (PCNN-MM) is proposed so that the weights of
different losses can be adjusted systematically. The training of the PCNN-MM is
a search for the high-order saddle points of the objective function. A novel
saddle point search algorithm called Dual-Dimer method is developed. It is
demonstrated that the Dual-Dimer method is computationally more efficient than
the gradient descent ascent method for nonconvex-nonconcave functions and
provides additional eigenvalue information to verify search results. A heat
transfer example also shows that the convergence of PCNN-MMs is faster than
that of traditional PCNNs.
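As a concrete, hypothetical illustration of the minimax formulation, the sketch below trains a toy network on a weighted sum of a data loss and a heat-equation-style residual loss, minimizing over the network parameters while maximizing over the loss weights using plain gradient descent-ascent, i.e. the baseline the abstract compares against rather than the Dual-Dimer method itself. The network size, data, diffusivity, and learning rates are arbitrary placeholders.

```python
# Minimax PCNN training sketch (NOT the paper's Dual-Dimer implementation):
# descent on the network parameters, ascent on the loss weights.
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(2, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

# Hypothetical training data (x, t) -> u and collocation points for the physics term.
x_data = torch.rand(64, 2)
u_data = torch.rand(64, 1)
x_coll = torch.rand(256, 2, requires_grad=True)

# Trainable loss weights: the objective is minimized over theta and maximized
# over the weights, so the harder-to-satisfy term automatically gains emphasis.
log_w = torch.zeros(2, requires_grad=True)             # weights for [data, physics]
opt_min = torch.optim.SGD(net.parameters(), lr=1e-3)   # descent on network parameters
lr_max = 1e-3                                          # ascent rate for the loss weights

def heat_residual(xt, alpha=0.1):
    """Residual of a toy 1D heat equation u_t - alpha * u_xx = 0 (illustrative only)."""
    u = net(xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, 0:1], grads[:, 1:2]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
    return u_t - alpha * u_xx

for step in range(1000):
    w = torch.softmax(log_w, dim=0)                    # positive weights summing to 1
    loss_data = torch.mean((net(x_data) - u_data) ** 2)
    loss_phys = torch.mean(heat_residual(x_coll) ** 2)
    objective = w[0] * loss_data + w[1] * loss_phys

    opt_min.zero_grad()
    objective.backward()
    opt_min.step()                                     # gradient descent in theta
    with torch.no_grad():
        log_w += lr_max * log_w.grad                   # gradient ascent in the loss weights
    log_w.grad = None
```

In the paper, this descent-ascent step is replaced by the Dual-Dimer saddle-point search, which the abstract reports to be more efficient for nonconvex-nonconcave objectives and to provide eigenvalue information for verifying the located saddle points.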
Related papers
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z) - Scalable Mechanistic Neural Networks for Differential Equations and Machine Learning [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
We reduce the computational time and space complexities with respect to the sequence length from cubic and quadratic, respectively, to linear.
Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - RAMP-Net: A Robust Adaptive MPC for Quadrotors via Physics-informed
Neural Network [6.309365332210523]
We propose a Robust Adaptive MPC framework via PINNs (RAMP-Net), which uses a neural network trained partly from simple ODEs and partly from data.
We report 7.8% to 43.2% and 8.04% to 61.5% reduction in tracking errors for speeds ranging from 0.5 to 1.75 m/s compared to two SOTA regression based MPC methods.
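A minimal sketch of the "partly from simple ODEs and partly from data" idea (not the RAMP-Net code; the ODE, data, and hyperparameters below are placeholders): the loss sums the residual of an assumed ODE dx/dt = -k*x at collocation points and a fit to measured points.

```python
# Mixed physics/data loss sketch: ODE residual term + data-fitting term.
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 1))
k = 0.5                                            # assumed ODE coefficient
t_data = torch.linspace(0, 1, 20).unsqueeze(1)     # hypothetical measurements
x_data = torch.exp(-k * t_data) + 0.01 * torch.randn_like(t_data)
t_coll = torch.rand(100, 1, requires_grad=True)    # collocation points for the ODE term

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    x_pred = net(t_coll)
    dxdt = torch.autograd.grad(x_pred.sum(), t_coll, create_graph=True)[0]
    loss_ode = torch.mean((dxdt + k * x_pred) ** 2)      # "simple ODE" part of the loss
    loss_fit = torch.mean((net(t_data) - x_data) ** 2)   # data-driven part of the loss
    loss = loss_ode + loss_fit
    opt.zero_grad()
    loss.backward()
    opt.step()
```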
arXiv Detail & Related papers (2022-09-19T16:11:51Z) - Low-Energy Convolutional Neural Networks (CNNs) using Hadamard Method [0.0]
Convolutional neural networks (CNNs) are a potential approach for object recognition and detection.
A new approach based on the Hadamard transformation as an alternative to the convolution operation is demonstrated.
The method is helpful for other computer vision tasks when the kernel size is smaller than the input image size.
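A rough numerical sketch of filtering in the Hadamard domain (a stand-in for the paper's scheme, not its implementation): a fast Walsh-Hadamard transform, which uses only additions and subtractions, followed by an element-wise product with learned coefficients in place of a multiply-heavy spatial convolution.

```python
# Walsh-Hadamard filtering sketch: transform, element-wise product, inverse transform.
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform; len(a) must be a power of two."""
    a = a.copy().astype(float)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

x = np.random.rand(8)                 # toy input (e.g. one image row)
w = np.random.rand(8)                 # hypothetical learned per-coefficient weights
y = fwht(fwht(x) * w) / len(x)        # element-wise product in the Hadamard domain;
                                      # the unnormalized FWHT is self-inverse up to 1/N
```

When the kernel is much smaller than the input, a transform-domain product of this kind needs far fewer multiplications than sliding a kernel across the image, which is roughly the energy argument made above.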
arXiv Detail & Related papers (2022-09-06T21:36:57Z) - Lost Vibration Test Data Recovery Using Convolutional Neural Network: A
Case Study [0.0]
This paper proposes a CNN-based approach to recover lost vibration test data, using the Alamosa Canyon Bridge as a real-world case study.
Three different CNN models were considered to predict one and two malfunctioned sensors.
The accuracy of the model was increased by adding a convolutional layer.
arXiv Detail & Related papers (2022-04-11T23:24:03Z) - Training Feedback Spiking Neural Networks by Implicit Differentiation on
the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z) - A quantum algorithm for training wide and deep classical neural networks [72.2614468437919]
We show that conditions amenable to classical trainability via gradient descent coincide with those necessary for efficiently solving quantum linear systems.
We numerically demonstrate that the MNIST image dataset satisfies such conditions.
We provide empirical evidence for $O(\log n)$ training of a convolutional neural network with pooling.
arXiv Detail & Related papers (2021-07-19T23:41:03Z) - Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments.
In this paper, we are the first to study training from scratch an N:M fine-grained structured sparse network.
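For reference, the N:M pattern can be sketched as a simple masking step (illustrative only; the paper's contribution is training such networks from scratch, not this post-hoc mask): within every group of M consecutive weights, only the N largest-magnitude entries are kept.

```python
# N:M structured sparsity mask sketch (e.g. 2:4).
import numpy as np

def nm_mask(weights, n=2, m=4):
    """Return a 0/1 mask keeping the n largest-magnitude weights in each group of m."""
    flat = weights.reshape(-1, m)                       # groups of m consecutive weights
    keep = np.argsort(-np.abs(flat), axis=1)[:, :n]     # indices of the n largest per group
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(weights.shape)

w = np.random.randn(4, 8)            # toy weight matrix (row length divisible by m)
w_sparse = w * nm_mask(w, 2, 4)      # 2:4 structured-sparse weights
```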
arXiv Detail & Related papers (2021-02-08T05:55:47Z) - A Meta-Learning Approach to the Optimal Power Flow Problem Under
Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z) - Training Deep Neural Networks with Constrained Learning Parameters [4.917317902787792]
A significant portion of deep learning tasks would run on edge computing systems.
We propose the Combinatorial Neural Network Training Algorithm (CoNNTrA).
CoNNTrA trains deep learning models with ternary learning parameters on the MNIST, Iris and ImageNet data sets.
Our results indicate that CoNNTrA models use 32x less memory and have errors at par with the Backpropagation models.
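A minimal sketch of what "ternary learning parameters" means (simple thresholding for illustration; CoNNTrA itself searches over the ternary values combinatorially rather than rounding them):

```python
# Ternary weight sketch: constrain each parameter to {-1, 0, +1}.
import numpy as np

def ternarize(w, threshold=0.05):
    """Map real-valued weights to {-1, 0, +1}; small-magnitude weights become 0."""
    return np.sign(w) * (np.abs(w) > threshold)

w = np.random.randn(3, 4) * 0.1
w_ternary = ternarize(w)        # each weight now needs only a couple of bits instead of 32
```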
arXiv Detail & Related papers (2020-09-01T16:20:11Z) - Multi-fidelity Neural Architecture Search with Knowledge Distillation [69.09782590880367]
We propose a Bayesian multi-fidelity method for neural architecture search: MF-KD.
Knowledge distillation adds to a loss function a term forcing a network to mimic some teacher network.
We show that training for a few epochs with such a modified loss function leads to a better selection of neural architectures than training for a few epochs with a logistic loss.
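The distillation term described above can be sketched as follows (hypothetical temperature and weighting; this is the standard soft-target formulation, not the MF-KD search procedure itself):

```python
# Knowledge-distillation loss sketch: hard-label loss plus a teacher-mimicking term.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Cross-entropy on labels plus a KL term pulling the student toward the teacher."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    return alpha * hard + (1.0 - alpha) * soft

student_logits = torch.randn(8, 10)           # toy batch of 8 samples, 10 classes
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student_logits, teacher_logits, labels)
```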
arXiv Detail & Related papers (2020-06-15T12:32:38Z) - Transfer learning based multi-fidelity physics informed deep neural
network [0.0]
The governing differential equation is either not known or known in an approximate sense.
This paper presents a novel multi-fidelity physics informed deep neural network (MF-PIDNN).
MF-PIDNN blends physics informed and data-driven deep learning techniques by using the concept of transfer learning.
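A minimal sketch of that transfer-learning recipe (illustrative functions and sizes, not the MF-PIDNN implementation): pretrain on the approximate physics (low fidelity), then freeze the early layers and fine-tune only the last layer on a few high-fidelity points.

```python
# Two-stage transfer-learning sketch: low-fidelity pretraining, high-fidelity fine-tuning.
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))

# Stage 1: fit the approximate (low-fidelity) model, here a placeholder function.
x_lf = torch.linspace(0, 1, 200).unsqueeze(1)
y_lf = torch.sin(6.0 * x_lf)                             # assumed low-fidelity response
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    loss = torch.mean((net(x_lf) - y_lf) ** 2)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: freeze all but the last layer, fine-tune on sparse high-fidelity data.
for p in list(net.parameters())[:-2]:                    # last layer's weight and bias stay trainable
    p.requires_grad_(False)
x_hf = torch.tensor([[0.1], [0.5], [0.9]])
y_hf = torch.sin(6.0 * x_hf) + 0.2 * x_hf                # assumed high-fidelity response
opt_ft = torch.optim.Adam([p for p in net.parameters() if p.requires_grad], lr=1e-3)
for _ in range(500):
    loss = torch.mean((net(x_hf) - y_hf) ** 2)
    opt_ft.zero_grad(); loss.backward(); opt_ft.step()
```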
arXiv Detail & Related papers (2020-05-19T13:57:48Z) - One-step regression and classification with crosspoint resistive memory
arrays [62.997667081978825]
High speed, low energy computing machines are in demand to enable real-time artificial intelligence at the edge.
One-step learning is demonstrated in simulations of Boston house-price prediction and of training a two-layer neural network for MNIST digit recognition.
Results are all obtained in one computational step, thanks to the physical, parallel, and analog computing within the crosspoint array.
arXiv Detail & Related papers (2020-05-05T08:00:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.