Less Emphasis on Difficult Layer Regions: Curriculum Learning for
Singularly Perturbed Convection-Diffusion-Reaction Problems
- URL: http://arxiv.org/abs/2210.12685v2
- Date: Fri, 24 Mar 2023 10:09:01 GMT
- Title: Less Emphasis on Difficult Layer Regions: Curriculum Learning for
Singularly Perturbed Convection-Diffusion-Reaction Problems
- Authors: Yufeng Wang, Cong Xu, Min Yang, Jin Zhang
- Abstract summary: We show that learning multi-scale fields simultaneously makes the network unable to advance its training and causes it to get stuck in poor local minima.
We propose a novel curriculum learning method that encourages neural networks to prioritize learning on easier non-layer regions while downplaying learning on harder layer regions.
- Score: 21.615494601195472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although Physics-Informed Neural Networks (PINNs) have been successfully
applied in a wide variety of science and engineering fields, they can fail to
accurately predict the underlying solution even in slightly challenging
convection-diffusion-reaction problems. In this paper, we investigate the
reason for this failure from a domain-distribution perspective and identify
that learning multi-scale fields simultaneously prevents the network from
advancing its training and leaves it stuck in poor local minima. We show that
the common practice of sampling more collocation points in high-loss layer
regions hardly helps the optimization and may even worsen the results. These
findings motivate the development of a novel curriculum learning method that
encourages neural networks to prioritize learning on easier non-layer regions
while downplaying learning on harder layer regions. The proposed method helps
PINNs automatically adjust the learning emphasis and thereby facilitates the
optimization procedure. Numerical results on typical benchmark equations show
that the proposed curriculum learning approach mitigates the failure modes of
PINNs and can produce accurate results for very sharp boundary and interior
layers. Our work reveals that, for equations whose solutions have large
differences in scale, paying less attention to high-loss regions can be an
effective strategy for learning them accurately.
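The abstract does not spell out how the learning emphasis is adjusted, so the following is only a minimal sketch of the general idea in PyTorch: weight the PINN residual loss so that collocation points with large residuals (the sharp layer regions) contribute less, letting the easier non-layer regions drive early training. The exponential weighting rule, the temperature `tau`, and the function name `curriculum_weighted_loss` are illustrative assumptions, not the authors' formulation.

```python
# Hypothetical sketch (not the paper's exact scheme): a PINN residual loss that
# down-weights collocation points with large residuals, i.e. the sharp boundary
# or interior layer regions, so that the smooth non-layer regions dominate
# early training.
import torch


def curriculum_weighted_loss(residuals: torch.Tensor, tau: float = 1.0) -> torch.Tensor:
    """residuals: PDE residual r(x_i) at N collocation points, shape (N,)."""
    r2 = residuals.pow(2)
    with torch.no_grad():            # weights rescale, but do not redirect, gradients
        w = torch.exp(-r2 / tau)     # small weight where the residual (loss) is large
        w = w / w.mean()             # keep the overall loss magnitude comparable
    return (w * r2).mean()
```

A schedule that gradually relaxes the down-weighting (e.g. increasing `tau`) would return emphasis to the layer regions once the smooth part of the solution has been learned; whether and how the paper schedules this is not stated in this listing.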
Related papers
- Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation [70.43845294145714]
Reducing the reliance of neural network training on global back-propagation (BP) has emerged as a notable research topic.
We propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules.
Our method can be integrated into both local-BP and BP-free settings.
arXiv Detail & Related papers (2024-06-07T19:10:31Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs in order to improve the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
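The entry above names the optimizer but gives no implementation details. As background, an implicit (proximal) gradient step solves theta_next = theta - lr * grad(theta_next) rather than evaluating the gradient at the current iterate, which stays stable on stiff objectives at step sizes that make explicit SGD diverge. The NumPy comparison below is a generic textbook illustration on a quadratic, not the paper's PINN training loop.

```python
# Generic implicit-SGD (proximal / backward-Euler) update, contrasted with the
# explicit step on a stiff quadratic f(theta) = 0.5 * theta^T H theta.  For a
# quadratic the implicit equation has the closed form
#     theta_next = (I + lr * H)^{-1} theta.
# This is a textbook sketch, not the paper's PINN-specific ISGD implementation.
import numpy as np

H = np.diag([1.0, 2500.0])          # widely separated curvatures (a "stiff" loss)
lr = 0.1
theta_exp = np.array([1.0, 1.0])    # explicit SGD iterate
theta_imp = np.array([1.0, 1.0])    # implicit SGD iterate

for _ in range(50):
    theta_exp = theta_exp - lr * (H @ theta_exp)                # |1 - lr*2500| > 1: diverges
    theta_imp = np.linalg.solve(np.eye(2) + lr * H, theta_imp)  # stable for any lr > 0

print("explicit:", theta_exp)   # second component has blown up
print("implicit:", theta_imp)   # both components decay toward the minimum at the origin
```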
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
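The entry above describes progressively moving collocation points toward high-error areas. One common way to realize this is to resample the collocation set from a dense candidate pool with probability proportional to the residual magnitude; the pool size, uniform mixing weight, and refresh schedule below are illustrative assumptions, not the paper's exact algorithm.

```python
# Hypothetical sketch of residual-driven collocation resampling in NumPy.
import numpy as np


def resample_collocation(candidates, residual_fn, n_points, uniform_mix=0.1):
    """candidates: (M, d) pool of points; residual_fn maps (M, d) -> (M,) residuals."""
    r = np.abs(residual_fn(candidates))
    p = r / r.sum() if r.sum() > 0 else np.full(len(candidates), 1.0 / len(candidates))
    # Mix in a uniform component so low-residual regions are never abandoned.
    p = (1.0 - uniform_mix) * p + uniform_mix / len(candidates)
    idx = np.random.choice(len(candidates), size=n_points, replace=False, p=p)
    return candidates[idx]


# Usage: refresh the training collocation set every few hundred optimizer steps.
pool = np.random.rand(10_000, 1)                               # candidate points in [0, 1]
fake_residual = lambda x: np.exp(-100 * (x[:, 0] - 0.9) ** 2)  # stand-in for |PDE residual|
train_pts = resample_collocation(pool, fake_residual, n_points=512)
```

Note that this is the opposite emphasis to the curriculum method of the main paper above, which argues that concentrating effort on high-loss layer regions can be counterproductive for singularly perturbed problems.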
- Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z)
- Cascaded Compressed Sensing Networks: A Reversible Architecture for Layerwise Learning [11.721183551822097]
We show that target propagation can be achieved by modeling each layer of the network with compressed sensing, without the need for auxiliary networks.
Experiments show that the proposed method could achieve better performance than the auxiliary network-based method.
arXiv Detail & Related papers (2021-10-20T05:21:13Z)
- Characterizing possible failure modes in physics-informed neural networks [55.83255669840384]
Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models.
We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena even for simple PDEs.
We show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize.
arXiv Detail & Related papers (2021-09-02T16:06:45Z)
- AlterSGD: Finding Flat Minima for Continual Learning by Alternative Training [11.521519687645428]
We propose a simple yet effective optimization method, called AlterSGD, to search for flat minima in the loss landscape.
We prove that such a strategy can encourage the optimization to converge to flat minima.
We verify AlterSGD on a continual learning benchmark for semantic segmentation, and the empirical results show that it significantly mitigates forgetting.
arXiv Detail & Related papers (2021-07-13T01:43:51Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
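The entry above reports learning lines, curves, and simplexes of networks for roughly the cost of training one model. A minimal sketch of the line case, assuming PyTorch 2.x for torch.func.functional_call: keep two endpoint parameter sets, sample an interpolation coefficient each step, and backpropagate through the interpolated weights. The toy data, architecture, and hyperparameters are placeholders, and the regularization terms of the original method are omitted.

```python
# Hypothetical sketch of learning a line of networks: endpoints w0 and w1 are
# trained jointly; each step evaluates the model at a random point
# w(alpha) = (1 - alpha) * w0 + alpha * w1 on the connecting line.
import torch
from torch import nn
from torch.func import functional_call

model = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 1))

# Two independent copies of the parameters (the endpoints of the line).
w0 = {k: v.detach().clone().requires_grad_(True) for k, v in model.named_parameters()}
w1 = {k: (v.detach().clone() + 0.01 * torch.randn_like(v)).requires_grad_(True)
      for k, v in model.named_parameters()}
opt = torch.optim.Adam(list(w0.values()) + list(w1.values()), lr=1e-3)

x, y = torch.randn(128, 10), torch.randn(128, 1)    # placeholder data
for _ in range(100):
    alpha = torch.rand(())                          # one point on the line per step
    w_alpha = {k: (1 - alpha) * w0[k] + alpha * w1[k] for k in w0}
    pred = functional_call(model, w_alpha, (x,))    # forward pass with interpolated weights
    loss = torch.nn.functional.mse_loss(pred, y)
    opt.zero_grad()
    loss.backward()                                 # gradients flow to both endpoints
    opt.step()
```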
- Why Layer-Wise Learning is Hard to Scale-up and a Possible Solution via Accelerated Downsampling [19.025707054206457]
Layer-wise learning can achieve state-of-the-art performance in image classification on various datasets.
Previous studies of layer-wise learning are limited to networks with simple hierarchical structures.
This paper reveals that the fundamental reason impeding the scale-up of layer-wise learning is the relatively poor separability of the feature space in shallow layers.
arXiv Detail & Related papers (2020-10-15T21:51:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.