Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
- URL: http://arxiv.org/abs/2502.00604v1
- Date: Sun, 02 Feb 2025 00:21:45 GMT
- Title: Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
- Authors: Sifan Wang, Ananyae Kumar Bhartari, Bowen Li, Paris Perdikaris
- Abstract summary: We present theoretical and practical approaches for addressing directional conflicts between loss terms.
We show how these conflicts limit first-order methods and show that second-order optimization naturally resolves them.
We prove that SOAP, a recently proposed quasi-Newton method, efficiently approximates the Hessian preconditioner.
- Score: 12.712238596012742
- Abstract: Multi-task learning through composite loss functions is fundamental to modern deep learning, yet optimizing competing objectives remains challenging. We present new theoretical and practical approaches for addressing directional conflicts between loss terms, demonstrating their effectiveness in physics-informed neural networks (PINNs) where such conflicts are particularly challenging to resolve. Through theoretical analysis, we demonstrate how these conflicts limit first-order methods and show that second-order optimization naturally resolves them through implicit gradient alignment. We prove that SOAP, a recently proposed quasi-Newton method, efficiently approximates the Hessian preconditioner, enabling breakthrough performance in PINNs: state-of-the-art results on 10 challenging PDE benchmarks, including the first successful application to turbulent flows with Reynolds numbers up to 10,000, with 2-10x accuracy improvements over existing methods. We also introduce a novel gradient alignment score that generalizes cosine similarity to multiple gradients, providing a practical tool for analyzing optimization dynamics. Our findings establish frameworks for understanding and resolving gradient conflicts, with broad implications for optimization beyond scientific computing.
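The abstract describes a gradient alignment score that "generalizes cosine similarity to multiple gradients" without reproducing the formula. A minimal sketch, assuming one plausible instantiation (the mean pairwise cosine similarity among per-loss-term gradients; the paper's exact definition may differ):

```python
import numpy as np

def gradient_alignment_score(grads):
    """Mean pairwise cosine similarity among gradient vectors.

    Hypothetical instantiation of a 'gradient alignment score':
    values near 1.0 mean the loss-term gradients point the same way;
    negative values indicate directional conflict between terms.
    """
    units = [g / (np.linalg.norm(g) + 1e-12) for g in grads]
    n = len(units)
    sims = [units[i] @ units[j] for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(sims))

# Two conflicting loss-term gradients score negatively:
g1 = np.array([1.0, 0.0])
g2 = np.array([-1.0, 0.1])
print(gradient_alignment_score([g1, g2]))
```

With more than two gradients, averaging over all pairs reduces to ordinary cosine similarity in the two-gradient case, which is one natural way to read "generalizes cosine similarity".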
Related papers
- Learning Provably Improves the Convergence of Gradient Descent [9.82454981262489]
We study the convergence of Learning to Optimize (L2O) problems, in which solvers are obtained by training.
Our analysis shows that the learned component provably enhances L2O's convergence.
Our findings indicate a 50% improvement over plain gradient descent methods.
arXiv Detail & Related papers (2025-01-30T02:03:30Z) - Which Optimizer Works Best for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks? [1.8175282137722093]
Physics-Informed Neural Networks (PINNs) have revolutionized the numerical solution of partial differential equations (PDEs).
These PINNs integrate PDEs into the neural network's training process as soft constraints.
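The soft-constraint idea mentioned above can be sketched without any PINN library: the PDE residual is simply one more term summed into the training loss alongside the boundary/data term. A minimal illustrative sketch for the ODE u'(t) = u(t), u(0) = 1, using a polynomial surrogate and finite differences (all names and the toy setup are hypothetical, not from the paper):

```python
import numpy as np

def composite_pinn_loss(coeffs, ts, h=1e-4):
    """Composite PINN-style loss for u' = u with u(0) = 1.

    The equation enters training as a soft constraint: its squared
    residual is summed with the boundary-condition loss.
    """
    u = lambda t: np.polyval(coeffs, t)         # polynomial surrogate model
    du = (u(ts + h) - u(ts - h)) / (2 * h)      # finite-difference u'
    residual_loss = np.mean((du - u(ts)) ** 2)  # physics (PDE/ODE) term
    boundary_loss = (u(0.0) - 1.0) ** 2         # boundary/data term
    return residual_loss + boundary_loss

ts = np.linspace(0.0, 1.0, 32)
# Truncated Taylor coefficients of exp(t) give a near-zero loss;
# the zero function satisfies the ODE but violates the boundary term.
good = np.array([1 / 6, 1 / 2, 1.0, 1.0])  # t^3/6 + t^2/2 + t + 1
bad = np.array([0.0, 0.0, 0.0, 0.0])
print(composite_pinn_loss(good, ts), composite_pinn_loss(bad, ts))
```

In an actual PINN the surrogate is a neural network and the derivatives come from automatic differentiation, but the composite-loss structure is the same.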
arXiv Detail & Related papers (2025-01-22T21:19:42Z) - Component-based Sketching for Deep ReLU Nets [55.404661149594375]
We develop a sketching scheme based on deep net components for various tasks.
We transform deep net training into a linear empirical risk minimization problem.
We show that the proposed component-based sketching provides almost optimal rates in approximating saturated functions.
arXiv Detail & Related papers (2024-09-21T15:30:43Z) - Unveiling the optimization process of Physics Informed Neural Networks: How accurate and competitive can PINNs be? [0.0]
This study investigates the potential accuracy of physics-informed neural networks, contrasting their approach with previous similar works and traditional numerical methods.
We find that selecting improved optimization algorithms significantly enhances the accuracy of the results.
Simple modifications to the loss function may also improve precision, offering an additional avenue for enhancement.
arXiv Detail & Related papers (2024-05-07T11:50:25Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
However, PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and are incapable of handling distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradient estimators.
arXiv Detail & Related papers (2022-06-30T05:29:52Z) - Half-Inverse Gradients for Physical Deep Learning [25.013244956897832]
Integrating differentiable physics simulators into the training process can greatly improve the quality of results.
However, such solvers have a profound effect on the gradient flow, since rescaling magnitudes and directions is an inherent property of many physical processes.
In this work, we analyze the characteristics of both physical and neural network optimizations to derive a new method that does not suffer from this phenomenon.
arXiv Detail & Related papers (2022-03-18T19:11:04Z) - A theoretical and empirical study of new adaptive algorithms with additional momentum steps and shifted updates for stochastic non-convex optimization [0.0]
It is thought that adaptive optimization algorithms represent the key pillar behind the success of the Deep Learning field.
In this paper we introduce adaptive momentum techniques for different non-smooth objective problems.
arXiv Detail & Related papers (2021-10-16T09:47:57Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.