Overcoming the Loss Conditioning Bottleneck in Optimization-Based PDE Solvers: A Novel Well-Conditioned Loss Function
- URL: http://arxiv.org/abs/2508.02692v1
- Date: Thu, 24 Jul 2025 10:17:02 GMT
- Title: Overcoming the Loss Conditioning Bottleneck in Optimization-Based PDE Solvers: A Novel Well-Conditioned Loss Function
- Authors: Wenbo Cao, Weiwei Zhang
- Abstract summary: Optimization-based PDE solvers that minimize scalar loss functions have gained increasing attention in recent years. Such methods converge much more slowly than classical iterative solvers and are commonly regarded as inefficient. This work provides a theoretical insight, attributing the inefficiency to the use of the mean squared error (MSE) loss. By tuning a weight parameter, the proposed loss flexibly modulates the condition number between the original system and its normal equations, while reducing to the MSE loss in the limiting case.
- Score: 1.6135205846394396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimization-based PDE solvers that minimize scalar loss functions have gained increasing attention in recent years. These methods either define the loss directly over discrete variables, as in Optimizing a Discrete Loss (ODIL), or indirectly through a neural network surrogate, as in Physics-Informed Neural Networks (PINNs). However, despite their promise, such methods often converge much more slowly than classical iterative solvers and are commonly regarded as inefficient. This work provides a theoretical insight, attributing the inefficiency to the use of the mean squared error (MSE) loss, which implicitly forms the normal equations, squares the condition number, and severely impairs optimization. To address this, we propose a novel Stabilized Gradient Residual (SGR) loss. By tuning a weight parameter, it flexibly modulates the condition number between the original system and its normal equations, while reducing to the MSE loss in the limiting case. We systematically benchmark the convergence behavior and optimization stability of the SGR loss within both the ODIL framework and PINNs (employing either numerical or automatic differentiation) and compare its performance against classical iterative solvers. Numerical experiments on a range of benchmark problems demonstrate that, within the ODIL framework, the proposed SGR loss achieves orders-of-magnitude faster convergence than the MSE loss. Further validation within the PINNs framework shows that, despite the high nonlinearity of neural networks, SGR consistently outperforms the MSE loss. These theoretical and empirical findings help bridge the performance gap between classical iterative solvers and optimization-based solvers, highlighting the central role of loss conditioning, and provide key insights for the design of more efficient PDE solvers.
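The conditioning claim above is easy to check numerically. The following minimal NumPy sketch is not the authors' code (the exact form of the SGR loss is not given in the abstract); it only illustrates how the MSE loss inherits the squared condition number of the system matrix.

```python
# Minimal NumPy sketch (not the authors' code) of the conditioning argument:
# minimizing the MSE loss ||A u - b||^2 by gradient descent implicitly works
# with the normal-equations matrix A^T A, whose condition number is kappa(A)^2.
import numpy as np

n = 64
# 1D Poisson stencil [-1, 2, -1]: a standard, increasingly ill-conditioned system
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

print(f"kappa(A)     = {np.linalg.cond(A):.3e}")
print(f"kappa(A^T A) = {np.linalg.cond(A.T @ A):.3e}  # ~ kappa(A)^2")

# The gradient of L(u) = ||A u - b||^2 is 2 A^T (A u - b), so every descent
# step is governed by A^T A; convergence therefore degrades with kappa(A)^2.
# The paper's SGR loss (exact form not stated in the abstract) uses a weight
# parameter to interpolate between kappa(A) and kappa(A)^2 behavior.
```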
Related papers
- Matrix Sensing with Kernel Optimal Loss: Robustness and Optimization Landscape [10.674539579679871]
In traditional regression tasks, the mean squared error (MSE) loss is a common choice, but it can be unreliable under non-Gaussian or heavy-tailed noise. We adopt a robust loss formulation based on a kernel-based estimate of the residual density and maximize the estimated log-likelihood.
arXiv Detail & Related papers (2025-11-03T23:22:37Z)
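The summary above names the ingredients (a kernel estimate of the residual density, maximized log-likelihood) but not the estimator itself. The sketch below is a hypothetical reading: the Gaussian kernel, fixed bandwidth, and self-density evaluation are all assumptions, not the paper's method.

```python
# Hypothetical sketch of a kernel-density-based robust loss; the Gaussian
# kernel, fixed bandwidth, and self-density evaluation are assumptions here.
import numpy as np

def kde_loglik_loss(residuals, bandwidth=0.5):
    """Negative mean log-likelihood of residuals under a Gaussian KDE
    built from the residuals themselves."""
    r = np.asarray(residuals, dtype=float)
    diff = r[:, None] - r[None, :]          # pairwise residual differences
    kern = np.exp(-0.5 * (diff / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    density = kern.mean(axis=1)             # KDE evaluated at each residual
    return -np.log(density + 1e-12).mean()  # maximizing likelihood = minimizing this

# Unlike the MSE, whose penalty grows quadratically, this loss saturates for
# outlying residuals, which is the source of its robustness to heavy tails.
print(kde_loglik_loss([0.1, -0.2, 0.05, 5.0]))
```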
- Graph Neural Regularizers for PDE Inverse Problems [62.49743146797144]
We present a framework for solving a broad class of ill-posed inverse problems governed by partial differential equations (PDEs). The forward problem is numerically solved using the finite element method (FEM). We employ physics-inspired graph neural networks as learned regularizers, providing a robust, interpretable, and generalizable alternative to standard approaches.
arXiv Detail & Related papers (2025-10-23T21:43:25Z)
- AutoBalance: An Automatic Balancing Framework for Training Physics-Informed Neural Networks [10.223108587188808]
PINNs provide a powerful and general framework for solving partial differential equations (PDEs). Training them requires balancing multiple loss terms, such as PDE residuals and boundary conditions, which often have conflicting objectives and vastly different curvatures. Existing methods address this issue by manipulating gradients before optimization. We introduce AutoBalance, a novel "post-combine" training paradigm.
arXiv Detail & Related papers (2025-10-08T06:13:03Z)
- Convolution-weighting method for the physics-informed neural network: A Primal-Dual Optimization Perspective [14.65008276932511]
Physics-informed neural networks (PINNs) are extensively employed to solve partial differential equations (PDEs). PINNs are typically optimized using a finite set of points, which poses significant challenges in guaranteeing their convergence and accuracy. We propose a new weighting scheme that adaptively extends the loss weights from isolated points to their continuous neighborhood regions.
arXiv Detail & Related papers (2025-06-24T17:13:51Z)
- NDCG-Consistent Softmax Approximation with Accelerated Convergence [67.10365329542365]
We propose novel loss formulations that align directly with ranking metrics. We integrate the proposed RG losses with the highly efficient Alternating Least Squares (ALS) optimization method. Empirical evaluations on real-world datasets demonstrate that our approach achieves comparable or superior ranking performance.
arXiv Detail & Related papers (2025-06-11T06:59:17Z)
- The Larger the Merrier? Efficient Large AI Model Inference in Wireless Edge Networks [56.37880529653111]
The demand for large AI model (LAIM) services is driving a paradigm shift from traditional cloud-based inference to edge-based inference for low-latency, privacy-preserving applications. In this paper, we investigate an LAIM inference scheme in which a pre-trained LAIM is pruned and partitioned into on-device and on-server sub-models for deployment.
arXiv Detail & Related papers (2025-05-14T08:18:55Z)
- Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates the reliance on a central server inherent in client-server architectures. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z)
- Dual Cone Gradient Descent for Training Physics-Informed Neural Networks [0.0]
Physics-informed neural networks (PINNs) have emerged as a prominent approach for solving partial differential equations. We propose a novel framework, Dual Cone Gradient Descent (DCGD), which adjusts the direction of the updated gradient to ensure it falls within a dual cone region.
arXiv Detail & Related papers (2024-09-27T03:27:46Z)
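Reading the "cone region" above as the dual cone of the two loss gradients (directions along which neither loss increases to first order), one simple projection rule, which may well differ from the paper's exact update, looks like this:

```python
# Hedged sketch of the dual-cone idea: keep the update d in
# {d : <d, g1> >= 0 and <d, g2> >= 0}, so neither loss increases to first
# order. This PCGrad-style correction is one simple choice, not necessarily
# the paper's exact rule (which may iterate or solve a small subproblem).
import numpy as np

def dual_cone_update(g1, g2):
    d = g1 + g2
    for g in (g1, g2):
        dot = d @ g
        if dot < 0:                    # d points out of the dual cone w.r.t. g:
            d = d - dot / (g @ g) * g  # project out the offending component
    return d

g_pde = np.array([1.0, 0.0])   # e.g. gradient of the PDE residual loss
g_bc = np.array([-0.9, 0.2])   # e.g. gradient of the boundary-condition loss
d = dual_cone_update(g_pde, g_bc)
print(d, d @ g_pde >= -1e-12, d @ g_bc >= -1e-12)
```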
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences. To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model. Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
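The summary says only that the objective combines a preference optimization loss with a supervised (SFT) loss. The sketch below assumes a DPO-style preference term; the function names and weighting are illustrative, not the paper's:

```python
# Illustrative sketch only: a DPO-style preference loss plus an SFT term.
# The exact objective, weighting, and parameterization in the paper may differ.
import torch
import torch.nn.functional as F

def combined_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected,
                  beta=0.1, sft_weight=0.5):
    # Preference term on policy-vs-reference log-prob margins (DPO-style)
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    pref_loss = -F.logsigmoid(margin).mean()
    # SFT term: keep the likelihood of preferred responses high; per the
    # abstract, this supervised term is what regularizes overoptimization
    sft_loss = -logp_chosen.mean()
    return pref_loss + sft_weight * sft_loss

# Example with dummy per-example log-probabilities
lp_c, lp_r = torch.tensor([-5.0, -4.0]), torch.tensor([-6.0, -5.5])
print(combined_loss(lp_c, lp_r, ref_chosen=lp_c.clone(), ref_rejected=lp_r.clone()))
```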
- A Deep Unrolling Model with Hybrid Optimization Structure for Hyperspectral Image Deconvolution [50.13564338607482]
We propose a novel optimization framework for the hyperspectral deconvolution problem, called DeepMix. It consists of three distinct modules: a data consistency module, a module that enforces the effect of the handcrafted regularizers, and a denoising module. This work proposes a context-aware denoising module designed to sustain the advancements achieved by the cooperative efforts of the other modules.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems. However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features. In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
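Implicit (stochastic) gradient descent replaces the explicit update with the backward step theta_new = theta - lr * grad(theta_new), which is equivalent to a proximal step. The inner solver below is a simple stand-in, not necessarily the paper's:

```python
# Hedged sketch of implicit SGD: the update theta+ = theta - lr * grad(theta+)
# is the proximal step argmin_x L(x) + ||x - theta||^2 / (2 * lr), solved here
# by an inner gradient loop. Illustrative only; the paper's solver may differ.
import numpy as np

def isgd_step(grad, theta, lr=0.1, inner_lr=0.005, inner_iters=200):
    x = theta.copy()
    for _ in range(inner_iters):
        # gradient of the proximal objective L(x) + ||x - theta||^2 / (2*lr)
        x -= inner_lr * (grad(x) + (x - theta) / lr)
    return x

# Stiff quadratic f(theta) = 0.5 * theta^T A theta: explicit GD with lr=0.1
# diverges (lr * lambda_max = 10 > 2), while the implicit step stays stable.
A = np.diag([1.0, 100.0])
grad = lambda th: A @ th
theta = np.array([1.0, 1.0])
for _ in range(50):
    theta = isgd_step(grad, theta)
print(theta)  # decays smoothly toward the minimizer at the origin
```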
- Enhanced Physics-Informed Neural Networks with Augmented Lagrangian Relaxation Method (AL-PINNs) [1.7403133838762446]
Physics-Informed Neural Networks (PINNs) are powerful approximators of solutions to nonlinear partial differential equations (PDEs).
We propose an Augmented Lagrangian relaxation method for PINNs (AL-PINNs).
We demonstrate through various numerical experiments that AL-PINNs yield a much smaller relative error compared with that of state-of-the-art adaptive loss-balancing algorithms.
arXiv Detail & Related papers (2022-04-29T08:33:11Z)
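As a toy illustration of the augmented Lagrangian idea behind AL-PINNs (treat boundary conditions as constraints c = 0 with learned multipliers rather than fixed penalty weights), consider the following sketch; the objective, constraint, and solver choices are all illustrative assumptions:

```python
# Illustrative augmented Lagrangian loop on a toy problem standing in for
# "PDE loss f subject to boundary constraints c = 0"; the inner solver,
# penalty mu, and schedules are assumptions, not the paper's exact algorithm.
import numpy as np

f = lambda x: x @ x                          # stand-in for the PDE residual loss
c = lambda x: np.array([x[0] + x[1] - 1.0])  # stand-in boundary constraint c(x) = 0

x, lam, mu = np.zeros(2), np.zeros(1), 10.0

for _ in range(100):        # outer augmented Lagrangian iterations
    for _ in range(200):    # inner minimization of L_A = f + lam^T c + (mu/2)||c||^2
        g = 2 * x + (lam[0] + mu * c(x)[0]) * np.array([1.0, 1.0])
        x -= 0.01 * g
    lam += mu * c(x)        # dual ascent: lam <- lam + mu * c(x)

print(x, lam)  # x -> (0.5, 0.5), lam -> [-1.0], the true KKT multiplier
```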
This list is automatically generated from the titles and abstracts of the papers on this site.