Complexity Dependent Error Rates for Physics-informed Statistical Learning via the Small-ball Method
- URL: http://arxiv.org/abs/2510.23149v1
- Date: Mon, 27 Oct 2025 09:26:07 GMT
- Title: Complexity Dependent Error Rates for Physics-informed Statistical Learning via the Small-ball Method
- Authors: Diego Marcondes
- Abstract summary: Physics-informed statistical learning (PISL) integrates empirical data with physical knowledge to enhance the statistical performance of estimators. This work establishes a theoretical framework for evaluating the statistical properties of physics-informed estimators in convex classes of functions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physics-informed statistical learning (PISL) integrates empirical data with physical knowledge to enhance the statistical performance of estimators. While PISL methods are widely used in practice, a comprehensive theoretical understanding of how informed regularization affects statistical properties is still missing. Specifically, two fundamental questions have yet to be fully addressed: (1) what is the trade-off between considering soft penalties versus hard constraints, and (2) what is the statistical gain of incorporating physical knowledge compared to purely data-driven empirical error minimisation. In this paper, we address these questions for PISL in convex classes of functions under physical knowledge expressed as linear equations by developing appropriate complexity dependent error rates based on the small-ball method. We show that, under suitable assumptions, (1) the error rates of physics-informed estimators are comparable to those of hard constrained empirical error minimisers, differing only by constant terms, and that (2) informed penalization can effectively reduce model complexity, akin to dimensionality reduction, thereby improving learning performance. This work establishes a theoretical framework for evaluating the statistical properties of physics-informed estimators in convex classes of functions, contributing to closing the gap between statistical theory and practical PISL, with potential applications to cases not yet explored in the literature.
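The soft-penalty versus hard-constraint trade-off described in the abstract can be illustrated with a small finite-dimensional sketch. This is not the paper's method or setting: it assumes a linear model `y = X @ theta + noise`, squared loss, and physical knowledge written as a linear equation `A @ theta = b`. The function names, the penalty weight `lam`, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def soft_penalty_fit(X, y, A, b, lam):
    """Minimise ||y - X theta||^2 / n + lam * ||A theta - b||^2 (closed form)."""
    n = X.shape[0]
    G = X.T @ X / n + lam * (A.T @ A)
    rhs = X.T @ y / n + lam * (A.T @ b)
    return np.linalg.solve(G, rhs)

def hard_constraint_fit(X, y, A, b):
    """Minimise ||y - X theta||^2 / n subject to A theta = b via the KKT system."""
    n, d = X.shape
    m = A.shape[0]
    K = np.block([[X.T @ X / n, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([X.T @ y / n, b])
    return np.linalg.solve(K, rhs)[:d]  # drop the Lagrange multipliers

# Simulated data (assumption): the true coefficients satisfy sum(theta) = 1.
rng = np.random.default_rng(0)
n, d = 50, 5
A = np.ones((1, d))
b = np.array([1.0])
theta_true = np.full(d, 1.0 / d)
X = rng.normal(size=(n, d))
y = X @ theta_true + 0.5 * rng.normal(size=n)

theta_hard = hard_constraint_fit(X, y, A, b)
theta_soft = {lam: soft_penalty_fit(X, y, A, b, lam) for lam in (0.0, 1.0, 1e6)}

for lam, theta in theta_soft.items():
    residual = np.linalg.norm(A @ theta - b)
    gap = np.linalg.norm(theta - theta_hard)
    print(f"lam={lam:g}  constraint residual={residual:.2e}  gap to hard fit={gap:.2e}")
```

In this toy setting, `lam = 0` recovers ordinary least squares, and increasing `lam` moves the soft-penalty estimate toward the hard-constrained one. The paper's results concern general convex function classes and are not reproduced by this example.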
Related papers
- Uncertainty-Aware Data-Efficient AI: An Information-Theoretic Perspective [48.073471560778984]
In context-specific applications such as robotics, telecommunications, and healthcare, artificial intelligence systems often face the challenge of limited training data. This review paper examines formal methodologies that address data-limited regimes through two complementary approaches.
arXiv Detail & Related papers (2025-12-04T21:44:22Z)
- Machine-Learning-Assisted Comparison of Regression Functions [6.536054952579518]
We revisit the classical problem of comparing regression functions, a fundamental question in statistical inference. We propose a new notion of kernel-based conditional mean dependence that provides a new characterization of the null hypothesis of equal regression functions. We develop two novel tests that leverage modern machine learning methods for flexible estimation.
arXiv Detail & Related papers (2025-10-28T17:59:15Z)
- Do-PFN: In-Context Learning for Causal Effect Estimation [75.62771416172109]
We show that Prior-data fitted networks (PFNs) can be pre-trained on synthetic data to predict outcomes. Our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph.
arXiv Detail & Related papers (2025-06-06T12:43:57Z)
- Physics-informed machine learning as a kernel method [7.755962782612672]
We consider a general regression problem where the empirical risk is regularized by a partial differential equation.
Taking advantage of kernel theory, we derive convergence rates for the minimizer of the regularized risk.
We show that faster rates can be achieved, depending on the physical error.
arXiv Detail & Related papers (2024-02-12T09:38:42Z)
- Trade-off Between Dependence and Complexity for Nonparametric Learning -- an Empirical Process Approach [10.27974860479791]
In many applications where the data exhibit temporal dependencies, the corresponding empirical processes are much less understood.
We present a general bound on the expected supremum of empirical processes under standard $\beta/\rho$-mixing assumptions.
We show that even under long-range dependence, it is possible to attain the same rates as in the i.i.d. setting.
arXiv Detail & Related papers (2024-01-17T05:08:37Z)
- Weak Supervision Performance Evaluation via Partial Identification [46.73061437177238]
Programmatic Weak Supervision (PWS) enables supervised model training without direct access to ground truth labels.
We present a novel method to address this challenge by framing model evaluation as a partial identification problem.
Our approach derives reliable bounds on key metrics without requiring labeled data, overcoming core limitations in current weak supervision evaluation techniques.
arXiv Detail & Related papers (2023-12-07T07:15:11Z)
- Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability [3.908842679355255]
Hypothesis transfer learning (HTL) contrasts with domain adaptation by leveraging a previously learned task, the source, for a new one, the target.
This paper studies the learning theory of HTL through algorithmic stability, an attractive theoretical framework for machine learning algorithms analysis.
arXiv Detail & Related papers (2023-05-31T09:38:21Z)
- Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
- Applying physics-based loss functions to neural networks for improved generalizability in mechanics problems [3.655021726150368]
Physics-Informed Machine Learning (PIML) has gained momentum in the last 5 years, with scientists and researchers seeking to utilize the benefits afforded by advances in machine learning.
In this work a new approach to utilizing PIML is discussed that deals with the use of physics-based loss functions.
arXiv Detail & Related papers (2021-04-30T20:31:09Z)
- Constrained Learning with Non-Convex Losses [119.8736858597118]
Though learning has become a core technology of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced solutions.
arXiv Detail & Related papers (2021-03-08T23:10:33Z)
- The empirical duality gap of constrained statistical learning [115.23598260228587]
We study constrained statistical learning problems, the unconstrained versions of which are at the core of virtually all modern information processing.
We propose to tackle the constrained statistical problem overcoming its infinite dimensionality, unknown distributions, and constraints by leveraging finite dimensional parameterizations, sample averages, and duality theory.
We demonstrate the effectiveness and usefulness of this constrained formulation in a fair learning application.
arXiv Detail & Related papers (2020-02-12T19:12:29Z)
- Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference.
Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances.
We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.