Overview of Physics-Informed Machine Learning Inversion of Geophysical
Data
- URL: http://arxiv.org/abs/2310.08109v1
- Date: Thu, 12 Oct 2023 08:10:31 GMT
- Title: Overview of Physics-Informed Machine Learning Inversion of Geophysical
Data
- Authors: Gerard T. Schuster and Shihang Feng
- Abstract summary: We review four types of algorithms for physics-informed machine learning (PIML) inversion of geophysical data.
The selection of weights $\lambda_i$ and the NN operations determines one of four different PIML algorithms.
- Score: 1.5108390071719444
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We review four types of algorithms for physics-informed machine learning
(PIML) inversion of geophysical data. The unifying equation is given by the
joint objective function $\epsilon$:
\begin{eqnarray}
\epsilon^{||-PIML} &=& \lambda_1 \overbrace{||{\bf W}^{ML}({\bf H}_{{\bf w}} {\bf d}^{obs}-{\bf m})||^2}^{NN}
+ \lambda_2 \overbrace{||{\bf W}^{FWI}({\bf L} {\bf m}-{\bf d}^{obs})||^2}^{FWI} \nonumber\\
&& +~ \mathrm{Regularizer}, \label{PIML.eq120}
\end{eqnarray}
where the optimal model ${\bf m}^*$ and weights ${\bf w}^*$ minimize
$\epsilon$. Here, the matrix weights are denoted by the boldface symbol
${\bf W}$, and full waveform inversion (FWI) is typically computed with a
finite-difference solution of the wave equation, where ${\bf L}$ represents the
forward modeling operation of the wave equation as a function of the model
${\bf m}$. A fully connected neural network (NN) computes the model
${\bf H}_{{\bf w}}{\bf d}^{obs} \approx {\bf m}$ from the observed input data
${\bf d}^{obs}$. The choice of the weights $\lambda_i$ and of the NN operations
determines which of the four PIML algorithms is obtained.
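To make the roles of the two misfit terms and the $\lambda_i$ weights concrete, here is a minimal sketch of the joint objective above, assuming NumPy arrays and placeholder callables `forward_op` (standing in for ${\bf L}$) and `nn_op` (standing in for ${\bf H}_{\bf w}$); all names and the toy usage are illustrative assumptions, not the authors' implementation. By construction, setting $\lambda_2=0$ leaves only the supervised NN misfit, while $\lambda_1=0$ leaves only the FWI data misfit.

```python
import numpy as np

def piml_objective(m, w, d_obs, forward_op, nn_op, W_ml, W_fwi,
                   lam1=1.0, lam2=1.0, regularizer=None):
    """Joint PIML objective (sketch):
    epsilon = lam1 * ||W_ml (H_w d_obs - m)||^2
            + lam2 * ||W_fwi (L m - d_obs)||^2 + Regularizer(m).
    """
    # NN term: mismatch between the NN-predicted model H_w d_obs and m.
    nn_residual = W_ml @ (nn_op(d_obs, w) - m)
    # FWI term: mismatch between simulated data L m and the observed data.
    fwi_residual = W_fwi @ (forward_op(m) - d_obs)
    reg = regularizer(m) if regularizer is not None else 0.0
    return lam1 * np.sum(nn_residual**2) + lam2 * np.sum(fwi_residual**2) + reg

# Toy usage with linear placeholder operators (purely illustrative).
n_m, n_d = 50, 80
rng = np.random.default_rng(0)
L = rng.normal(size=(n_d, n_m))   # stand-in for wave-equation forward modeling
H = rng.normal(size=(n_m, n_d))   # stand-in for the trained network H_w
d_obs = rng.normal(size=n_d)
m0 = np.zeros(n_m)
eps = piml_objective(m0, None, d_obs,
                     forward_op=lambda m: L @ m,
                     nn_op=lambda d, w: H @ d,
                     W_ml=np.eye(n_m), W_fwi=np.eye(n_d))
```

The optimal pair $({\bf m}^*, {\bf w}^*)$ would then be obtained by minimizing this scalar with respect to both the model and the network weights, e.g. by alternating gradient updates.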
PIML offers potential advantages over standard FWI through its enhanced
ability to avoid local minima and the option to locally train the inversion
operator, minimizing the requirement for extensive training data for global
applicability. However, the effectiveness of PIML relies on the similarity
between the test data and the training data. A possible strategy to overcome
this limitation is to pretrain a PIML architecture on data from a broader
region and then fine-tune it on the specific data of interest, a method
reminiscent of the way large language models are pretrained and then adapted
for various tasks.
Related papers
- VAE-DNN: Energy-Efficient Trainable-by-Parts Surrogate Model For Parametric Partial Differential Equations [49.1574468325115]
We propose a trainable-by-parts surrogate model for solving forward and inverse parameterized nonlinear partial differential equations. The proposed approach employs an encoder to reduce the high-dimensional input $y(\boldsymbol{x})$ to a lower-dimensional latent space, $\boldsymbol{\mu}_{\boldsymbol{\phi}_y}$. A fully connected neural network is used to map $\boldsymbol{\mu}_{\boldsymbol{\phi}_y}$ to the latent space $\boldsymbol{\mu}_{\boldsymbol{\phi}_h}$.
arXiv Detail & Related papers (2025-08-05T18:37:32Z) - Eliciting Fine-Tuned Transformer Capabilities via Inference-Time Techniques [1.14219428942199]
Large language models have transformed natural language processing, yet supervised fine-tuning (SFT) remains computationally intensive. This paper formally proves that capabilities acquired through SFT can be approximated by a base transformer model. We extend these results to practical scenarios with finite context lengths and partial dataset access.
arXiv Detail & Related papers (2025-06-09T08:37:19Z) - Nearly Optimal Differentially Private ReLU Regression [18.599299269974498]
We investigate one of the most fundamental non-convex learning problems, ReLU regression, in the Differential Privacy (DP) model.
We show that it is possible to achieve an upper bound of $\tilde{O}\big(\frac{d^2}{N^2\varepsilon^2}\big)$.
arXiv Detail & Related papers (2025-03-08T02:09:47Z) - Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models [65.71506381302815]
We propose to amortize the cost of sampling from a posterior distribution of the form $p(\mathbf{x}\mid\mathbf{y}) \propto p_\theta(\mathbf{x})$.
For many models and constraints of interest, the posterior in the noise space is smoother than the posterior in the data space, making it more amenable to such amortized inference.
arXiv Detail & Related papers (2025-02-10T19:49:54Z) - Learning Networks from Wide-Sense Stationary Stochastic Processes [7.59499154221528]
A key inference problem here is to learn edge connectivity from node outputs (potentials).
We use Whittle's maximum likelihood estimator (MLE) to learn the support of $L^\ast$ from temporally correlated samples.
We show that the MLE problem is strictly convex, admitting a unique solution.
arXiv Detail & Related papers (2024-12-04T23:14:00Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z) - Regret-Optimal Federated Transfer Learning for Kernel Regression with Applications in American Option Pricing [8.723136784230906]
We propose an optimal iterative scheme for federated transfer learning, where a central planner has access to datasets.
Our objective is to minimize the cumulative deviation of the generated parameters $\{\theta_i(t)\}_{t=0}^{T}$ across all $T$ iterations.
By leveraging symmetries within the regret-optimal algorithm, we develop a nearly regret-optimal scheme that runs with $\mathcal{O}(Np^2)$ fewer elementary operations.
arXiv Detail & Related papers (2023-09-08T19:17:03Z) - Effective Minkowski Dimension of Deep Nonparametric Regression: Function
Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z) - Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a stochastic differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Generalization and Stability of Interpolating Neural Networks with
Minimal Width [37.908159361149835]
We investigate the generalization and optimization of shallow neural networks trained by gradient descent in the interpolating regime.
With $m=\Omega(\log^4(n))$ neurons and $T\approx n$ iterations, we prove convergence of the training loss and bound the test loss.
arXiv Detail & Related papers (2023-02-18T05:06:15Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $\varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP).
We prove an upper bound of $\widetilde{O}(H \varepsilon^{-3} \ln \frac{1}{\delta})$ samples per state-action pair, where $H := \mathrm{sp}(h^*)$ is the span of bias of any optimal policy, $\varepsilon$ is the accuracy, and $\delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - Minimax Optimal Quantization of Linear Models: Information-Theoretic
Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper bounds can be extended to two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z) - Locally Differentially Private Reinforcement Learning for Linear Mixture
Markov Decision Processes [78.27542864367821]
Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data.
To protect the users' privacy, privacy-preserving RL algorithms are in demand.
We propose a novel $(\varepsilon, \delta)$-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs.
arXiv Detail & Related papers (2021-10-19T17:44:09Z) - Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs
with a Generative Model [3.749193647980305]
This paper considers a Markov decision process (MDP) that admits a set of state-action features.
We show that a model-based approach (resp. $Q$-learning) provably learns an $\varepsilon$-optimal policy with high probability.
arXiv Detail & Related papers (2021-05-28T17:49:39Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an error bound of order $\mathcal{O}\big(1/\sqrt{tn}\big)$, which is at the order of the intrinsic excess error of a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z) - Large-time asymptotics in deep learning [0.0]
We consider the impact of the final time $T$ (which may indicate the depth of a corresponding ResNet) in training.
For the classical $L^2$-regularized empirical risk minimization problem, we show that the training error is at most of the order $\mathcal{O}\left(\frac{1}{T}\right)$.
In the setting of $\ell^p$-distance losses, we prove that both the training error and the optimal parameters are at most of the order $\mathcal{O}\left(e^{-\mu T}\right)$ for some $\mu > 0$.
arXiv Detail & Related papers (2020-08-06T07:33:17Z) - Convergence of Sparse Variational Inference in Gaussian Processes
Regression [29.636483122130027]
We show that a method with an overall computational cost of $\mathcal{O}\big(N(\log N)^{2D}(\log\log N)^2\big)$ can be used to perform inference.
arXiv Detail & Related papers (2020-08-01T19:23:34Z)