Asymptotic Theory of $\ell_1$-Regularized PDE Identification from a
Single Noisy Trajectory
- URL: http://arxiv.org/abs/2103.07045v1
- Date: Fri, 12 Mar 2021 02:23:04 GMT
- Title: Asymptotic Theory of $\ell_1$-Regularized PDE Identification from a
Single Noisy Trajectory
- Authors: Yuchen He, Namjoon Suh, Xiaoming Huo, Sung Ha Kang, Yajun Mei
- Abstract summary: We prove the support recovery for a general class of linear and nonlinear evolutionary partial differential equation (PDE) identification from a single noisy trajectory.
We provide a set of sufficient conditions which guarantee that, from a single trajectory denoised by a Local-Polynomial filter, the support of $\mathbf{c}(\lambda)$ asymptotically converges to the true signed-support associated with the underlying PDE.
- Score: 2.0299248281970956
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We prove the support recovery for a general class of linear and nonlinear
evolutionary partial differential equation (PDE) identification from a single
noisy trajectory using the $\ell_1$-regularized Pseudo-Least Squares model
($\ell_1$-PsLS). In any associative $\mathbb{R}$-algebra that is generated by
finitely many differentiation operators and contains the unknown PDE operator,
applying $\ell_1$-PsLS to a given data set yields a family of candidate models
with coefficients $\mathbf{c}(\lambda)$ parameterized by the regularization
weight $\lambda\geq 0$. The trace of $\{\mathbf{c}(\lambda)\}_{\lambda\geq 0}$
suffers from high variance due to data noise and finite-difference
approximation errors. We provide a set of sufficient conditions which guarantee
that, from a single trajectory denoised by a Local-Polynomial filter, the
support of $\mathbf{c}(\lambda)$ asymptotically converges to the true
signed-support associated with the underlying PDE, given sufficiently many data
points and a suitable range of $\lambda$. We also present various numerical
experiments that validate our theory.
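As a concrete illustration of the pipeline described above, the following is a minimal sketch (not the authors' implementation) of an $\ell_1$-PsLS-style workflow on a single noisy trajectory: denoise with a local-polynomial filter, approximate derivatives by finite differences, assemble a dictionary of candidate terms from the differentiation operators, and trace the coefficient path $\mathbf{c}(\lambda)$ by solving a Lasso problem of the assumed form $\hat{\mathbf{c}}(\lambda) \in \arg\min_{\mathbf{c}} \|\widehat{u_t} - F\mathbf{c}\|_2^2 + \lambda \|\mathbf{c}\|_1$ over a grid of $\lambda$. The Savitzky-Golay filter (used here as a stand-in for the paper's Local-Polynomial filter), the hand-picked candidate dictionary, scikit-learn's lasso_path, and the synthetic heat-equation data are all illustrative assumptions, not details taken from the paper.

```python
# Minimal, illustrative sketch of an l1-regularized Pseudo-Least Squares
# (l1-PsLS) style workflow. Assumptions (not from the paper): Savitzky-Golay
# smoothing as a stand-in for the Local-Polynomial filter, a hand-picked
# candidate dictionary, np.gradient finite differences, and scikit-learn's
# lasso_path to trace c(lambda).
import numpy as np
from scipy.signal import savgol_filter
from sklearn.linear_model import lasso_path


def identify_pde(u_noisy, dx, dt, lambdas):
    """Return (term names, lambda grid, coefficient path) for one noisy trajectory."""
    # 1) Denoise the trajectory with a local-polynomial (Savitzky-Golay) filter,
    #    smoothing along space (axis 0) and time (axis 1).
    u = savgol_filter(u_noisy, window_length=11, polyorder=3, axis=0)
    u = savgol_filter(u, window_length=11, polyorder=3, axis=1)

    # 2) Approximate derivatives of the denoised field by finite differences.
    u_t = np.gradient(u, dt, axis=1)
    u_x = np.gradient(u, dx, axis=0)
    u_xx = np.gradient(u_x, dx, axis=0)

    # 3) Assemble a dictionary of candidate terms generated by the
    #    differentiation operators (a small, hand-picked example).
    names = ["u", "u_x", "u_xx", "u*u_x"]
    F = np.column_stack([f.ravel() for f in (u, u_x, u_xx, u * u_x)])
    y = u_t.ravel()

    # 4) Solve the l1-regularized least-squares problem along the lambda grid;
    #    the support of c(lambda) is the estimated set of active PDE terms.
    alphas, coefs, _ = lasso_path(F, y, alphas=lambdas)
    return names, alphas, coefs  # coefs has shape (n_terms, n_lambdas)


if __name__ == "__main__":
    # Synthetic single trajectory of the heat equation u_t = 0.1 * u_xx
    # (purely illustrative; not data from the paper).
    x = np.linspace(0.0, 1.0, 128)
    t = np.linspace(0.0, 0.1, 64)
    X, T = np.meshgrid(x, t, indexing="ij")
    u_clean = np.exp(-0.1 * np.pi**2 * T) * np.sin(np.pi * X)
    u_noisy = u_clean + 0.01 * np.random.default_rng(0).standard_normal(u_clean.shape)

    names, alphas, coefs = identify_pde(
        u_noisy, dx=x[1] - x[0], dt=t[1] - t[0], lambdas=np.logspace(-4, -1, 25)
    )
    # Report the support at a mid-path lambda; ideally only "u_xx" survives.
    k = len(alphas) // 2
    support = [n for n, c in zip(names, coefs[:, k]) if abs(c) > 1e-3]
    print(f"lambda = {alphas[k]:.2e}, estimated support: {support}")
```

The paper's guarantee concerns exactly this $\lambda$-path: for sufficiently many data points and $\lambda$ in a suitable range, the support read off from $\mathbf{c}(\lambda)$ matches the true signed-support; the sketch above only illustrates how such a path is computed from one noisy trajectory.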
Related papers
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ of arbitrary link function with a sample and runtime complexity of $n \asymp T \asymp C(q) \cdot d$
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Further Understanding of a Local Gaussian Process Approximation: Characterising Convergence in the Finite Regime [1.3518297878940662]
We show that common choices of kernel functions for a highly accurate and massively scalable GPnn regression model exhibit gradual convergence to their limiting behaviour as the dataset size $n$ increases.
Similar bounds can be found under model misspecification and combined to give overall rates of convergence of both MSE and an important calibration metric.
arXiv Detail & Related papers (2024-04-09T10:47:01Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Exact Fractional Inference via Re-Parametrization & Interpolation
between Tree-Re-Weighted- and Belief Propagation- Algorithms [0.5348370085388683]
Inference efforts -- required to compute the partition function, $Z$, of an Ising model over a graph of $N$ spins -- are most likely exponential in $N$.
We show how to express $Z$ as a product, $\forall \lambda:\ Z = Z(\lambda)\,\mathcal{Z}(\lambda)$, where the multiplicative correction, $\mathcal{Z}(\lambda)$, is an expectation over a node-independent probability distribution built from node-wise fractional marginals.
arXiv Detail & Related papers (2023-01-25T00:50:28Z) - Near-optimal fitting of ellipsoids to random points [68.12685213894112]
A basic problem of fitting an ellipsoid to random points has connections to low-rank matrix decompositions, independent component analysis, and principal component analysis.
We resolve this conjecture up to logarithmic factors by constructing a fitting ellipsoid for some $n = \Omega(d^2/\mathrm{polylog}(d))$.
Our proof demonstrates feasibility of the least squares construction of Saunderson et al. using a convenient decomposition of a certain non-standard random matrix.
arXiv Detail & Related papers (2022-08-19T18:00:34Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Learning elliptic partial differential equations with randomized linear
algebra [2.538209532048867]
We show that one can construct an approximant to $G$ that converges almost surely.
The quantity $0 < \Gamma_\epsilon \leq 1$ characterizes the quality of the training dataset.
arXiv Detail & Related papers (2021-01-31T16:57:59Z) - Convergence of Sparse Variational Inference in Gaussian Processes
Regression [29.636483122130027]
We show that a method with an overall computational cost of $\mathcal{O}(N(\log N)^{2D}(\log\log N)^2)$ can be used to perform inference.
arXiv Detail & Related papers (2020-08-01T19:23:34Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.