Neural Networks for Tamed Milstein Approximation of SDEs with Additive Symmetric Jump Noise Driven by a Poisson Random Measure
- URL: http://arxiv.org/abs/2507.04417v2
- Date: Wed, 09 Jul 2025 12:33:51 GMT
- Title: Neural Networks for Tamed Milstein Approximation of SDEs with Additive Symmetric Jump Noise Driven by a Poisson Random Measure
- Authors: Jose-Hermenegildo Ramirez-Gonzalez, Ying Sun
- Abstract summary: We propose a framework that integrates the Tamed-Milstein scheme with neural networks employed as non-parametric function approximators. The proposed methodology constitutes a flexible alternative for inference in systems with state-dependent noise and discontinuities driven by Lévy processes.
- Score: 2.845817138242963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work aims to estimate the drift and diffusion functions in stochastic differential equations (SDEs) driven by a particular class of L\'evy processes with finite jump intensity, using neural networks. We propose a framework that integrates the Tamed-Milstein scheme with neural networks employed as non-parametric function approximators. Estimation is carried out non-parametrically for the drift function $f: \mathbb{Z} \to \mathbb{R}$ and the diffusion coefficient $g: \mathbb{Z} \to \mathbb{R}$. The model of interest is given by \[ dX(t) = f(X(t))\, dt + g(X(t))\, dW_t + \gamma \int_{\mathbb{Z}} z\, N(dt,dz), \qquad X(0) = \xi, \] where $W_t$ is a standard Brownian motion and $N(dt,dz)$ is a Poisson random measure on $(\mathbb{R}_{+} \times \mathbb{Z},\, \mathcal{B}(\mathbb{R}_{+}) \otimes \mathcal{Z},\, \lambda(\Lambda \otimes v))$, with $\lambda, \gamma > 0$, $\Lambda$ the Lebesgue measure on $\mathbb{R}_{+}$, and $v$ a finite measure on the measurable space $(\mathbb{Z}, \mathcal{Z})$. Using neural networks as function approximators enables the modeling of complex nonlinear dynamics without assuming restrictive functional forms. The proposed methodology constitutes a flexible alternative for inference in systems with state-dependent noise and discontinuities driven by L\'evy processes.
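To make the scheme concrete, here is a minimal, hypothetical PyTorch sketch of the idea, not the authors' implementation: two small networks play the roles of $f$ and $g$, and one simulation step combines the tamed drift increment, the Milstein correction (obtained by automatic differentiation of $g$), and a compound-Poisson jump term of finite intensity. The names (`MLP`, `tamed_milstein_step`, `one_step_loss`), the taming factor $1/(1 + \Delta t\,|f|)$, the Gaussian choice for the jump measure $v$, and the pseudo-likelihood loss are all illustrative assumptions.

```python
# Hypothetical sketch of a tamed Milstein step with neural drift/diffusion for
#   dX = f(X) dt + g(X) dW + gamma * int_Z z N(dt, dz).
# Assumptions (not from the paper): v = N(0, 1) jump sizes, a simple
# 1 / (1 + dt |f|) taming factor, and a Gaussian pseudo-likelihood loss.
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Small feed-forward approximator used for both f and g."""
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x.unsqueeze(-1)).squeeze(-1)

def tamed_milstein_step(x, f, g, dt, gam, lam_nu):
    """One tamed Milstein step; lam_nu is the total jump intensity
    lambda * v(Z), finite by assumption."""
    x = x.detach().requires_grad_(True)
    fx, gx = f(x), g(x)
    # g'(x) for the Milstein correction, via automatic differentiation.
    (dg,) = torch.autograd.grad(gx.sum(), x, create_graph=True)
    dW = dt ** 0.5 * torch.randn_like(x)
    drift = fx * dt / (1.0 + dt * fx.abs())        # taming bounds the increment
    correction = 0.5 * gx * dg * (dW ** 2 - dt)    # Milstein term
    # Compound-Poisson jumps: with v = N(0, 1), the sum of n i.i.d. jump
    # sizes is N(0, n), so one Gaussian draw scaled by sqrt(n) is exact.
    n = torch.poisson(torch.full_like(x, lam_nu * dt))
    jumps = gam * n.sqrt() * torch.randn_like(x)
    return x + drift + gx * dW + correction + jumps

def one_step_loss(x_path, f, g, dt):
    """Illustrative training signal: Gaussian pseudo-likelihood of observed
    one-step increments (the paper's estimation procedure may differ)."""
    x0 = x_path[:-1]
    dx = x_path[1:] - x0
    fx = f(x0)
    mean = fx * dt / (1.0 + dt * fx.abs())   # tamed one-step drift
    var = g(x0) ** 2 * dt + 1e-8             # Gaussian transition variance
    return ((dx - mean) ** 2 / var + var.log()).mean()
```

With `f = MLP()` and `g = MLP()`, an optimizer such as Adam on `one_step_loss` fits the coefficients, and forward simulation through `tamed_milstein_step` can then be checked against held-out trajectories.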
Related papers
- Learning quadratic neural networks in high dimensions: SGD dynamics and scaling laws [21.18373933718468]
We study the optimization and sample complexity of gradient-based training of a two-layer neural network with quadratic activation function in the high-dimensional regime. We present a sharp analysis of the dynamics in the feature learning regime, for both the population limit and the finite-sample discretization.
arXiv Detail & Related papers (2025-08-05T17:57:56Z) - Revolutionizing Fractional Calculus with Neural Networks: Voronovskaya-Damasclin Theory for Next-Generation AI Systems [0.0]
This work introduces rigorous convergence rates for neural network operators activated by symmetrized and hyperbolic perturbed functions. We extend classical approximation theory to fractional calculus via Caputo derivatives.
arXiv Detail & Related papers (2025-04-01T21:03:00Z) - $p$-Adic Polynomial Regression as Alternative to Neural Network for Approximating $p$-Adic Functions of Many Variables [55.2480439325792]
A regression model is constructed that allows approximating continuous functions with any degree of accuracy. The proposed model can be considered as a simple alternative to possible $p$-adic models based on neural network architecture.
arXiv Detail & Related papers (2025-03-30T15:42:08Z) - Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations [40.77319247558742]
We study the computational complexity of learning a target function $f_*:\mathbb{R}^d\to\mathbb{R}$ with additive structure.
We prove that a large subset of $f_*$ can be efficiently learned by gradient training of a two-layer neural network.
arXiv Detail & Related papers (2024-06-17T17:59:17Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle\right)$. We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametrized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Statistical Spatially Inhomogeneous Diffusion Inference [15.167120574781153]
Inferring a diffusion equation from discretely-observed measurements is a statistical challenge.
We propose neural network-based estimators of both the drift $\boldsymbol{b}$ and the spatially-inhomogeneous diffusion tensor $D = \Sigma\Sigma^{T}$.
arXiv Detail & Related papers (2023-12-10T06:52:50Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimensional modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z) - The Schrödinger equation for the Rosen-Morse type potential revisited
with applications [0.0]
We rigorously solve the time-independent Schrödinger equation for the Rosen-Morse type potential.
The resolution of this problem is used to show that the kinks of the non-linear Klein-Gordon equation with $\varphi^{2p+2}$ type potentials are stable.
arXiv Detail & Related papers (2023-04-12T18:43:39Z) - An Over-parameterized Exponential Regression [18.57735939471469]
Recent developments in the field of Large Language Models (LLMs) have sparked interest in the use of exponential activation functions.
We define the neural function $F: \mathbb{R}^{d \times m} \times \mathbb{R}^d \to \mathbb{R}$.
arXiv Detail & Related papers (2023-03-29T07:29:07Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Markovian Repeated Interaction Quantum Systems [0.0]
We study a class of dynamical semigroups $(\mathbb{L}_n)_{n \in \mathbb{N}}$ that emerge, by a Feynman-Kac type formalism, from a random quantum dynamical system.
As a physical application, we consider the case where the $\mathcal{L}_\omega$'s are the reduced dynamical maps describing the repeated interactions of a system with thermal probes.
arXiv Detail & Related papers (2022-02-10T20:52:40Z) - Deep Learning in High Dimension: Neural Network Approximation of
Analytic Functions in $L^2(\mathbb{R}^d,\gamma_d)$ [0.0]
We prove expression rates for analytic functions $f:\mathbb{R}^d\to\mathbb{R}$ in the norm of $L^2(\mathbb{R}^d,\gamma_d)$.
We consider in particular ReLU and ReLU$^k$ activations for integer $k \geq 2$.
As an application, we prove expression rate bounds of deep ReLU-NNs for response surfaces of elliptic PDEs with log-Gaussian random field inputs.
arXiv Detail & Related papers (2021-11-13T09:54:32Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $\mathbf{K}$ is independent of the distribution of the i.i.d. entries of $\mathbf{w}$.
We propose a novel random feature technique, called Ternary Random Feature (TRF).
The computation of the proposed random features requires no multiplication and a factor of $b$ fewer bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Learning stochastic dynamical systems with neural networks mimicking the
Euler-Maruyama scheme [14.436723124352817]
We propose a data-driven approach where the parameters of the SDE are represented by a neural network with a built-in SDE integration scheme (a minimal sketch of this idea appears after this list).
The algorithm is applied to the geometric Brownian motion and a version of the Lorenz-63 model.
arXiv Detail & Related papers (2021-05-18T11:41:34Z) - Interpolating Log-Determinant and Trace of the Powers of Matrix
$\mathbf{A} + t \mathbf{B}$ [1.5002438468152661]
We develop methods for the functions $t \mapsto \log\det\left(\mathbf{A} + t\,\mathbf{B}\right)$ and $t \mapsto \operatorname{trace}\left((\mathbf{A} + t\,\mathbf{B})^{p}\right)$, where the matrices $\mathbf{A}$ and $\mathbf{B}$ are Hermitian and positive (semi-)definite and $p$ and $t$ are real variables.
arXiv Detail & Related papers (2020-09-15T23:11:17Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
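As noted in the entry on networks mimicking the Euler-Maruyama scheme above, the integration scheme itself can sit inside the model's forward pass. Below is a minimal sketch in the same hypothetical PyTorch style; `em_step` is our name, and the plain Euler-Maruyama step here omits taming and jumps.

```python
import torch

def em_step(x, f, g, dt):
    """One Euler-Maruyama step for dX = f(X) dt + g(X) dW.

    With f and g given by neural networks, gradients of a loss on the
    simulated path flow back through the scheme to the network weights.
    """
    dW = dt ** 0.5 * torch.randn_like(x)
    return x + f(x) * dt + g(x) * dW
```

The same pattern generalizes: swapping `em_step` for the tamed Milstein step sketched earlier trades the simpler scheme for stability under superlinear coefficients and support for jumps.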
This list is automatically generated from the titles and abstracts of the papers on this site.