Drift Estimation for Diffusion Processes Using Neural Networks Based on Discretely Observed Independent Paths
- URL: http://arxiv.org/abs/2511.11161v1
- Date: Fri, 14 Nov 2025 10:56:52 GMT
- Title: Drift Estimation for Diffusion Processes Using Neural Networks Based on Discretely Observed Independent Paths
- Authors: Yuzhen Zhao, Yating Liu, Marc Hoffmann,
- Abstract summary: This paper addresses the nonparametric estimation of the drift function over a compact domain for a time-homogeneous diffusion process.<n>We propose a neural network-based estimator and derive a non-asymptotic convergence rate, decomposed into a training error, an approximation error, and a diffusion-related term scaling as $log N/N$.
- Score: 10.214978140001858
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper addresses the nonparametric estimation of the drift function over a compact domain for a time-homogeneous diffusion process, based on high-frequency discrete observations from $N$ independent trajectories. We propose a neural network-based estimator and derive a non-asymptotic convergence rate, decomposed into a training error, an approximation error, and a diffusion-related term scaling as ${\log N}/{N}$. For compositional drift functions, we establish an explicit rate. In the numerical experiments, we consider a drift function with local fluctuations generated by a double-layer compositional structure featuring local oscillations, and show that the empirical convergence rate becomes independent of the input dimension $d$. Compared to the $B$-spline method, the neural network estimator achieves better convergence rates and more effectively captures local features, particularly in higher-dimensional settings.
Related papers
- Plug-In Classification of Drift Functions in Diffusion Processes Using Neural Networks [10.520846698070818]
We study a supervised multiclass classification problem for diffusion processes, where each class is characterized by a distinct drift function and trajectories are observed at discrete times.<n>We propose a neural network-based plug-in classifier that estimates the drift functions for each class from independent sample paths and assigns labels based on a Bayes-type decision rule.
arXiv Detail & Related papers (2026-02-02T20:48:01Z) - Low-Dimensional Adaptation of Rectified Flow: A New Perspective through the Lens of Diffusion and Stochastic Localization [59.04314685837778]
Rectified flow (RF) has gained considerable popularity due to its generation efficiency and state-of-the-art performance.<n>In this paper, we investigate the degree to which RF automatically adapts to the intrinsic low dimensionality of the support of the target distribution to accelerate sampling.<n>We show that, using a carefully designed choice of the time-discretization scheme and with sufficiently accurate drift estimates, the RF sampler enjoys an complexity of order $O(k/varepsilon)$.
arXiv Detail & Related papers (2026-01-21T22:09:27Z) - Inference-Time Alignment for Diffusion Models via Doob's Matching [16.416975860645724]
Inference-time alignment for diffusion models aims to adapt a pre-trained diffusion model toward a target distribution without retraining the base score network.<n>We introduce Doob's matching, a novel framework for guidance estimation grounded in Doob's $h$-transform.<n>We prove non-asymptotic convergence guarantees for the generated distributions in the 2-Wasserstein distance.
arXiv Detail & Related papers (2026-01-10T10:28:06Z) - Generative Modeling with Continuous Flows: Sample Complexity of Flow Matching [60.37045080890305]
We provide the first analysis of the sample complexity for flow-matching based generative models.<n>We decompose the velocity field estimation error into neural-network approximation error, statistical error due to the finite sample size, and optimization error due to the finite number of optimization steps for estimating the velocity field.
arXiv Detail & Related papers (2025-12-01T05:14:25Z) - Semi-Implicit Functional Gradient Flow for Efficient Sampling [30.32233517392456]
We propose a functional gradient ParVI method that uses perturbed particles with Gaussian noise as the approximation family.<n>We show that the corresponding functional gradient flow, which can be estimated via denoising score matching with neural networks, exhibits strong theoretical convergence guarantees.<n>In addition, we present an adaptive version of our method that automatically selects the appropriate noise magnitude during sampling.
arXiv Detail & Related papers (2024-10-23T15:00:30Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametricized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(alpha-1)$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $mathcalO( ln(T) / T 1 - frac1alpha ).
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Analyzing Neural Network-Based Generative Diffusion Models through Convex Optimization [45.72323731094864]
We present a theoretical framework to analyze two-layer neural network-based diffusion models.
We prove that training shallow neural networks for score prediction can be done by solving a single convex program.
Our results provide a precise characterization of what neural network-based diffusion models learn in non-asymptotic settings.
arXiv Detail & Related papers (2024-02-03T00:20:25Z) - A Neural Network-Based Enrichment of Reproducing Kernel Approximation
for Modeling Brittle Fracture [0.0]
An improved version of the neural network-enhanced Reproducing Kernel Particle Method (NN-RKPM) is proposed for modeling brittle fracture.
The effectiveness of the proposed method is demonstrated by a series of numerical examples involving damage propagation and branching.
arXiv Detail & Related papers (2023-07-04T21:52:09Z) - Learning Discretized Neural Networks under Ricci Flow [48.47315844022283]
We study Discretized Neural Networks (DNNs) composed of low-precision weights and activations.<n>DNNs suffer from either infinite or zero gradients due to the non-differentiable discrete function during training.
arXiv Detail & Related papers (2023-02-07T10:51:53Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
convergence rate analysis of the mean field Langevin dynamics is presented.
$p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
arXiv Detail & Related papers (2022-01-25T17:13:56Z) - Neural Estimation of Statistical Divergences [24.78742908726579]
A modern method for estimating statistical divergences relies on parametrizing an empirical variational form by a neural network (NN)
In particular, there is a fundamental tradeoff between the two sources of error involved: approximation and empirical estimation.
We show that neural estimators with a slightly different NN growth-rate are near minimax rate-optimal, achieving the parametric convergence rate up to logarithmic factors.
arXiv Detail & Related papers (2021-10-07T17:42:44Z) - Asymptotic convergence rate of Dropout on shallow linear neural networks [0.0]
We analyze the convergence on objective functions induced by Dropout and Dropconnect, when applying them to shallow linear Neural Networks.
We obtain a local convergence proof of the gradient flow and a bound on the rate that depends on the data, the rate probability, and the width of the NN.
arXiv Detail & Related papers (2020-12-01T19:02:37Z) - Understanding Approximate Fisher Information for Fast Convergence of
Natural Gradient Descent in Wide Neural Networks [13.572168969227011]
Natural Gradient Descent (NGD) helps to accelerate the convergence of descent gradient dynamics.
It requires approximations in large-scale deep neural networks because of its high computational cost.
Empirical studies have confirmed that some NGD methods with approximate Fisher information converge sufficiently fast in practice.
arXiv Detail & Related papers (2020-10-02T09:12:07Z) - Optimal Rates for Averaged Stochastic Gradient Descent under Neural
Tangent Kernel Regime [50.510421854168065]
We show that the averaged gradient descent can achieve the minimax optimal convergence rate.
We show that the target function specified by the NTK of a ReLU network can be learned at the optimal convergence rate.
arXiv Detail & Related papers (2020-06-22T14:31:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.