Related papers: Machine learning approach to single-shot multiparameter estimation for the non-linear Schrödinger equation

Machine learning approach to single-shot multiparameter estimation for the non-linear Schrödinger equation

URL: http://arxiv.org/abs/2509.18479v1
Date: Tue, 23 Sep 2025 00:32:37 GMT
Title: Machine learning approach to single-shot multiparameter estimation for the non-linear Schrödinger equation
Authors: Louis Rossignol, Tangui Aladjidi, Myrann Baker-Rasooli, Quentin Glorieux,
Abstract summary: We train a neural network to invert the nonlinear Schr"odinger equation mapping.<n>Our model achieves a mean absolute error of $3.22%$ on 12,500 unseen test samples.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The nonlinear Schr\"odinger equation (NLSE) is a fundamental model for wave dynamics in nonlinear media ranging from optical fibers to Bose-Einstein condensates. Accurately estimating its parameters, which are often strongly correlated, from a single measurement remains a significant challenge. We address this problem by treating parameter estimation as an inverse problem and training a neural network to invert the NLSE mapping. We combine a fast numerical solver with a machine learning approach based on the ConvNeXt architecture and a multivariate Gaussian negative log-likelihood loss function. From single-shot field (density and phase) images, our model estimates three key parameters: the nonlinear coefficient $n_2$, the saturation intensity $I_{sat}$, and the linear absorption coefficient $\alpha$. Trained on 100,000 simulated images, the model achieves a mean absolute error of $3.22\%$ on 12,500 unseen test samples, demonstrating strong generalization and close agreement with ground-truth values. This approach provides an efficient route for characterizing nonlinear systems and has the potential to bridge theoretical modeling and experimental data when realistic noise is incorporated.

Related papers

Gradient Descent as a Perceptron Algorithm: Understanding Dynamics and Implicit Acceleration [67.12978375116599]
We show that the steps of gradient descent (GD) reduce to those of generalized perceptron algorithms.<n>This helps explain the optimization dynamics and the implicit acceleration phenomenon observed in neural networks.
arXiv Detail & Related papers (2025-12-12T14:16:35Z)
LaPON: A Lagrange's-mean-value-theorem-inspired operator network for solving PDEs and its application on NSE [8.014720523981385]
We propose LaPON, an operator network inspired by the Lagrange's mean value theorem.<n>It embeds prior knowledge directly into the neural network architecture instead of the loss function.<n>LaPON provides a scalable and reliable solution for high-fidelity fluid dynamics simulation.
arXiv Detail & Related papers (2025-05-18T10:45:17Z)
Coarse Graining with Neural Operators for Simulating Chaotic Systems [78.64101336150419]
Predicting the long-term behavior of chaotic systems is crucial for various applications such as climate modeling.<n>An alternative approach to such a full-resolved simulation is using a coarse grid and then correcting its errors through a temporalittext model.<n>We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation.
arXiv Detail & Related papers (2024-08-09T17:05:45Z)
A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametricized two-layer neural networks. We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks. Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by the magnitude of $O(alpha-1)$, measured in terms of the Wasserstein distance.
arXiv Detail & Related papers (2024-04-18T16:46:08Z)
Distribution learning via neural differential equations: a nonparametric statistical perspective [1.4436965372953483]
This work establishes the first general statistical convergence analysis for distribution learning via ODE models trained through likelihood transformations. We show that the latter can be quantified via the $C1$-metric entropy of the class $mathcal F$. We then apply this general framework to the setting of $Ck$-smooth target densities, and establish nearly minimax-optimal convergence rates for two relevant velocity field classes $mathcal F$: $Ck$ functions and neural networks.
arXiv Detail & Related papers (2023-09-03T00:21:37Z)
Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data. In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
Neural Inference of Gaussian Processes for Time Series Data of Quasars [72.79083473275742]
We introduce a new model that enables it to describe quasar spectra completely. We also introduce a new method of inference of Gaussian process parameters, which we call $textitNeural Inference$. The combination of both the CDRW model and Neural Inference significantly outperforms the baseline DRW and MLE.
arXiv Detail & Related papers (2022-11-17T13:01:26Z)
Git Re-Basin: Merging Models modulo Permutation Symmetries [3.5450828190071655]
We show how simple algorithms can be used to fit large networks in practice. We demonstrate the first (to our knowledge) demonstration of zero mode connectivity between independently trained models. We also discuss shortcomings in the linear mode connectivity hypothesis.
arXiv Detail & Related papers (2022-09-11T10:44:27Z)
A variational neural network approach for glacier modelling with nonlinear rheology [1.4438155481047366]
We first formulate the solution of non-Newtonian ice flow model into the minimizer of a variational integral with boundary constraints. The solution is then approximated by a deep neural network whose loss function is the variational integral plus soft constraint from the mixed boundary conditions. To address instability in real-world scaling, we re-normalize the input of the network at the first layer and balance the regularizing factors for each individual boundary.
arXiv Detail & Related papers (2022-09-05T18:23:59Z)
Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI. We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells. We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
Improved Prediction and Network Estimation Using the Monotone Single Index Multi-variate Autoregressive Model [34.529641317832024]
We develop a semi-parametric approach based on the monotone single-index multi-variate autoregressive model (SIMAM) We provide theoretical guarantees for dependent data and an alternating projected gradient descent algorithm. We demonstrate the superior performance both on simulated data and two real data examples.
arXiv Detail & Related papers (2021-06-28T12:32:29Z)
Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand. We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice. Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.