Newton-Puiseux Analysis for Interpretability and Calibration of Complex-Valued Neural Networks
- URL: http://arxiv.org/abs/2504.19176v2
- Date: Mon, 13 Oct 2025 20:27:15 GMT
- Title: Newton-Puiseux Analysis for Interpretability and Calibration of Complex-Valued Neural Networks
- Authors: Piotr Migus
- Abstract summary: Complex-valued neural networks (CVNNs) are suitable for handling phase-sensitive signals, including electrocardiography (ECG), radar/sonar, and wireless in-phase/quadrature (I/Q) streams. We present a Newton--Puiseux framework that examines the \emph{local} decision geometry of a trained CVNN by fitting a small, kink-aware surrogate. Our phase-aware analysis identifies sensitive directions and improves Expected Calibration Error in two case studies beyond a controlled $\C^2$ synthetic benchmark.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Complex-valued neural networks (CVNNs) are particularly suitable for handling phase-sensitive signals, including electrocardiography (ECG), radar/sonar, and wireless in-phase/quadrature (I/Q) streams. Nevertheless, their \emph{interpretability} and \emph{probability calibration} remain insufficiently investigated. In this work, we present a Newton--Puiseux framework that examines the \emph{local decision geometry} of a trained CVNN by (i) fitting a small, kink-aware polynomial surrogate to the \emph{logit difference} in the vicinity of uncertain inputs, and (ii) factorizing this surrogate using Newton--Puiseux expansions to derive analytic branch descriptors, including exponents, multiplicities, and orientations. These descriptors provide phase-aligned directions that induce class flips in the original network and allow for a straightforward, \emph{multiplicity-guided} temperature adjustment for improved calibration. We outline assumptions and diagnostic measures under which the surrogate proves informative and characterize potential failure modes arising from piecewise-holomorphic activations (e.g., modReLU). Our phase-aware analysis identifies sensitive directions and enhances Expected Calibration Error in two case studies beyond a controlled $\C^2$ synthetic benchmark -- namely, the MIT--BIH arrhythmia (ECG) dataset and RadioML 2016.10a (wireless modulation) -- when compared to uncalibrated softmax and standard post-hoc baselines. We also present confidence intervals, non-parametric tests, and quantify sensitivity to inaccuracies in estimating branch multiplicity. Crucially, this method requires no modifications to the architecture and applies to any CVNN with complex logits transformed to real moduli.
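The abstract's two steps can be sketched concretely: read the leading Puiseux exponents of a local bivariate polynomial surrogate off its Newton polygon, then use a multiplicity-style quantity to temper the softmax. The Newton-polygon slope computation below is the standard construction; the temperature rule (use the ramification index, i.e. the denominator of the leading exponent) is a hypothetical stand-in for the paper's multiplicity-guided adjustment, and the toy logits and labels are invented for illustration.

```python
import math
from fractions import Fraction

def newton_polygon_exponents(monomials):
    """Leading Puiseux exponents mu of branches y ~ c * x**mu of
    F(x, y) = sum_{(i, j)} c_ij * x**i * y**j = 0, read off as the
    negated slopes of the lower convex hull of the points (j, i)."""
    # keep only the lowest x-exponent per y-power (all the hull needs)
    cols = {}
    for (i, j), c in monomials.items():
        if c != 0:
            cols[j] = min(i, cols.get(j, i))
    pts = sorted(cols.items())
    hull = []  # lower hull via Andrew's monotone chain
    for p in pts:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) <= 0:
                hull.pop()  # not a right turn: point is below, drop middle
            else:
                break
        hull.append(p)
    return [Fraction(y1 - y2, x2 - x1)
            for (x1, y1), (x2, y2) in zip(hull, hull[1:])]

def ece(class1_probs, labels, n_bins=10):
    """Standard binned Expected Calibration Error for a binary task."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(class1_probs, labels):
        conf = max(p, 1.0 - p)          # confidence of the predicted class
        pred = 1 if p >= 0.5 else 0
        bins[min(int(conf * n_bins), n_bins - 1)].append((conf, float(pred == y)))
    n = len(class1_probs)
    return sum(len(b) / n * abs(sum(c for c, _ in b) / len(b)
                                - sum(a for _, a in b) / len(b))
               for b in bins if b)

# Toy surrogate F(x, y) = y^2 - x^3 near an uncertain input: branch pair
# y = +/- x^(3/2), i.e. leading exponent 3/2 with ramification index 2.
mus = newton_polygon_exponents({(0, 2): 1.0, (3, 0): -1.0})
T = max(mu.denominator for mu in mus)  # hypothetical multiplicity-guided temperature

# Invented, overconfident logit differences: ~70% accurate but
# sigmoid(4.0) ~ 0.98 confident; dividing logits by T softens this.
logit_diffs = [4.0] * 10
labels = [1] * 7 + [0] * 3
before = ece([1 / (1 + math.exp(-d)) for d in logit_diffs], labels)
after = ece([1 / (1 + math.exp(-d / T)) for d in logit_diffs], labels)
```

With these invented numbers, `T == 2` and the tempered ECE drops from about 0.28 to about 0.18; on a real CVNN the surrogate would first be fitted to the logit difference around each uncertain input, as the abstract describes.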
Related papers
- Equivariant Evidential Deep Learning for Interatomic Potentials [55.6997213490859]
Uncertainty quantification (UQ) is critical for assessing the reliability of machine learning interatomic potentials (MLIPs) in molecular dynamics simulations. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. We propose \emph{Equivariant Evidential Deep Learning for Interatomic Potentials} ($e^2$IP), a backbone-agnostic framework that models atomic forces and their uncertainty jointly.
arXiv Detail & Related papers (2026-02-11T02:00:25Z) - Generalizing GNNs with Tokenized Mixture of Experts [75.8310720413187]
We show that improving stability requires reducing reliance on shift-sensitive features, leaving an irreducible worst-case generalization floor. We propose STEM-GNN, a pretrain-then-finetune framework with a mixture-of-experts encoder for diverse computation paths. Across nine node, link, and graph benchmarks, STEM-GNN achieves a stronger three-way balance, improving robustness to degree/homophily shifts and to feature/edge corruptions while remaining competitive on clean graphs.
arXiv Detail & Related papers (2026-02-09T22:48:30Z) - From Evaluation to Design: Using Potential Energy Surface Smoothness Metrics to Guide Machine Learning Interatomic Potential Architectures [12.68400434984463]
Machine learning interatomic potentials (MLIPs) fail to reproduce the physical smoothness of the quantum potential energy surface. Existing evaluations, such as microcanonical molecular dynamics, are computationally expensive and primarily probe near-equilibrium states. We introduce the Bond Smoothness Characterization Test (BSCT) to improve evaluation metrics for MLIPs.
arXiv Detail & Related papers (2026-02-04T18:50:10Z) - Micro-Macro Tensor Neural Surrogates for Uncertainty Quantification in Collisional Plasma [3.7863228436382013]
Plasma equations exhibit pronounced sensitivity to microscopic perturbations in model parameters and data. The cost of uncertainty sampling, the high-dimensional phase space, and multiscale stiffness pose severe challenges to both computational efficiency and error control. We present a variance-reduced Monte Carlo framework for UQ in which neural network surrogates replace the costly evaluations of the Landau collision term.
arXiv Detail & Related papers (2025-12-30T13:07:35Z) - Ising on the donut: Regimes of topological quantum error correction from statistical mechanics [0.0]
Utility-scale quantum computers require quantum error correcting codes with large numbers of physical qubits to achieve sufficiently low logical error rates. Here we exploit an exact mapping, from a toric code under bit-flip noise that is post-selected on being syndrome free to the exactly-solvable two-dimensional Ising model on a torus, to derive an analytic solution for the logical failure rate.
arXiv Detail & Related papers (2025-12-11T08:06:23Z) - Deterministic Coreset Construction via Adaptive Sensitivity Trimming [0.2864713389096699]
We develop a framework for deterministic coreset construction in empirical risk minimization. Our central contribution is the Adaptive Deterministic Uniform-Weight Trimming (ADUWT) algorithm. We conclude with open problems on instance-optimal oracles, deterministic streaming, and fairness-constrained ERM.
arXiv Detail & Related papers (2025-08-25T17:19:13Z) - Non-Hermitian Quantum Metrology Enhancement and Skin Effect Suppression in PT-Symmetric Bardeen-Cooper-Schrieffer Chains [0.0]
We outline a theoretical framework for quantum metrology in non-Hermitian systems. Through biorthogonal quantum Fisher information analysis, we identify two distinct regimes: the non-Hermitian skin effect (NHSE) suppresses sensitivity exponentially, while $\mathcal{PT}$-symmetry enables Heisenberg-limited enhancement.
arXiv Detail & Related papers (2025-08-06T18:54:45Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the client-server architecture. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Approximation Bounds for Transformer Networks with Application to Regression [9.549045683389085]
We explore the approximation capabilities of Transformer networks for Hölder and Sobolev functions.
We establish novel upper bounds for standard Transformer networks approximating sequence-to-sequence mappings.
We show that if the self-attention layer in a Transformer can perform column averaging, the network can approximate sequence-to-sequence Hölder functions.
arXiv Detail & Related papers (2025-04-16T15:25:58Z) - Uncertainty Quantification From Scaling Laws in Deep Neural Networks [0.0]
Quantifying uncertainty from machine learning analyses is critical to their use in the physical sciences. We compute the mean $\mu_{\mathcal{L}}$ and variance $\sigma_{\mathcal{L}}$ for an ensemble of multi-layer perceptrons. We compare empirically to the results from finite-width networks for three example tasks: MNIST classification, CIFAR classification, and calorimeter energy regression.
arXiv Detail & Related papers (2025-03-07T21:15:11Z) - Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [55.95767828747407]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator incorporating the loss and sampling procedure through a method we call Orbit Diffusion.
arXiv Detail & Related papers (2025-02-14T03:26:57Z) - Theoretical limits of descending $\ell_0$ sparse-regression ML algorithms [0.0]
We develop a generic analytical program for studying the performance of \emph{maximum-likelihood} (ML) decoding.
A key ML performance parameter, the residual \emph{root mean square error} (RMSE), is uncovered to exhibit the so-called \emph{phase-transition} (PT) phenomenon.
Concrete implementation and practical relevance of the fl RDT typically rely on the ability to conduct a sizeable set of the underlying numerical evaluations.
arXiv Detail & Related papers (2024-10-10T06:33:41Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Risk Bounds for Mixture Density Estimation on Compact Domains via the $h$-Lifted Kullback--Leibler Divergence [2.8074364079901017]
We introduce the $h$-lifted Kullback--Leibler (KL) divergence as a generalization of the standard KL divergence. We develop a procedure for the computation of the corresponding maximum $h$-lifted likelihood estimators.
arXiv Detail & Related papers (2024-04-19T02:31:34Z) - Tighter Learning Guarantees on Digital Computers via Concentration of Measure on Finite Spaces [7.373617024876726]
We derive a family of generalization bounds $\{c_m/N^{1/(2\vee m)}\}_{m=1}^{\infty}$ tailored for learning models on digital computers. Adjusting the parameter $m$ according to $N$ results in significantly tighter generalization bounds for practical sample sizes $N$. Our family of generalization bounds is formulated based on our new non-asymptotic result for concentration of measure in finite metric spaces.
arXiv Detail & Related papers (2024-02-08T11:23:11Z) - Last-Iterate Convergence of Adaptive Riemannian Gradient Descent for Equilibrium Computation [52.73824786627612]
This paper establishes new convergence results for \emph{geodesic strongly monotone} games. Our key result shows that RGD attains last-iterate linear convergence in a \emph{geometry-agnostic} fashion. Overall, this paper presents the first geometry-agnostic last-iterate convergence analysis for games beyond the Euclidean setting.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Generalization and Stability of Interpolating Neural Networks with
Minimal Width [37.908159361149835]
We investigate the generalization and optimization of shallow neural networks trained by gradient descent in the interpolating regime.
We prove that the training loss can be minimized with $m=\Omega(\log^4(n))$ neurons and $T\approx n$ iterations.
With $m=\Omega(\log^4(n))$ neurons and $T\approx n$, we bound the test loss by $\tilde{O}(1/)$.
arXiv Detail & Related papers (2023-02-18T05:06:15Z) - Improved techniques for deterministic l2 robustness [63.34032156196848]
Training convolutional neural networks (CNNs) with a strict 1-Lipschitz constraint under the $l_2$ norm is useful for adversarial robustness, interpretable gradients and stable training.
We introduce a procedure to certify robustness of 1-Lipschitz CNNs by replacing the last linear layer with a 1-hidden-layer MLP.
We significantly advance the state-of-the-art for standard and provable robust accuracies on CIFAR-10 and CIFAR-100.
arXiv Detail & Related papers (2022-11-15T19:10:12Z) - Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural
Networks on Coupled Ordinary Differential Equations [64.78260098263489]
In this work, we assess the ability of physics-informed neural networks (PINNs) to solve increasingly-complex coupled ordinary differential equations (ODEs).
We show that PINNs eventually fail to produce correct solutions to these benchmarks as their complexity increases.
We identify several reasons why this may be the case, including insufficient network capacity, poor conditioning of the ODEs, and high local curvature, as measured by the Laplacian of the PINN loss.
arXiv Detail & Related papers (2022-10-14T15:01:32Z) - Deep subspace encoders for continuous-time state-space identification [0.0]
Continuous-time (CT) models have shown an improved sample efficiency during learning.
The multifaceted CT state-space model identification problem remains to be solved in full.
This paper presents a novel estimation method that includes these aspects and that is able to obtain state-of-the-art results.
arXiv Detail & Related papers (2022-04-20T11:55:17Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption.
They can suffer from ill-posedness and convergence instability.
This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z) - A New Framework for Variance-Reduced Hamiltonian Monte Carlo [88.84622104944503]
We propose a new framework of variance-reduced Hamiltonian Monte Carlo (HMC) methods for sampling from an $L$-smooth and $m$-strongly log-concave distribution.
We show that HMC methods based on unbiased gradient estimators, including SAGA and SVRG, achieve the highest gradient efficiency with small batch sizes.
Experimental results on both synthetic and real-world benchmark data show that our new framework significantly outperforms the full gradient and stochastic gradient HMC approaches.
arXiv Detail & Related papers (2021-02-09T02:44:24Z) - NeCPD: An Online Tensor Decomposition with Optimal Stochastic Gradient
Descent [1.0953917735844645]
We propose a new efficient decomposition algorithm, NeCPD, for the non-convex problem of multi-way online tensor data.
We further apply this method in the field of real-life monitoring using structural datasets.
arXiv Detail & Related papers (2020-03-18T04:44:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.