On Emergences of Non-Classical Statistical Characteristics in Classical Neural Networks
- URL: http://arxiv.org/abs/2603.04451v1
- Date: Fri, 27 Feb 2026 06:56:25 GMT
- Title: On Emergences of Non-Classical Statistical Characteristics in Classical Neural Networks
- Authors: Hanyu Zhao, Yang Wu, Yuexian Hou
- Abstract summary: The Non-Classical Network (NCnet) is a simple classical neural architecture that stably exhibits non-classical statistical behaviors. We find that non-classicality, measured by the $S$ statistic of the CHSH inequality, arises from gradient competitions among hidden-layer neurons shared across multiple tasks.
- Score: 19.89817204415622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by measurement incompatibility and Bell-family inequalities in quantum mechanics, we propose the Non-Classical Network (NCnet), a simple classical neural architecture that stably exhibits non-classical statistical behaviors under typical and interpretable experimental setups. We find that non-classicality, measured by the $S$ statistic of the CHSH inequality, arises from gradient competitions among hidden-layer neurons shared across multiple tasks. Remarkably, even without physical links supporting explicit communication, one task head can implicitly sense the training task of other task heads via local loss oscillations, leading to non-local correlations in their training outcomes. Specifically, in the low-resource regime, the value of $S$ increases gradually with resources and approaches its classical upper bound of 2, which implies that underfitting is alleviated as resources increase. As the model nears the critical scale required for adequate performance, $S$ may temporarily exceed 2. As resources continue to grow, $S$ asymptotically decays back to 2 and fluctuates around it. Empirically, when model capacity is insufficient, $S$ is positively correlated with generalization performance, and the regime where $S$ first approaches 2 often corresponds to good generalization. Overall, our results suggest that non-classical statistics can provide a novel perspective for understanding the internal interactions and training dynamics of deep networks.
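To make the $S$ statistic concrete, here is a minimal sketch of how CHSH correlations are estimated from binarized outcomes. The mapping of task-head outputs to $\pm 1$ outcomes, the two settings per head, and all function names are illustrative assumptions, not the paper's experimental protocol:

```python
import numpy as np

def correlator(x, y):
    # Empirical correlation E[x*y] for arrays of +/-1 outcomes.
    return float(np.mean(x * y))

def chsh_s(outcomes):
    # outcomes[(i, j)] holds a pair (x, y) of +/-1 outcome arrays recorded
    # when head A uses setting i and head B uses setting j, with i, j in {0, 1}.
    # Local hidden-variable (classical) models bound |S| <= 2; quantum
    # mechanics allows up to 2*sqrt(2), the Tsirelson bound.
    E = {key: correlator(x, y) for key, (x, y) in outcomes.items()}
    return E[(0, 0)] + E[(0, 1)] + E[(1, 0)] - E[(1, 1)]

# Toy usage: independent random outcomes give S close to 0.
rng = np.random.default_rng(0)
outcomes = {(i, j): (rng.choice([-1, 1], size=1000),
                     rng.choice([-1, 1], size=1000))
            for i in (0, 1) for j in (0, 1)}
print(chsh_s(outcomes))
```

Under this convention, a transient $S > 2$ near the critical model scale is precisely a violation of the classical bound described in the abstract.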
Related papers
- Hybrid Quantum-Classical Neural Networks for Few-Shot Credit Risk Assessment [52.05742536403784]
This work tackles the challenge of few-shot credit risk assessment. We design and implement a novel hybrid quantum-classical workflow. A Quantum Neural Network (QNN) was trained via the parameter-shift rule. On a real-world credit dataset of 279 samples, our QNN achieved a robust average AUC of 0.852 +/- 0.027 in simulations and yielded an impressive AUC of 0.88 in the hardware experiment.
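As background for the training method mentioned above, the parameter-shift rule recovers exact gradients of a circuit expectation value from two shifted evaluations per parameter. The sketch below is a classical emulation under the standard assumption that each parameter enters through a gate generated by an operator with eigenvalues $\pm 1/2$; the `expectation` callable and the toy cosine model are illustrative, not this paper's circuit:

```python
import numpy as np

def parameter_shift_grad(expectation, theta, shift=np.pi / 2):
    # Parameter-shift rule: d<O>/d theta_k
    #   = ( <O>(theta + s*e_k) - <O>(theta - s*e_k) ) / 2  with s = pi/2,
    # exact for gates generated by operators with eigenvalues +/- 1/2
    # (e.g. single-qubit Pauli rotations).
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        plus, minus = theta.copy(), theta.copy()
        plus[k] += shift
        minus[k] -= shift
        grad[k] = 0.5 * (expectation(plus) - expectation(minus))
    return grad

# Toy stand-in for a circuit expectation: <O>(theta) = prod_k cos(theta_k).
toy_expectation = lambda t: float(np.cos(t).prod())

theta = np.array([0.3, 1.2])
print(parameter_shift_grad(toy_expectation, theta))
# Matches the analytic gradient (-sin(t0)*cos(t1), -cos(t0)*sin(t1)).
```

Unlike finite differences, the shift here is macroscopic (pi/2), which makes the estimator well suited to noisy hardware evaluations.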
arXiv Detail & Related papers (2025-09-17T08:36:05Z) - Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks [10.591718074748895]
We study the learning dynamics of large two-layer neural networks via dynamical mean-field theory. For large network width $m$ and a large number of samples per input dimension $n/d$, the training dynamics exhibit a separation of timescales.
arXiv Detail & Related papers (2025-02-28T17:45:26Z) - Learning non-ideal genuine network nonlocality using causally inferred Bayesian neural network algorithms [0.688204255655161]
We introduce a scalable, causally-inferred Bayesian learning framework called the LHV layered neural network. We show that machine learning approaches with foundational domain-specific constraints can greatly benefit the field of quantum foundations.
arXiv Detail & Related papers (2025-01-14T12:45:47Z) - Estimating the volumes of correlations sets in causal networks [0.41942958779358674]
Causal networks beyond the one in the paradigmatic Bell theorem can lead to new kinds and applications of non-classicality.
We show that the most disseminated tool in the community is unable to detect a significant portion of the non-classical behaviors.
We also show that the use of interventions, a central tool in causal inference, can substantially improve our ability to witness non-classicality.
arXiv Detail & Related papers (2023-11-14T22:35:57Z) - On Excess Risk Convergence Rates of Neural Network Classifiers [8.329456268842227]
We study the performance of plug-in classifiers based on neural networks in a binary classification setting as measured by their excess risks.
We analyze the estimation and approximation properties of neural networks to obtain a dimension-free, uniform rate of convergence.
arXiv Detail & Related papers (2023-09-26T17:14:10Z) - High-Level Parallelism and Nested Features for Dynamic Inference Cost and Top-Down Attention [4.051316555028782]
This paper introduces a novel network topology that seamlessly integrates dynamic inference cost with a top-down attention mechanism.
Drawing inspiration from human perception, we combine sequential processing of generic low-level features with parallelism and nesting of high-level features.
In terms of dynamic inference cost, our methodology can exclude up to 73.48% of parameters and use 84.41% fewer giga-multiply-accumulate (GMAC) operations.
arXiv Detail & Related papers (2023-08-09T08:49:29Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification [83.20479832949069]
Quantum neural networks (QNNs) have become an important tool for understanding the physical world, but their advantages and limitations are not fully understood.
Here we investigate the problem-dependent power of QNNs on multi-class classification tasks.
Our work sheds light on the problem-dependent power of QNNs and offers a practical tool for evaluating their potential merit.
arXiv Detail & Related papers (2022-12-29T10:46:40Z) - On the generalization of learning algorithms that do not converge [54.122745736433856]
Generalization analyses of deep learning typically assume that the training converges to a fixed point.
Recent results indicate that in practice, the weights of deep neural networks optimized with gradient descent often oscillate indefinitely.
arXiv Detail & Related papers (2022-08-16T21:22:34Z) - The dilemma of quantum neural networks [63.82713636522488]
We show that quantum neural networks (QNNs) fail to provide any benefit over classical learning models.
QNNs suffer from severely limited effective model capacity, which incurs poor generalization on real-world datasets.
These results force us to rethink the role of current QNNs and to design novel protocols for solving real-world problems with quantum advantages.
arXiv Detail & Related papers (2021-06-09T10:41:47Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)