On Emergences of Non-Classical Statistical Characteristics in Classical Neural Networks
- URL: http://arxiv.org/abs/2603.04451v1
- Date: Fri, 27 Feb 2026 06:56:25 GMT
- Title: On Emergences of Non-Classical Statistical Characteristics in Classical Neural Networks
- Authors: Hanyu Zhao, Yang Wu, Yuexian Hou
- Abstract summary: The Non-Classical Network (NCnet) is a simple classical neural architecture that stably exhibits non-classical statistical behaviors. We find that non-classicality, measured by the $S$ statistic of the CHSH inequality, arises from gradient competitions among hidden-layer neurons shared across multiple tasks.
- Score: 19.89817204415622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by measurement incompatibility and Bell-family inequalities in quantum mechanics, we propose the Non-Classical Network (NCnet), a simple classical neural architecture that stably exhibits non-classical statistical behaviors under typical and interpretable experimental setups. We find that non-classicality, measured by the $S$ statistic of the CHSH inequality, arises from gradient competitions among hidden-layer neurons shared across multiple tasks. Remarkably, even without physical links supporting explicit communication, one task head can implicitly sense the training task of other task heads via local loss oscillations, leading to non-local correlations in their training outcomes. Specifically, in the low-resource regime, the value of $S$ increases gradually with resources and approaches its classical upper bound of 2, which implies that underfitting is alleviated as resources increase. As the model nears the critical scale required for adequate performance, $S$ may temporarily exceed 2. As resources continue to grow, $S$ asymptotically decays back to 2 and fluctuates around it. Empirically, when model capacity is insufficient, $S$ is positively correlated with generalization performance, and the regime where $S$ first approaches 2 often corresponds to good generalization. Overall, our results suggest that non-classical statistics can provide a novel perspective for understanding the internal interactions and training dynamics of deep networks.
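To make the $S$ statistic concrete, here is a minimal sketch of how CHSH correlations are estimated from binarized outcomes. The mapping of task-head outputs to $\pm 1$ outcomes, the two settings per head, and all function names are illustrative assumptions, not the paper's experimental protocol:

```python
import numpy as np

def correlator(x, y):
    # Empirical correlation E[x*y] for arrays of +/-1 outcomes.
    return float(np.mean(x * y))

def chsh_s(outcomes):
    # outcomes[(i, j)] holds a pair (x, y) of +/-1 outcome arrays recorded
    # when head A uses setting i and head B uses setting j, with i, j in {0, 1}.
    # Local hidden-variable (classical) models bound |S| <= 2; quantum
    # mechanics allows up to 2*sqrt(2), the Tsirelson bound.
    E = {key: correlator(x, y) for key, (x, y) in outcomes.items()}
    return E[(0, 0)] + E[(0, 1)] + E[(1, 0)] - E[(1, 1)]

# Toy usage: independent random outcomes give S close to 0.
rng = np.random.default_rng(0)
outcomes = {(i, j): (rng.choice([-1, 1], size=1000),
                     rng.choice([-1, 1], size=1000))
            for i in (0, 1) for j in (0, 1)}
print(chsh_s(outcomes))
```

Under this convention, a transient $S > 2$ near the critical model scale is precisely a violation of the classical bound described in the abstract.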
Related papers
- Hybrid Quantum-Classical Neural Networks for Few-Shot Credit Risk Assessment [52.05742536403784]
This work tackles the challenge of few-shot credit risk assessment. We design and implement a novel hybrid quantum-classical workflow. A Quantum Neural Network (QNN) was trained via the parameter-shift rule. On a real-world credit dataset of 279 samples, our QNN achieved a robust average AUC of 0.852 +/- 0.027 in simulations and yielded an impressive AUC of 0.88 in the hardware experiment.
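As background for the training method mentioned above, the parameter-shift rule recovers exact gradients of a circuit expectation value from two shifted evaluations per parameter. The sketch below is a classical emulation under the standard assumption that each parameter enters through a gate generated by an operator with eigenvalues $\pm 1/2$; the `expectation` callable and the toy cosine model are illustrative, not this paper's circuit:

```python
import numpy as np

def parameter_shift_grad(expectation, theta, shift=np.pi / 2):
    # Parameter-shift rule: d<O>/d theta_k
    #   = ( <O>(theta + s*e_k) - <O>(theta - s*e_k) ) / 2  with s = pi/2,
    # exact for gates generated by operators with eigenvalues +/- 1/2
    # (e.g. single-qubit Pauli rotations).
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        plus, minus = theta.copy(), theta.copy()
        plus[k] += shift
        minus[k] -= shift
        grad[k] = 0.5 * (expectation(plus) - expectation(minus))
    return grad

# Toy stand-in for a circuit expectation: <O>(theta) = prod_k cos(theta_k).
toy_expectation = lambda t: float(np.cos(t).prod())

theta = np.array([0.3, 1.2])
print(parameter_shift_grad(toy_expectation, theta))
# Matches the analytic gradient (-sin(t0)*cos(t1), -cos(t0)*sin(t1)).
```

Unlike finite differences, the shift here is macroscopic (pi/2), which makes the estimator well suited to noisy hardware evaluations.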
arXiv Detail & Related papers (2025-09-17T08:36:05Z) - Dynamical Decoupling of Generalization and Overfitting in Large Two-Layer Networks [10.591718074748895]
We study the learning dynamics of large two-layer neural networks via dynamical mean-field theory. For large network width $m$ and a large number of samples per input dimension $n/d$, the training dynamics exhibit a separation of timescales.
arXiv Detail & Related papers (2025-02-28T17:45:26Z) - Learning non-ideal genuine network nonlocality using causally inferred Bayesian neural network algorithms [0.688204255655161]
We introduce a scalable, causally-inferred Bayesian learning framework called the LHV layered neural network. We show that machine learning approaches with foundational domain-specific constraints can greatly benefit the field of quantum foundations.
arXiv Detail & Related papers (2025-01-14T12:45:47Z) - Estimating the volumes of correlations sets in causal networks [0.41942958779358674]
Causal networks beyond the one in the paradigmatic Bell theorem can lead to new kinds and applications of non-classicality.
We show that the most disseminated tool in the community is unable to detect a significant portion of the non-classical behaviors.
We also show that the use of interventions, a central tool in causal inference, can substantially improve our ability to witness non-classicality.
arXiv Detail & Related papers (2023-11-14T22:35:57Z) - On Excess Risk Convergence Rates of Neural Network Classifiers [8.329456268842227]
We study the performance of plug-in classifiers based on neural networks in a binary classification setting as measured by their excess risks.
We analyze the estimation and approximation properties of neural networks to obtain a dimension-free, uniform rate of convergence.
arXiv Detail & Related papers (2023-09-26T17:14:10Z) - High-Level Parallelism and Nested Features for Dynamic Inference Cost and Top-Down Attention [4.051316555028782]
This paper introduces a novel network topology that seamlessly integrates dynamic inference cost with a top-down attention mechanism.
Drawing inspiration from human perception, we combine sequential processing of generic low-level features with parallelism and nesting of high-level features.
In terms of dynamic inference cost, our methodology can exclude up to 73.48% of parameters and use 84.41% fewer giga-multiply-accumulate (GMAC) operations.
arXiv Detail & Related papers (2023-08-09T08:49:29Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification [83.20479832949069]
Quantum neural networks (QNNs) have become an important tool for understanding the physical world, but their advantages and limitations are not fully understood.
Here we investigate the problem-dependent power of QNNs on multi-class classification tasks.
Our work sheds light on the problem-dependent power of QNNs and offers a practical tool for evaluating their potential merit.
arXiv Detail & Related papers (2022-12-29T10:46:40Z) - On the generalization of learning algorithms that do not converge [54.122745736433856]
Generalization analyses of deep learning typically assume that the training converges to a fixed point.
Recent results indicate that in practice, the weights of deep neural networks optimized with gradient descent often oscillate indefinitely.
arXiv Detail & Related papers (2022-08-16T21:22:34Z) - The dilemma of quantum neural networks [63.82713636522488]
We show that quantum neural networks (QNNs) fail to provide any benefit over classical learning models.
QNNs suffer from severely limited effective model capacity, which incurs poor generalization on real-world datasets.
These results force us to rethink the role of current QNNs and to design novel protocols for solving real-world problems with quantum advantages.
arXiv Detail & Related papers (2021-06-09T10:41:47Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss [0.0]
Neural networks trained to minimize the logistic (a.k.a. cross-entropy) loss with gradient-based methods are observed to perform well in many supervised classification tasks.
We analyze the training and generalization behavior of infinitely wide two-layer neural networks with homogeneous activations.
arXiv Detail & Related papers (2020-02-11T15:42:09Z)