Game-Theoretic Gradient Control for Robust Neural Network Training
- URL: http://arxiv.org/abs/2507.19143v1
- Date: Fri, 25 Jul 2025 10:26:25 GMT
- Title: Game-Theoretic Gradient Control for Robust Neural Network Training
- Authors: Maria Zaitseva, Ivan Tomilov, Natalia Gusarova,
- Abstract summary: Feed-forward neural networks (FFNNs) are vulnerable to input noise, reducing prediction performance. This study aims to enhance FFNN noise robustness by modifying backpropagation, interpreted as a multi-agent game, and exploring controlled target variable noising.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Feed-forward neural networks (FFNNs) are vulnerable to input noise, reducing prediction performance. Existing regularization methods like dropout often alter network architecture or overlook neuron interactions. This study aims to enhance FFNN noise robustness by modifying backpropagation, interpreted as a multi-agent game, and exploring controlled target variable noising. Our "gradient dropout" selectively nullifies hidden layer neuron gradients with probability 1 - p during backpropagation, while keeping forward passes active. This is framed within compositional game theory. Additionally, target variables were perturbed with white noise or stable distributions. Experiments on ten diverse tabular datasets show varying impacts: robustness and accuracy either improve or degrade, depending on the dataset and hyperparameters. Notably, on regression tasks, gradient dropout (p = 0.9) combined with stable distribution target noising significantly increased input noise robustness, evidenced by flatter MSE curves and more stable SMAPE values. These results highlight the method's potential, underscore the critical role of adaptive parameter tuning, and open new avenues for analyzing neural networks as complex adaptive systems exhibiting emergent behavior within a game-theoretic framework.
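The two ingredients described in the abstract can be sketched in a few lines. The snippet below is a minimal, illustrative PyTorch rendering, not the authors' implementation: the names GradientDropout, GradDropMLP, and noisy_targets, the choice to share one Bernoulli mask per hidden unit across the batch, and the use of scipy's levy_stable sampler with alpha = 1.7 and scale = 0.05 are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
from scipy.stats import levy_stable  # assumed sampler for alpha-stable target noise


class GradientDropout(torch.autograd.Function):
    """Identity in the forward pass; randomly nullifies hidden-unit gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, keep_prob):
        ctx.keep_prob = keep_prob
        return x.view_as(x)  # forward activations stay fully active

    @staticmethod
    def backward(ctx, grad_output):
        # One Bernoulli(p) mask per hidden unit, shared across the batch: a unit's gradient
        # is kept with probability p and zeroed with probability 1 - p (masking granularity assumed).
        mask = torch.rand(grad_output.shape[-1], device=grad_output.device) < ctx.keep_prob
        return grad_output * mask.to(grad_output.dtype), None


class GradDropMLP(nn.Module):
    """Small feed-forward regressor with gradient dropout applied to its hidden layer."""

    def __init__(self, in_dim, hidden=64, keep_prob=0.9):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, 1)
        self.keep_prob = keep_prob

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        if self.training:
            h = GradientDropout.apply(h, self.keep_prob)
        return self.fc2(h)


def noisy_targets(y, alpha=1.7, scale=0.05):
    """Perturb regression targets with symmetric alpha-stable noise; alpha = 2 recovers Gaussian noise."""
    noise = levy_stable.rvs(alpha, 0.0, scale=scale, size=tuple(y.shape))
    return y + torch.as_tensor(noise, dtype=y.dtype)
```

A training loop would fit GradDropMLP with keep_prob = 0.9 on an MSE loss against noisy_targets(y), then probe robustness by re-evaluating MSE and SMAPE on inputs with progressively larger additive noise, mirroring the flatter MSE curves reported in the abstract.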
Related papers
- Variational Adaptive Noise and Dropout towards Stable Recurrent Neural Networks [6.454374656250522]
This paper proposes a novel stable learning theory for recurrent neural networks (RNNs). We show that noise and dropout can be derived simultaneously by transforming the explicit regularization term into implicit regularization. Their scale and ratio can also be adjusted appropriately to optimize the main objective of RNNs.
arXiv Detail & Related papers (2025-06-02T06:13:22Z)
- Stochastic Forward-Forward Learning through Representational Dimensionality Compression [7.847900313045352]
The Forward-Forward (FF) algorithm provides a bottom-up alternative to backpropagation (BP) for training neural networks. We propose a novel goodness function termed dimensionality compression that uses the effective dimensionality (ED) of fluctuating neural responses to incorporate second-order statistical structure.
arXiv Detail & Related papers (2025-05-22T13:19:29Z)
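The effective dimensionality (ED) mentioned in this entry is often computed as the participation ratio of the response covariance spectrum; the sketch below assumes that common definition and is not necessarily the paper's goodness function.

```python
import torch


def effective_dimensionality(responses):
    """Participation-ratio ED = (sum_i lambda_i)^2 / sum_i lambda_i^2 of the covariance of
    `responses` (shape: num_samples x num_units); an assumed, widely used definition of ED."""
    centered = responses - responses.mean(dim=0, keepdim=True)
    cov = centered.T @ centered / (responses.shape[0] - 1)
    eigvals = torch.linalg.eigvalsh(cov).clamp(min=0.0)  # eigenvalues of the symmetric covariance
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()
```

Applied to a batch of fluctuating hidden responses, this yields a differentiable scalar that a layer-local goodness objective could, in principle, drive up or down.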
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)
- Towards Robust Neural Networks via Close-loop Control [12.71446168207573]
Deep neural networks are vulnerable to various perturbations due to their black-box nature.
Recent studies have shown that a deep neural network can misclassify data even when the input is perturbed by an imperceptible amount.
arXiv Detail & Related papers (2021-02-03T03:50:35Z)
- Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression [40.35734017517066]
Nested networks or slimmable networks are neural networks whose architectures can be adjusted instantly during testing time.
Recent studies have focused on a "nested dropout" layer, which is able to order the nodes of a layer by importance during training.
arXiv Detail & Related papers (2021-01-27T12:34:58Z)
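Nested dropout, referenced in this entry, is commonly implemented by sampling a truncation index and zeroing every unit beyond it, which induces an importance ordering over units; the sketch below shows that generic recipe (the module name and the geometric truncation distribution are assumptions), not the Bayesian variant proposed in the paper.

```python
import torch
import torch.nn as nn


class NestedDropout(nn.Module):
    """Generic nested dropout: sample an index k and zero every unit after it, so earlier
    units are trained more often and end up ordered by importance."""

    def __init__(self, num_units, tail_prob=0.1):
        super().__init__()
        self.num_units = num_units
        self.tail_prob = tail_prob  # parameter of the geometric distribution over truncation indices

    def forward(self, x):
        if not self.training:
            return x
        # Sample a truncation index k in [1, num_units] from a (capped) geometric distribution.
        k = int(torch.distributions.Geometric(self.tail_prob).sample().item()) + 1
        k = min(k, self.num_units)
        mask = torch.zeros(self.num_units, device=x.device, dtype=x.dtype)
        mask[:k] = 1.0
        return x * mask
```

Placing such a layer after a hidden representation concentrates information in the first few units, which is what allows the architecture to be slimmed at test time.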
- Attribute-Guided Adversarial Training for Robustness to Natural Perturbations [64.35805267250682]
We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attributes-space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
- Ramifications of Approximate Posterior Inference for Bayesian Deep Learning in Adversarial and Out-of-Distribution Settings [7.476901945542385]
We show that Bayesian deep learning models, on certain occasions, marginally outperform conventional neural networks.
Preliminary investigations indicate the potential inherent role of bias due to choices of initialisation, architecture or activation functions.
arXiv Detail & Related papers (2020-09-03T16:58:15Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
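Tracking the norm of the weight Hessian, as in the last entry above, is typically approximated by power iteration on Hessian-vector products obtained through double backpropagation; the sketch below is that standard diagnostic, not the authors' specific estimator, and the function name and iteration count are placeholders.

```python
import torch


def hessian_spectral_norm(loss, params, iters=20):
    """Estimate |lambda_max| of the Hessian of the scalar `loss` w.r.t. `params`
    by power iteration on Hessian-vector products (double backpropagation)."""
    params = [p for p in params if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    norm = torch.tensor(1.0)
    for _ in range(iters):
        # Hessian-vector product: differentiate <grad, v> with respect to params.
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True)
        norm = torch.sqrt(sum((h ** 2).sum() for h in hv)) + 1e-12
        v = [h / norm for h in hv]
    return float(norm)
```

For example, hessian_spectral_norm(criterion(model(x), y), model.parameters()) can be logged every few hundred steps as a curvature diagnostic during training.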
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.