Solving Neural Min-Max Games: The Role of Architecture, Initialization & Dynamics
- URL: http://arxiv.org/abs/2512.00389v1
- Date: Sat, 29 Nov 2025 08:37:19 GMT
- Title: Solving Neural Min-Max Games: The Role of Architecture, Initialization & Dynamics
- Authors: Deep Patel, Emmanouil-Vasileios Vlatakis-Gkaragkounis
- Abstract summary: Many emerging applications - such as adversarial training, AI alignment, and robust optimization - can be framed as zero-sum games between two-layer neural networks, with Nash equilibria (NE) capturing the desirable behavior. In this paper, we show that a reduction from a hidden convex-concave geometry to a two-sided Polyak-Łojasiewicz condition holds with high probability under overparameterization, using tools from random matrix theory. This is the first such result for min-max games involving two-layer neural networks.
- Score: 4.9757343270143854
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many emerging applications - such as adversarial training, AI alignment, and robust optimization - can be framed as zero-sum games between neural nets, with von Neumann-Nash equilibria (NE) capturing the desirable system behavior. While such games often involve non-convex non-concave objectives, empirical evidence shows that simple gradient methods frequently converge, suggesting a hidden geometric structure. In this paper, we provide a theoretical framework that explains this phenomenon through the lens of hidden convexity and overparameterization. We identify sufficient conditions - spanning initialization, training dynamics, and network width - that guarantee global convergence to a NE in a broad class of non-convex min-max games. To our knowledge, this is the first such result for games that involve two-layer neural networks. Technically, our approach is twofold: (a) we derive a novel path-length bound for the alternating gradient descent-ascent scheme in min-max games; and (b) we show that the reduction from a hidden convex-concave geometry to two-sided Polyak-Łojasiewicz (PŁ) min-max condition hold with high probability under overparameterization, using tools from random matrix theory.
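The abstract's alternating gradient descent-ascent scheme can be illustrated with a minimal numpy sketch on a toy regularized quadratic game. This is not the paper's neural setting; the objective, step size, and step count here are illustrative assumptions, chosen only to show the alternation (the max player updates against the min player's *new* iterate).

```python
import numpy as np

# Hypothetical toy objective f(x, y) = x*y + 0.1*x**2 - 0.1*y**2,
# convex in x and concave in y, so (0, 0) is the unique equilibrium.
def grad_x(x, y):
    return y + 0.2 * x   # d f / d x

def grad_y(x, y):
    return x - 0.2 * y   # d f / d y

def alternating_gda(x0, y0, lr=0.05, steps=500):
    """Alternating GDA: the min player takes a descent step first,
    then the max player ascends using the updated x."""
    x, y = x0, y0
    for _ in range(steps):
        x = x - lr * grad_x(x, y)   # descent step for the min player
        y = y + lr * grad_y(x, y)   # ascent step uses the NEW x
    return x, y

x, y = alternating_gda(1.0, 1.0)
# The iterates spiral inward toward the equilibrium (0, 0).
```

The alternation matters: using the freshly updated x in the ascent step damps the rotational component of the dynamics, which is one reason alternating schemes admit the path-length bounds studied in the paper.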
Related papers
- Diagonalizing the Softmax: Hadamard Initialization for Tractable Cross-Entropy Dynamics [29.85277126753054]
Cross-entropy (CE) loss dominates deep learning, yet existing theory often relies on simplifications. We provide an in-depth characterization of the CE training dynamics of a canonical network initialized with standard basis vectors.
arXiv Detail & Related papers (2025-12-03T17:45:09Z) - Scalable Quantum Walk-Based Heuristics for the Minimum Vertex Cover Problem [0.0]
We propose a novel quantum algorithm for the Minimum Vertex Cover (MVC) problem based on continuous-time quantum walks (CTQWs). In this framework, the coherent propagation of a quantum walker over a graph encodes its structural properties into state amplitudes. We show that the CTQW-based algorithm consistently achieves superior approximation ratios and exhibits remarkable robustness with respect to network topology.
arXiv Detail & Related papers (2025-12-02T17:04:57Z) - Unrolled Graph Neural Networks for Constrained Optimization [83.29547301151177]
We study the dynamics of the dual ascent algorithm in two coupled graph neural networks (GNNs). We propose a joint training scheme that alternates between updating the primal and dual networks. Our numerical experiments demonstrate that our approach yields near-optimal, near-feasible solutions.
arXiv Detail & Related papers (2025-09-21T16:55:41Z) - Deep Loss Convexification for Learning Iterative Models [11.36644967267829]
Iterative methods such as iterative closest point (ICP) for point cloud registration often suffer from bad local optimality.
We propose learning to form a convex landscape around each ground truth.
arXiv Detail & Related papers (2024-11-16T01:13:04Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima
and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are used across a wide range of applications.
In this paper we examine convex reformulations of neural network training.
We show that the stationary points of the non-convex objective can be characterized as the global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Can Decentralized Stochastic Minimax Optimization Algorithms Converge
Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning problems.
This paper develops two novel decentralized minimax optimization algorithms for nonconvex-nonconcave problems.
arXiv Detail & Related papers (2023-04-24T02:19:39Z) - A Unified Algebraic Perspective on Lipschitz Neural Networks [88.14073994459586]
This paper introduces a novel perspective unifying various types of 1-Lipschitz neural networks.
We show that many existing techniques can be derived and generalized via finding analytical solutions of a common semidefinite programming (SDP) condition.
Our approach, called SDP-based Lipschitz Layers (SLL), allows us to design non-trivial yet efficient generalizations of convex potential layers.
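The 1-Lipschitz constraint these layers enforce can be demonstrated with a much simpler device than the paper's SDP condition: plain spectral normalization of a linear map. This sketch is an assumption-laden stand-in for SLL, shown only to make the Lipschitz property itself concrete.

```python
import numpy as np

# Hedged sketch: a linear map made 1-Lipschitz by dividing out its
# spectral norm (largest singular value). SLL achieves the same bound
# for full nonlinear layers via a semidefinite programming condition.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
sigma = np.linalg.norm(W, 2)       # spectral norm of W
W_lip = W / max(sigma, 1.0)        # Lipschitz constant now <= 1

# The defining property: ||W_lip (u - v)|| <= ||u - v|| for all u, v.
u, v = rng.normal(size=4), rng.normal(size=4)
lhs = np.linalg.norm(W_lip @ (u - v))
rhs = np.linalg.norm(u - v)
```

Spectral normalization is cheap but conservative; the appeal of the SDP formulation is that it certifies the same global bound for richer layer structures.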
arXiv Detail & Related papers (2023-03-06T14:31:09Z) - Iterative Minimax Games with Coupled Linear Constraints [3.4683309709605328]
The study of nonconvex minimax games has gained significant momentum in machine learning communities.<n>We provide the first analysis for this class of games with coupled linear constraints, addressing the adversarial strategy of the opposing player.
arXiv Detail & Related papers (2022-12-09T05:32:30Z) - Representation Learning for General-sum Low-rank Markov Games [63.119870889883224]
We study multi-agent general-sum Markov games with nonlinear function approximation.
We focus on low-rank Markov games whose transition matrix admits a hidden low-rank structure on top of an unknown non-linear representation.
arXiv Detail & Related papers (2022-10-30T22:58:22Z) - LEAD: Min-Max Optimization from a Physical Perspective [11.808346987640855]
Adversarial formulations such as generative adversarial networks (GANs) have rekindled interest in two-player min-max games.
A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence.
We show that game optimization shares dynamic properties with particle systems subject to multiple forces, and one can leverage tools from physics to improve dynamics.
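The rotational dynamics this paper targets can be reproduced in a few lines on the classic bilinear game f(x, y) = x*y. The damping term below is a crude illustrative stand-in, not LEAD's actual physics-derived update; it only shows how a friction-like force tames the rotation.

```python
import math

# Toy bilinear game f(x, y) = x*y. Simultaneous GDA spirals outward;
# adding a damping (friction-like) term pulls the iterates back to
# the equilibrium (0, 0).
def simulate(damping=0.0, lr=0.1, steps=200):
    x, y = 1.0, 1.0
    for _ in range(steps):
        gx, gy = y, x                                  # grad_x f, grad_y f
        x, y = (x - lr * (gx + damping * x),           # simultaneous update
                y + lr * (gy - damping * y))
    return math.hypot(x, y)                            # distance from (0, 0)

r_plain = simulate(damping=0.0)    # norm grows: pure rotation plus expansion
r_damped = simulate(damping=0.5)   # norm shrinks toward the equilibrium
```

Each undamped step multiplies the distance to the equilibrium by sqrt(1 + lr**2) > 1, which is exactly the divergence that physics-inspired modifications like LEAD are designed to counteract.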
arXiv Detail & Related papers (2020-10-26T19:01:49Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires a much smaller number of communication rounds in theory.
Our experiments on several datasets confirm the theory and show the effectiveness of our approach.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Solving Non-Convex Non-Differentiable Min-Max Games using Proximal
Gradient Method [10.819408603463426]
Min-max saddle point problems appear in a wide range of applications in machine learning and signal processing.
We show that a simple multi-step proximal gradient descent-ascent algorithm succeeds under weaker assumptions than existing methods in the literature.
arXiv Detail & Related papers (2020-03-18T08:38:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.