Pontryagin Neural Operator for Solving Parametric General-Sum Differential Games
- URL: http://arxiv.org/abs/2401.01502v2
- Date: Fri, 31 May 2024 21:53:47 GMT
- Title: Pontryagin Neural Operator for Solving Parametric General-Sum Differential Games
- Authors: Lei Zhang, Mukesh Ghimire, Zhe Xu, Wenlong Zhang, Yi Ren
- Abstract summary: We show that a Pontryagin-mode neural operator outperforms the current state-of-the-art hybrid PINN model on safety performance across games with parametric state constraints.
Our key contribution is the introduction of a costate loss defined on the discrepancy between forward and backward costate rollouts.
We show that the costate dynamics, which can reflect state constraint violation, effectively enables the learning of differentiable values with large Lipschitz constants.
- Score: 24.012924492073974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The values of two-player general-sum differential games are viscosity solutions to Hamilton-Jacobi-Isaacs (HJI) equations. Value and policy approximations for such games suffer from the curse of dimensionality (CoD). Alleviating CoD through physics-informed neural networks (PINN) encounters convergence issues when differentiable values with large Lipschitz constants are present due to state constraints. On top of these challenges, it is often necessary to learn generalizable values and policies across a parametric space of games, e.g., for game parameter inference when information is incomplete. To address these challenges, we propose in this paper a Pontryagin-mode neural operator that outperforms the current state-of-the-art hybrid PINN model on safety performance across games with parametric state constraints. Our key contribution is the introduction of a costate loss defined on the discrepancy between forward and backward costate rollouts, which are computationally cheap. We show that the costate dynamics, which can reflect state constraint violation, effectively enables the learning of differentiable values with large Lipschitz constants, without requiring manually supervised data as suggested by the hybrid PINN model. More importantly, we show that the close relationship between costates and policies makes the former critical in learning feedback control policies with generalizable safety performance.
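The costate-loss idea can be illustrated on a toy optimal-control problem. Everything below (the 1D dynamics, quadratic costs, Euler integration, and the name `costate_discrepancy`) is our own minimal sketch under Pontryagin's principle, not the paper's implementation: a forward costate rollout from a predicted initial costate is compared against a backward rollout from the terminal condition, and their discrepancy serves as a self-supervised loss.

```python
import numpy as np

# Toy problem (our assumption, not the paper's setup): dynamics x' = u,
# running cost 0.5*u^2, terminal cost g(x) = 0.5*x^2. The Hamiltonian is
# H = lam*u + 0.5*u^2, minimized by u* = -lam, and the costate ODE is
# lam' = -dH/dx = 0 (H has no explicit x dependence in this toy case).

def costate_discrepancy(lam0_pred, x0, T=1.0, n=100):
    """Forward costate rollout from a predicted lam(0) vs. backward
    rollout from the terminal condition lam(T) = dg/dx = x(T)."""
    dt = T / n
    # Forward pass: integrate state and costate with Euler steps.
    x, lam = x0, lam0_pred
    lam_fwd = [lam]
    for _ in range(n):
        u = -lam          # optimal control from the Hamiltonian
        x += dt * u       # state dynamics x' = u
        lam += dt * 0.0   # costate dynamics lam' = 0 here
        lam_fwd.append(lam)
    # Backward pass: start from lam(T) = x(T), integrate back to t = 0.
    lam_b = x
    lam_bwd = [lam_b]
    for _ in range(n):
        lam_b -= dt * 0.0
        lam_bwd.append(lam_b)
    lam_bwd = lam_bwd[::-1]
    # Costate loss: mean squared discrepancy along the trajectory.
    return np.mean((np.array(lam_fwd) - np.array(lam_bwd)) ** 2)
```

For this toy problem the consistent initial costate is lam(0) = x0/(1+T), at which the discrepancy vanishes; an inconsistent prediction yields a positive loss that a network could be trained to minimize.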
Related papers
- Auto-Encoding Bayesian Inverse Games [36.06617326128679]
We consider the inverse game problem, in which some properties of the game are unknown a priori.
Existing maximum likelihood estimation approaches to solve inverse games provide only point estimates of unknown parameters.
We take a Bayesian perspective and construct posterior distributions of game parameters.
This structured VAE can be trained from an unlabeled dataset of observed interactions.
arXiv Detail & Related papers (2024-02-14T02:17:37Z)
- Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy, even with a zero exemplar buffer and a model only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
- Value Approximation for Two-Player General-Sum Differential Games with State Constraints [24.012924492073974]
Solving Hamilton-Jacobi-Isaacs (HJI) PDEs numerically enables equilibrial feedback control in two-player differential games, yet faces the curse of dimensionality (CoD).
While physics-informed neural networks (PINNs) have shown promise in alleviating CoD in solving PDEs, vanilla PINNs fall short in learning discontinuous solutions due to their sampling nature.
In this study, we explore three potential solutions to this challenge: (1) a hybrid learning method that is guided by both supervisory equilibria and the HJI PDE, (2) a value-hardening method
arXiv Detail & Related papers (2023-11-28T04:58:41Z)
- Solving Forward and Inverse Problems of Contact Mechanics using Physics-Informed Neural Networks [0.0]
We deploy PINNs in a mixed-variable formulation enhanced by output transformation to enforce hard and soft constraints.
We show that PINNs can serve as a pure partial differential equation (PDE) solver, as a data-enhanced forward model, and as a fast-to-evaluate surrogate model.
arXiv Detail & Related papers (2023-08-24T11:31:24Z)
- Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage [100.8180383245813]
We propose value-based algorithms for offline reinforcement learning (RL).
We show an analogous result for vanilla Q-functions under a soft margin condition.
Our algorithms' loss functions arise from casting the estimation problems as nonlinear convex optimization problems and Lagrangifying.
arXiv Detail & Related papers (2023-02-05T14:22:41Z)
- Function Approximation for Solving Stackelberg Equilibrium in Large Perfect Information Games [115.77438739169155]
We propose learning the Enforceable Payoff Frontier (EPF) -- a generalization of the state value function for general-sum games.
This is the first method that applies FA to the Stackelberg setting, allowing us to scale to much larger games.
arXiv Detail & Related papers (2022-12-29T19:05:50Z)
- Near-Optimal $\Phi$-Regret Learning in Extensive-Form Games [85.78272987312343]
We establish efficient and uncoupled learning dynamics so that the trigger regret of each player grows as $O(\log T)$ after $T$ repetitions of play.
This improves exponentially over the prior best known trigger-regret bound of $O(T^{1/4})$.
arXiv Detail & Related papers (2022-08-20T20:48:58Z)
- Approximating Discontinuous Nash Equilibrial Values of Two-Player General-Sum Differential Games [21.291449080239673]
This paper extends previous SOTA on zero-sum games with continuous values to general-sum games with discontinuous values, where the discontinuity arises from discontinuities in the players' losses.
We show that due to its lack of convergence proof and generalization analysis on discontinuous losses, the existing self-supervised learning technique fails to generalize and raises safety concerns in an autonomous driving application.
Our solution is to first pre-train the value network on supervised Nash equilibria, and then refine it by minimizing a loss that combines the supervised data with the PDE and boundary conditions.
arXiv Detail & Related papers (2022-07-05T02:22:05Z)
- Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
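The hybrid refinement loss described above (supervised equilibria plus PDE and boundary terms) can be sketched as a weighted sum. The function name, argument names, and weights below are illustrative assumptions, not the paper's code:

```python
import numpy as np

# Hypothetical sketch of a hybrid loss: a weighted sum of a supervised
# term on equilibrium value data, an HJI PDE residual term, and a
# terminal boundary-condition term.

def hybrid_loss(v_pred, v_data, pde_residual, v_T_pred, v_T_true,
                w_data=1.0, w_pde=1.0, w_bc=1.0):
    mse = lambda a, b: np.mean((np.asarray(a) - np.asarray(b)) ** 2)
    l_data = mse(v_pred, v_data)                     # supervised Nash equilibria
    l_pde = np.mean(np.asarray(pde_residual) ** 2)   # HJI PDE residual
    l_bc = mse(v_T_pred, v_T_true)                   # terminal boundary condition
    return w_data * l_data + w_pde * l_pde + w_bc * l_bc
```

The relative weights trade off fidelity to the supervised data against satisfaction of the PDE; pre-training on the supervised term alone before enabling the residual terms matches the two-stage recipe summarized above.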
Implicit neural networks offer improved accuracy and significantly reduced memory consumption, but they can suffer from ill-posedness and convergence instability.
This paper provides a new framework for designing well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
- A Variational Inequality Approach to Bayesian Regression Games [90.79402153164587]
We prove existence and uniqueness for a class of convex cost functions and generalize the result to smooth cost functions.
We provide two simple algorithms for solving these games with strong convergence guarantees.
arXiv Detail & Related papers (2021-03-24T22:33:11Z)
- Fixed Point Networks: Implicit Depth Models with Jacobian-Free Backprop [21.00060644438722]
A growing trend in deep learning replaces fixed-depth models with approximations of the limit as network depth approaches infinity.
In particular, backpropagation through implicit depth models requires solving a Jacobian-based equation arising from the implicit function theorem.
We propose fixed point networks (FPNs) that guarantee convergence of forward propagation to a unique limit defined by network weights and input data.
arXiv Detail & Related papers (2021-03-23T19:20:33Z)
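The forward pass of an implicit-depth layer can be sketched as a fixed-point iteration. The toy layer below is our own assumption, not the FPN paper's architecture: it is a contraction whenever the spectral norm of `W` is below 1 (since tanh is 1-Lipschitz), so the iteration converges to a unique limit regardless of initialization.

```python
import numpy as np

# Toy implicit-depth layer: the "infinite depth" output is the fixed
# point z* = tanh(W z* + U x), found by iterating the map to tolerance.

def fixed_point_layer(W, U, x, tol=1e-10, max_iter=1000):
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_next = np.tanh(W @ z + U @ x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next
        z = z_next
    return z
```

At the returned `z`, the defining equation z = tanh(W z + U x) holds to within the tolerance; Jacobian-free backprop schemes then differentiate through only the final iterate rather than solving the full implicit-function-theorem linear system.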
This list is automatically generated from the titles and abstracts of the papers in this site.