Adversarial Training as Stackelberg Game: An Unrolled Optimization
Approach
- URL: http://arxiv.org/abs/2104.04886v1
- Date: Sun, 11 Apr 2021 00:44:57 GMT
- Title: Adversarial Training as Stackelberg Game: An Unrolled Optimization
Approach
- Authors: Simiao Zuo, Chen Liang, Haoming Jiang, Xiaodong Liu, Pengcheng He,
Jianfeng Gao, Weizhu Chen, Tuo Zhao
- Abstract summary: Adversarial training has been shown to improve the generalization performance of deep learning models.
We propose Stackelberg Adversarial Training (SALT), which formulates adversarial training as a Stackelberg game.
- Score: 91.74682538906691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training has been shown to improve the generalization performance
of deep learning models in various natural language processing tasks. Existing
works usually formulate adversarial training as a zero-sum game, which is
solved by alternating gradient descent/ascent algorithms. Such a formulation
treats the adversarial and the defending players equally, which is undesirable
because only the defending player contributes to the generalization
performance. To address this issue, we propose Stackelberg Adversarial Training
(SALT), which formulates adversarial training as a Stackelberg game. This
formulation induces a competition between a leader and a follower, where the
follower generates perturbations, and the leader trains the model subject to
the perturbations. Different from conventional adversarial training, in SALT,
the leader is in an advantageous position. When the leader moves, it recognizes
the strategy of the follower and takes the anticipated follower's outcomes into
consideration. Such a leader's advantage enables us to improve the model
fitting to the unperturbed data. The leader's strategic information is captured
by the Stackelberg gradient, which is obtained using an unrolling algorithm.
Our experimental results on a set of machine translation and natural language
understanding tasks show that SALT outperforms existing adversarial training
baselines across all tasks.
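
As a rough illustration of the unrolling idea described above, the sketch below (plain PyTorch, not the authors' released code) lets the follower take K gradient-ascent steps on an embedding-level perturbation while keeping those steps in the autograd graph; the leader's loss is then backpropagated through the follower's unrolled trajectory, which is how a Stackelberg-style gradient anticipates the follower's response. All names (model, embed, K, alpha, epsilon) and the exact loss composition are illustrative assumptions.

```python
# Hedged sketch of unrolled Stackelberg adversarial training (illustrative only).
# The follower perturbs the input embeddings; the leader updates the model while
# differentiating through the follower's unrolled ascent steps.
import torch
import torch.nn.functional as F

def stackelberg_step(model, embed, inputs, labels, optimizer,
                     K=3, alpha=1e-2, epsilon=1e-1):
    emb = embed(inputs)                                   # clean embeddings
    delta = torch.zeros_like(emb, requires_grad=True)     # follower's perturbation

    # Follower: K unrolled gradient-ascent steps on the perturbation.
    for _ in range(K):
        adv_loss = F.cross_entropy(model(emb + delta), labels)
        # create_graph=True keeps each ascent step in the graph so the leader
        # can later backpropagate through the follower's trajectory.
        grad, = torch.autograd.grad(adv_loss, delta, create_graph=True)
        delta = delta + alpha * grad
        delta = torch.clamp(delta, -epsilon, epsilon)

    # Leader: fit the model under the anticipated perturbation while keeping
    # the fit to the unperturbed data (assumed loss composition).
    leader_loss = (F.cross_entropy(model(emb + delta), labels)
                   + F.cross_entropy(model(emb), labels))
    optimizer.zero_grad()
    leader_loss.backward()        # Stackelberg-style gradient via unrolling
    optimizer.step()
    return leader_loss.item()
```

In this sketch, the extra cost relative to standard alternating descent/ascent comes from backpropagating through the K inner steps, which introduces second-order terms into the leader's update.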
Related papers
- Fast Propagation is Better: Accelerating Single-Step Adversarial
Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z)
- When Should a Leader Act Suboptimally? The Role of Inferability in Repeated Stackelberg Games [28.856644679990357]
We model the inferability problem using Stackelberg games with observations where a leader and a follower repeatedly interact.
For a variety of game settings, we show that the inferability gap is upper-bounded by a function of the number of interactions and the stochasticity level of the leader's strategy.
We identify a set of games where the leader's near-optimal strategy may suffer from a large inferability gap.
arXiv Detail & Related papers (2023-09-30T19:08:05Z)
- Active Inverse Learning in Stackelberg Trajectory Games [32.663862342494745]
We formulate an inverse learning problem in a Stackelberg game between a leader and a follower.
We propose an active inverse learning method for the leader to infer which hypothesis among a finite set of candidates best describes the follower's objective function.
arXiv Detail & Related papers (2023-08-15T20:17:26Z)
- Online Learning in Stackelberg Games with an Omniscient Follower [83.42564921330896]
We study the problem of online learning in a two-player decentralized cooperative Stackelberg game.
In each round, the leader first takes an action, followed by the follower who takes their action after observing the leader's move.
We show that depending on the reward structure, the existence of the omniscient follower may change the sample complexity drastically.
arXiv Detail & Related papers (2023-01-27T03:35:10Z)
- No-Regret Learning in Dynamic Stackelberg Games [31.001205916012307]
In a Stackelberg game, a leader commits to a randomized strategy, and a follower chooses their best strategy in response.
We consider an extension of a standard Stackelberg game, called a discrete-time dynamic Stackelberg game, that has an underlying state space that affects the leader's rewards and available strategies and evolves in a Markovian manner depending on both the leader and follower's selected strategies.
arXiv Detail & Related papers (2022-02-10T01:07:57Z)
- Self-Training with Differentiable Teacher [80.62757989797095]
Self-training achieves enormous success in various semi-supervised and weakly-supervised learning tasks.
The method can be interpreted as a teacher-student framework, where the teacher generates pseudo-labels, and the student makes predictions.
We propose a differentiable self-training method that treats the teacher-student framework as a Stackelberg game.
arXiv Detail & Related papers (2021-09-15T02:06:13Z)
- Who Leads and Who Follows in Strategic Classification? [82.44386576129295]
We argue that the order of play in strategic classification is fundamentally determined by the relative frequencies at which the decision-maker and the agents adapt to each other's actions.
We show that a decision-maker with the freedom to choose their update frequency can induce learning dynamics that converge to Stackelberg equilibria with either order of play.
arXiv Detail & Related papers (2021-06-23T16:48:46Z)
- Optimally Deceiving a Learning Leader in Stackelberg Games [123.14187606686006]
Recent results in the ML community have revealed that the learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs in various scenarios of the learning interaction between leader and follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.