Adversarial Policies Beat Superhuman Go AIs
- URL: http://arxiv.org/abs/2211.00241v4
- Date: Thu, 13 Jul 2023 06:37:29 GMT
- Title: Adversarial Policies Beat Superhuman Go AIs
- Authors: Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose,
Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey
Levine, Stuart Russell
- Abstract summary: We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it.
Our adversaries do not win by playing Go well. Instead, they trick KataGo into making serious blunders.
Our results demonstrate that even superhuman AI systems may harbor surprising failure modes.
- Score: 54.15639517188804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We attack the state-of-the-art Go-playing AI system KataGo by training
adversarial policies against it, achieving a >97% win rate against KataGo
running at superhuman settings. Our adversaries do not win by playing Go well.
Instead, they trick KataGo into making serious blunders. Our attack transfers
zero-shot to other superhuman Go-playing AIs, and is comprehensible to the
extent that human experts can implement it without algorithmic assistance to
consistently beat superhuman AIs. The core vulnerability uncovered by our
attack persists even in KataGo agents adversarially trained to defend against
our attack. Our results demonstrate that even superhuman AI systems may harbor
surprising failure modes. Example games are available at https://goattack.far.ai/.
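The abstract's core recipe is to optimize only the adversary while the victim policy stays frozen. As a hedged, minimal sketch of that idea (not the paper's actual pipeline, which attacks KataGo at full scale), the toy matrix game, the victim's biased fixed policy, and the REINFORCE update below are all illustrative assumptions:

```python
# Minimal, illustrative sketch of adversarial-policy training: the victim's
# policy is frozen and only the adversary is updated. The toy game, the
# victim's bias, and the REINFORCE update are assumptions for illustration,
# not the paper's actual method against KataGo.
import numpy as np

rng = np.random.default_rng(0)

# Zero-sum payoff for the adversary: rows = adversary actions, cols = victim actions.
PAYOFF = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])
VICTIM_POLICY = np.array([0.8, 0.2])  # frozen victim with an exploitable bias

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

theta = np.zeros(2)  # adversary policy logits
lr = 0.1

for _ in range(2000):
    probs = softmax(theta)
    a_adv = rng.choice(2, p=probs)          # adversary samples from its policy
    a_vic = rng.choice(2, p=VICTIM_POLICY)  # victim plays its fixed policy
    reward = PAYOFF[a_adv, a_vic]

    # REINFORCE: push up the log-probability of the sampled action in
    # proportion to the reward earned against the frozen victim.
    grad_log_prob = -probs
    grad_log_prob[a_adv] += 1.0
    theta += lr * reward * grad_log_prob

print("learned adversary policy:", softmax(theta))
```

Run as written, the adversary concentrates on the action that exploits the victim's bias; in miniature, this mirrors how the paper's adversaries win not by playing well but by steering a fixed opponent into its blind spots.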
Related papers
- Can Go AIs be adversarially robust? [4.466856575755327]
We study whether adding natural countermeasures can achieve robustness in Go.
We find that though some of these defenses protect against previously discovered attacks, none withstand freshly trained adversaries.
Our results suggest that building robust AI systems is challenging, even for extremely superhuman systems in some of the most tractable settings.
arXiv Detail & Related papers (2024-06-18T17:57:49Z)
- DanZero+: Dominating the GuanDan Game through Reinforcement Learning [95.90682269990705]
We develop an AI program for an exceptionally complex and popular card game called GuanDan.
We first put forward an AI program named DanZero for this game.
To further enhance the AI's capabilities, we apply a policy-based reinforcement learning algorithm to GuanDan.
arXiv Detail & Related papers (2023-12-05T08:07:32Z)
- Are AlphaZero-like Agents Robust to Adversarial Perturbations? [73.13944217915089]
AlphaZero (AZ) has demonstrated that neural-network-based Go AIs can surpass human performance by a large margin.
We ask whether adversarial states exist for Go AIs that may lead them to play surprisingly wrong actions.
We develop the first adversarial attack on Go AIs that can efficiently search for adversarial states by strategically reducing the search space.
arXiv Detail & Related papers (2022-11-07T18:43:25Z)
- DanZero: Mastering GuanDan Game with Reinforcement Learning [121.93690719186412]
Card game AI has always been a hot topic in artificial intelligence research.
In this paper, we develop an AI program for a more complex card game, GuanDan.
We propose DanZero, the first AI program for GuanDan, using reinforcement learning techniques.
arXiv Detail & Related papers (2022-10-31T06:29:08Z)
- AI in Games: Techniques, Challenges and Opportunities [40.86375378643978]
Various game AI systems (AIs), such as Libratus, OpenAI Five, and AlphaStar, have been developed and have beaten professional human players.
In this paper, we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooter game AIs, and real-time strategy game AIs.
arXiv Detail & Related papers (2021-11-15T09:35:53Z)
- Instructive artificial intelligence (AI) for human training, assistance, and explainability [0.24629531282150877]
We show how a neural network might instruct human trainees as an alternative to traditional approaches to explainable AI (XAI).
An AI examines human actions and calculates variations on the human strategy that lead to better performance.
Results will be presented on AI instruction's ability to improve human decision-making and human-AI teaming in Hanabi.
arXiv Detail & Related papers (2021-11-02T16:46:46Z)
- The Threat of Offensive AI to Organizations [52.011307264694665]
This survey explores the threat of offensive AI to organizations.
First, we discuss how AI changes the adversary's methods, strategies, goals, and overall attack model.
Then, through a literature review, we identify 33 offensive AI capabilities which adversaries can use to enhance their attacks.
arXiv Detail & Related papers (2021-06-30T01:03:28Z)
- Suphx: Mastering Mahjong with Deep Reinforcement Learning [114.68233321904623]
We design an AI for Mahjong, named Suphx, based on deep reinforcement learning with some newly introduced techniques.
Suphx has demonstrated stronger performance than most top human players in terms of stable rank.
This is the first time a computer program has outperformed most top human players in Mahjong.
arXiv Detail & Related papers (2020-03-30T16:18:16Z)