L2E: Learning to Exploit Your Opponent
- URL: http://arxiv.org/abs/2102.09381v1
- Date: Thu, 18 Feb 2021 14:27:59 GMT
- Title: L2E: Learning to Exploit Your Opponent
- Authors: Zhe Wu, Kai Li, Enmin Zhao, Hang Xu, Meng Zhang, Haobo Fu, Bo An,
Junliang Xing
- Abstract summary: We propose a novel Learning to Exploit framework for implicit opponent modeling.
L2E acquires the ability to exploit opponents through a few interactions with different opponents during training.
We propose a novel opponent strategy generation algorithm that produces effective opponents for training automatically.
- Score: 66.66334543946672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Opponent modeling is essential to exploit sub-optimal opponents in strategic
interactions. Most previous works focus on building explicit models to directly
predict the opponents' styles or strategies, which require a large amount of
data to train the model and lack adaptability to unknown opponents. In this
work, we propose a novel Learning to Exploit (L2E) framework for implicit
opponent modeling. L2E acquires the ability to exploit opponents through a few
interactions with different opponents during training, and can therefore quickly
adapt to new opponents with unknown styles at test time. We propose a novel
opponent strategy generation algorithm that produces effective opponents for
training automatically. We evaluate L2E on two poker games and one grid soccer
game, which are commonly used benchmarks for opponent modeling.
Comprehensive experimental results indicate that L2E quickly adapts to diverse
styles of unknown opponents.
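Below is a minimal sketch of the implicit opponent-modeling idea described in the abstract, assuming a Reptile-style first-order meta-update, a simple policy-gradient exploitation loss, and placeholder sample_opponent / rollout helpers; none of these specifics come from the paper itself. The policy is meta-trained so that a few gradient steps on data collected against a sampled opponent yield a policy that exploits that opponent.

```python
# Illustrative meta-adaptation sketch (not the authors' L2E implementation).
# Assumptions: sample_opponent() returns an opponent policy, and
# rollout(policy, opponent) returns a batch dict with "obs", "actions", "returns".
import copy
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return torch.distributions.Categorical(logits=self.net(obs))

def exploit_loss(policy, batch):
    """Policy-gradient surrogate: -log pi(a|s) * return."""
    dist = policy(batch["obs"])
    return -(dist.log_prob(batch["actions"]) * batch["returns"]).mean()

def meta_train(policy, sample_opponent, rollout, meta_steps=1000,
               inner_lr=0.1, outer_lr=1e-3, inner_steps=3):
    meta_opt = torch.optim.Adam(policy.parameters(), lr=outer_lr)
    for _ in range(meta_steps):
        opponent = sample_opponent()          # training opponent (see strategy generation)
        adapted = copy.deepcopy(policy)       # fast weights adapted to this opponent
        inner_opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
        for _ in range(inner_steps):          # a few exploiting interactions
            batch = rollout(adapted, opponent)
            inner_opt.zero_grad()
            exploit_loss(adapted, batch).backward()
            inner_opt.step()
        # First-order meta-update: move slow weights toward the adapted fast weights.
        meta_opt.zero_grad()
        for slow, fast in zip(policy.parameters(), adapted.parameters()):
            slow.grad = slow.data - fast.data
        meta_opt.step()
```

In this reading, sample_opponent plays the role of the paper's opponent strategy generation step, which is responsible for automatically producing effective training opponents; the generation procedure itself is not sketched here.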
Related papers
- Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena [126.70522244144088]
We introduce Arena Learning, an innovative offline strategy designed to simulate arena battles using AI-driven annotations.
Arena Learning ensures precise evaluations and maintains consistency between offline simulations and online competitions.
We apply Arena Learning to train our target model, WizardLM-$\beta$, and demonstrate significant performance enhancements.
arXiv Detail & Related papers (2024-07-15T11:26:07Z)
- All by Myself: Learning Individualized Competitive Behaviour with a Contrastive Reinforcement Learning optimization [57.615269148301515]
In a competitive game scenario, a set of agents must learn decisions that maximize their own goals while minimizing their adversaries' goals at the same time.
We propose a novel model composed of three neural layers that learn a representation of the competitive game, map the strategies of specific opponents, and learn how to disrupt them.
Our experiments demonstrate that our model achieves better performance when playing against offline, online, and competitive-specific models, in particular when playing against the same opponent multiple times.
arXiv Detail & Related papers (2023-10-02T08:11:07Z)
- Know your Enemy: Investigating Monte-Carlo Tree Search with Opponent Models in Pommerman [14.668309037894586]
In combination with Reinforcement Learning, Monte-Carlo Tree Search has been shown to outperform human grandmasters in games such as Chess, Shogi, and Go.
We investigate techniques that transform general-sum multiplayer games into single-player and two-player games.
arXiv Detail & Related papers (2023-05-22T16:39:20Z)
- Universal Adversarial Training with Class-Wise Perturbations [78.05383266222285]
Adversarial training is the most widely used method for defending against adversarial attacks.
In this work, we find that a universal adversarial perturbation (UAP) does not attack all classes equally.
We improve the state-of-the-art universal adversarial training (UAT) by proposing to utilize class-wise UAPs during adversarial training.
arXiv Detail & Related papers (2021-04-07T09:05:49Z)
- Yet Meta Learning Can Adapt Fast, It Can Also Break Easily [53.65787902272109]
We study adversarial attacks on meta learning under the few-shot classification problem.
We propose the first attacking algorithm against meta learning under various settings.
arXiv Detail & Related papers (2020-09-02T15:03:14Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for the algorithm that depend on the regularity of the opponent's response.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning: Results for the Fighting Game AI Competition [9.75720700239984]
We propose a novel algorithm that combines Rolling Horizon Evolution Algorithm (RHEA) with opponent model learning.
Our proposed bot with the policy-gradient-based opponent model is the only one among the top five bots in the 2019 competition that does not use Monte-Carlo Tree Search (MCTS).
arXiv Detail & Related papers (2020-03-31T04:44:33Z)
- Deep Reinforcement Learning for FlipIt Security Game [2.0624765454705654]
We describe a deep learning model in which agents adapt to different classes of opponents and learn the optimal counter-strategy.
We apply our model to FlipIt, a two-player security game in which both players, the attacker and the defender, compete for ownership of a shared resource.
Our model is a deep neural network combined with Q-learning and is trained to maximize the defender's time of ownership of the resource (a toy sketch of this setup appears right after this list).
arXiv Detail & Related papers (2020-02-28T18:26:24Z)
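As referenced above, here is a toy tabular sketch of the FlipIt-style setup from the last related paper, under strong simplifying assumptions: a discrete-time game, a randomly flipping attacker, and made-up flip-cost and learning-rate values. The paper itself pairs Q-learning with a deep network rather than a table; this only illustrates the reward structure in which the defender trades flip costs against time of ownership.

```python
# Toy tabular Q-learning for a simplified, discrete-time FlipIt-like game
# (illustrative only; constants and the attacker model are assumptions).
import random
from collections import defaultdict

FLIP_COST = 4.0       # cost the defender pays per flip (assumed value)
ATTACK_PROB = 0.1     # per-step chance that the attacker flips (assumed opponent)
MAX_WAIT = 20         # cap on the observed "steps since my last flip"

def run_episode(q, eps=0.1, alpha=0.1, gamma=0.95, horizon=200):
    """Plays one episode and updates the Q-table in place; returns total reward."""
    wait, defender_owns, total = 0, True, 0.0
    for _ in range(horizon):
        state = min(wait, MAX_WAIT)
        # Epsilon-greedy over the two actions: 0 = wait, 1 = flip.
        if random.random() < eps:
            action = random.randrange(2)
        else:
            action = max((0, 1), key=lambda a: q[(state, a)])
        if action == 1:                    # defender reclaims the resource
            defender_owns, wait = True, 0
        if random.random() < ATTACK_PROB:  # stealthy attacker move
            defender_owns = False
        reward = (1.0 if defender_owns else 0.0) - (FLIP_COST if action == 1 else 0.0)
        wait += 1
        next_state = min(wait, MAX_WAIT)
        best_next = max(q[(next_state, a)] for a in (0, 1))
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        total += reward
    return total

q_table = defaultdict(float)
for _ in range(2000):
    run_episode(q_table)
```

The learned policy ends up flipping periodically, balancing the assumed flip cost against the expected time the attacker has held the resource since the defender's last move.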
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.