Reinforcing Competitive Multi-Agents for Playing 'So Long Sucker'
- URL: http://arxiv.org/abs/2411.11057v2
- Date: Tue, 14 Oct 2025 19:11:58 GMT
- Title: Reinforcing Competitive Multi-Agents for Playing 'So Long Sucker'
- Authors: Medant Sharan, Chandranath Adak
- Abstract summary: This paper investigates the strategy game So Long Sucker (SLS) as a novel benchmark for multi-agent reinforcement learning (MARL). We introduce the first publicly available computational framework for SLS, complete with a graphical user interface and benchmarking support for reinforcement learning algorithms.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the strategy game So Long Sucker (SLS) as a novel benchmark for multi-agent reinforcement learning (MARL). Unlike traditional board or video game testbeds, SLS is distinguished by its coalition formation, strategic deception, and dynamic elimination rules, making it a uniquely challenging environment for autonomous agents. We introduce the first publicly available computational framework for SLS, complete with a graphical user interface and benchmarking support for reinforcement learning algorithms. Using classical deep reinforcement learning methods (e.g., DQN, DDQN, and Dueling DQN), we train self-playing agents to learn the rules and basic strategies of SLS. Experimental results demonstrate that, although these agents achieve roughly half of the maximum attainable reward and consistently outperform random baselines, they require long training horizons (~2000 games) and still commit occasional illegal moves, highlighting both the promise and limitations of classical reinforcement learning. Our findings establish SLS as a negotiation-aware benchmark for MARL, opening avenues for future research that integrates game-theoretic reasoning, coalition-aware strategies, and advanced reinforcement learning architectures to better capture the social and adversarial dynamics of complex multi-agent games.
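Two mechanics the abstract touches on, dueling Q-value aggregation and the problem of illegal moves, can be sketched in a minimal, hedged form. All names below are hypothetical and not taken from the paper's released SLS framework; the sketch only illustrates masked epsilon-greedy action selection over dueling-style Q-values.

```python
import numpy as np

def dueling_q(value, advantages):
    """Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)."""
    return value + advantages - advantages.mean()

def masked_epsilon_greedy(q_values, legal_mask, epsilon, rng):
    """Pick an action, restricted to the legal moves of the current state."""
    legal = np.flatnonzero(legal_mask)
    if rng.random() < epsilon:
        # Explore: sample uniformly among legal moves only.
        return int(rng.choice(legal))
    # Exploit: set illegal moves to -inf so argmax never picks them.
    masked_q = np.where(legal_mask, q_values, -np.inf)
    return int(np.argmax(masked_q))

rng = np.random.default_rng(0)
q = dueling_q(1.0, np.array([0.5, -0.2, 0.3, 0.0]))
mask = np.array([True, False, True, False])  # hypothetical legality mask
a = masked_epsilon_greedy(q, mask, epsilon=0.0, rng=rng)
print(a)  # -> 0, the highest-Q legal action
```

Masking of this kind is one standard way to suppress the illegal moves the abstract reports; the trained agents in the paper may instead learn legality implicitly from reward penalties.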
Related papers
- GIFT: Games as Informal Training for Generalizable LLMs [64.47890325824763]
Large Language Models (LLMs) struggle with "practical wisdom" and generalizable intelligence. This gap arises from a lack of informal learning, which thrives on interactive feedback rather than goal-oriented instruction. We propose treating Games as a primary environment for LLM informal learning, leveraging their intrinsic reward signals and abstracted complexity.
arXiv Detail & Related papers (2026-01-09T08:42:44Z)
- MARS: Reinforcing Multi-Agent Reasoning of LLMs through Self-Play in Strategic Games [30.876486250077956]
We introduce MARS, an end-to-end RL framework that incentivizes Multi-Agent Reasoning of LLMs through Self-play in both cooperative and competitive games. Our results establish end-to-end RL training with self-play in strategic games as a powerful approach for developing generalizable multi-agent reasoning capabilities in LLMs.
arXiv Detail & Related papers (2025-10-17T08:08:06Z)
- PillagerBench: Benchmarking LLM-Based Agents in Competitive Minecraft Team Environments [48.892997022500765]
We introduce PillagerBench, a framework for evaluating multi-agent systems in real-time competitive team-vs-team scenarios in Minecraft. We also propose TactiCrafter, an LLM-based multi-agent system that facilitates teamwork through human-readable tactics. Our evaluation demonstrates that TactiCrafter outperforms baseline approaches and showcases adaptive learning through self-play.
arXiv Detail & Related papers (2025-09-07T22:51:12Z)
- Society of Mind Meets Real-Time Strategy: A Hierarchical Multi-Agent Framework for Strategic Reasoning [16.35236123729838]
We propose a hierarchical multi-agent framework that employs specialized imitation learning agents under a meta-controller called the Strategic Planner (SP). Through expert demonstrations, each specialized agent learns a distinctive strategy, such as aerial support or defensive maneuvers, and produces coherent, structured multistep action sequences. The SP then orchestrates these proposals into a single, environmentally adaptive plan that ensures local decisions align with long-term strategies.
arXiv Detail & Related papers (2025-08-08T05:57:12Z)
- Who is a Better Player: LLM against LLM [53.46608216197315]
We propose an adversarial benchmarking framework to assess the comprehensive performance of Large Language Models (LLMs) through board game competitions. We introduce Qi Town, a specialized evaluation platform that supports 5 widely played games and involves 20 LLM-driven players.
arXiv Detail & Related papers (2025-08-05T06:41:47Z)
- Agents of Change: Self-Evolving LLM Agents for Strategic Planning [28.172006841163938]
HexMachina is a continual learning multi-agent system that separates environment discovery from strategy improvement. In controlled Catanatron experiments, HexMachina learns from scratch and evolves players that outperform the strongest human-crafted baseline.
arXiv Detail & Related papers (2025-06-05T05:45:24Z)
- Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search [32.657454056329875]
Large language models (LLMs) exhibit strong generalization and zero-shot capabilities, but struggle with tasks that require detailed planning and decision-making. We introduce STRATEGIST, a novel approach that integrates the strengths of both high-level strategy learning and low-level tree search. We demonstrate the effectiveness of STRATEGIST in learning optimal strategies for competitive, multi-turn games with partial information.
arXiv Detail & Related papers (2024-08-20T08:22:04Z)
- FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning [25.857375787748715]
We present FightLadder, a real-time fighting game platform, to empower competitive MARL research.
We provide implementations of state-of-the-art MARL algorithms for competitive games, as well as a set of evaluation metrics.
We demonstrate the feasibility of this platform by training a general agent that consistently defeats 12 built-in characters in single-player mode.
arXiv Detail & Related papers (2024-06-04T08:04:23Z)
- K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning [76.3114831562989]
Strategic reasoning requires Large Language Model (LLM) agents to adapt their strategies dynamically in multi-agent environments.
We propose a novel framework: "K-Level Reasoning with Large Language Models (K-R)".
arXiv Detail & Related papers (2024-02-02T16:07:05Z)
- Two-Step Reinforcement Learning for Multistage Strategy Card Game [0.0]
This study introduces a two-step reinforcement learning (RL) strategy tailored for "The Lord of the Rings: The Card Game (LOTRCG)".
This research diverges from conventional RL methods by adopting a phased learning approach.
The paper also explores a multi-agent system, where distinct RL agents are employed for various decision-making aspects of the game.
arXiv Detail & Related papers (2023-11-29T01:31:21Z)
- Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning [69.19840497497503]
It is argued that the commonly used action matching principle is more like an explanation of deep neural networks (DNNs) than the interpretation of RL agents.
We propose to consider rewards, the essential objective of RL agents, as the basis for interpreting RL agents.
We verify and evaluate our method on the Atari 2600 games as well as Duckietown, a challenging self-driving car simulator environment.
arXiv Detail & Related papers (2023-09-04T09:09:54Z)
- Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO [50.58083807719749]
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
This competition targets robustness and generalization in multi-agent systems.
We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.
arXiv Detail & Related papers (2023-08-30T07:16:11Z)
- Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning [95.78031053296513]
No-press Diplomacy is a complex strategy game involving both cooperation and competition.
We introduce a planning algorithm we call DiL-piKL that regularizes a reward-maximizing policy toward a human imitation-learned policy.
We show that DiL-piKL can be extended into a self-play reinforcement learning algorithm we call RL-DiL-piKL.
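As a hedged sketch (the symbols below are illustrative, not the paper's exact notation), a piKL-style policy maximizes expected action value while staying close to a human imitation-learned anchor policy $\tau$:

```latex
% KL-regularized best response toward a human imitation policy \tau
\pi^{*} = \arg\max_{\pi}\; \mathbb{E}_{a \sim \pi}\!\left[ Q(s,a) \right]
          - \lambda\, D_{\mathrm{KL}}\!\left( \pi \,\|\, \tau \right),
\qquad
\pi^{*}(a \mid s) \propto \tau(a \mid s)\, \exp\!\left( Q(s,a)/\lambda \right)
```

The penalty strength $\lambda$ trades off reward maximization against human-likeness: as $\lambda \to \infty$ the policy collapses to the imitation policy, while $\lambda \to 0$ recovers a pure best response.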
arXiv Detail & Related papers (2022-10-11T14:47:35Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Learning Monopoly Gameplay: A Hybrid Model-Free Deep Reinforcement Learning and Imitation Learning Approach [31.066718635447746]
Reinforcement Learning (RL) relies on an agent interacting with an environment to maximize the cumulative sum of rewards received by it.
In the multi-player game of Monopoly, players have to make several decisions every turn, involving complex actions such as making trades.
This paper introduces a Hybrid Model-Free Deep RL (DRL) approach that is capable of playing and learning winning strategies of Monopoly.
arXiv Detail & Related papers (2021-03-01T01:40:02Z)
- DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games [137.86426963572214]
We introduce DeepCrawl, a fully-playable Roguelike prototype for iOS and Android in which all agents are controlled by policy networks trained using Deep Reinforcement Learning (DRL).
Our aim is to understand whether recent advances in DRL can be used to develop convincing behavioral models for non-player characters in videogames.
arXiv Detail & Related papers (2020-12-03T13:53:29Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
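Assuming an ensemble of $N$ Q-functions $Q_{\theta_i}$, these two ingredients can be sketched as follows (illustrative notation, not copied from the paper):

```latex
% (a) weighted Bellman backup: targets with high ensemble disagreement get lower weight
\mathcal{L}(\theta_i) = w(s',a')\,\big( Q_{\theta_i}(s,a) - r - \gamma \max_{a'} \bar{Q}_{\mathrm{target}}(s',a') \big)^2,
\qquad
w(s',a') = \sigma\!\big( -\bar{\sigma}_Q(s',a')\, T \big) + 0.5

% (b) UCB action selection: mean Q plus an uncertainty bonus for exploration
a_t = \arg\max_{a}\; \Big[ \operatorname{mean}_i Q_{\theta_i}(s_t,a)
      + \lambda\, \operatorname{std}_i Q_{\theta_i}(s_t,a) \Big]
```

Here $\sigma$ is the sigmoid, $\bar{\sigma}_Q$ the ensemble standard deviation of the target Q-values, and $T$, $\lambda$ are temperature and exploration hyperparameters.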
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
- The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research.
We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL.
We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)
- Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation [53.262360083572005]
We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL).
We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL) and game-theoretic RL (GT-RL).
Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
arXiv Detail & Related papers (2020-03-21T00:43:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.