Simplicity Lies in the Eye of the Beholder: A Strategic Perspective on Controllers in Reactive Synthesis
- URL: http://arxiv.org/abs/2509.04129v1
- Date: Thu, 04 Sep 2025 11:54:19 GMT
- Authors: Mickael Randour
- Abstract summary: This contribution focuses on the complexity of strategies in a variety of contexts. We discuss recent results concerning memory and randomness, and take a brief look at what lies beyond our traditional notions of complexity for strategies.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the game-theoretic approach to controller synthesis, we model the interaction between a system to be controlled and its environment as a game between these entities, and we seek an appropriate (e.g., winning or optimal) strategy for the system. This strategy then serves as a formal blueprint for a real-world controller. A common belief is that simple (e.g., using limited memory) strategies are better: corresponding controllers are easier to conceive and understand, and cheaper to produce and maintain. This invited contribution focuses on the complexity of strategies in a variety of synthesis contexts. We discuss recent results concerning memory and randomness, and take a brief look at what lies beyond our traditional notions of complexity for strategies.
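To make the abstract's notion of a "simple (e.g., using limited memory) strategy" concrete, the following is a minimal illustrative sketch, not taken from the paper: a finite-memory strategy represented as a Mealy machine, where the next action depends only on the current memory state and the last observation. The states, observations, and actions below are invented for illustration.

```python
class FiniteMemoryStrategy:
    """A finite-memory strategy: a Mealy machine whose memory state is
    updated on each observation and whose action depends only on the
    current state and the observation."""

    def __init__(self, initial_state, transition, output):
        self.state = initial_state
        self.transition = transition  # (state, observation) -> next state
        self.output = output          # (state, observation) -> action

    def act(self, observation):
        action = self.output[(self.state, observation)]
        self.state = self.transition[(self.state, observation)]
        return action


# Hypothetical example: a two-state strategy that plays "b" only when the
# previous observation was "x"; the single memory bit records that fact.
transition = {(0, "x"): 1, (0, "y"): 0, (1, "x"): 1, (1, "y"): 0}
output = {(0, "x"): "a", (0, "y"): "a", (1, "x"): "b", (1, "y"): "a"}

strategy = FiniteMemoryStrategy(0, transition, output)
print([strategy.act(o) for o in "xxyx"])  # -> ['a', 'b', 'a', 'a']
```

A memoryless (positional) strategy is the special case with a single memory state, which is one precise sense in which a strategy can be called "simple".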
Related papers
- Policy-Conditioned Policies for Multi-Agent Task Solving [53.67744322553693]
In this work, we propose a paradigm shift that bridges the gap by representing policies as human-interpretable source code. We reformulate the learning problem by utilizing Large Language Models (LLMs) as approximate interpreters. We formalize this process as *Programmatic Iterated Best Response* (PIBR), an algorithm where the policy code is optimized by textual gradients.
arXiv Detail & Related papers (2025-12-24T07:42:10Z) - From Atomic to Composite: Reinforcement Learning Enables Generalization in Complementary Reasoning [83.94543243783285]
We study Complementary Reasoning, a complex task that requires integrating internal parametric knowledge with external contextual information. We find that RL acts as a reasoning synthesizer rather than a probability amplifier.
arXiv Detail & Related papers (2025-12-01T18:27:25Z) - SynthStrategy: Extracting and Formalizing Latent Strategic Insights from LLMs in Organic Chemistry [4.220916808049659]
We introduce a methodology that leverages Large Language Models to distill synthetic knowledge into code. Our system analyzes synthesis routes and translates strategic principles into Python functions representing diverse strategic and tactical rules. This work bridges the tactical-strategic divide in CASP, enabling specification, search, and evaluation of routes by strategic criteria.
arXiv Detail & Related papers (2025-12-01T10:33:00Z) - Experience-Guided Adaptation of Inference-Time Reasoning Strategies [49.954515048847874]
Experience-Guided Reasoner (EGuR) generates tailored strategies at inference time based on accumulated experience. EGuR achieves up to 14% accuracy improvements over the strongest baselines while reducing computational costs by up to 111x.
arXiv Detail & Related papers (2025-11-14T17:45:28Z) - EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning [69.55982246413046]
We propose explicit policy optimization (EPO) for strategic reasoning. We train the strategic reasoning model via multi-turn reinforcement learning (RL), utilizing process rewards and iterative self-play. Our findings reveal various collaborative reasoning mechanisms emergent in EPO and its effectiveness in generating novel strategies.
arXiv Detail & Related papers (2025-02-18T03:15:55Z) - Preference-based opponent shaping in differentiable games [3.373994463906893]
We propose a novel Preference-based Opponent Shaping (PBOS) method to enhance the strategy learning process by shaping agents' preferences towards cooperation. We verify the performance of the PBOS algorithm in a variety of differentiable games.
arXiv Detail & Related papers (2024-12-04T06:49:21Z) - Towards a Unified View of Preference Learning for Large Language Models: A Survey [88.66719962576005]
Large Language Models (LLMs) exhibit remarkably powerful capabilities.
One of the crucial factors to achieve success is aligning the LLM's output with human preferences.
We decompose all the strategies in preference learning into four components: model, data, feedback, and algorithm.
arXiv Detail & Related papers (2024-09-04T15:11:55Z) - K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning [76.3114831562989]
Strategic reasoning requires Large Language Model (LLM) agents to adapt their strategies dynamically in multi-agent environments.
We propose a novel framework: "K-Level Reasoning with Large Language Models (K-R)".
arXiv Detail & Related papers (2024-02-02T16:07:05Z) - Strategic Reasoning with Language Models [35.63300060111918]
Strategic reasoning enables agents to cooperate, communicate, and compete with other agents in diverse situations.
Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.
This paper introduces an approach that uses pretrained Large Language Models with few-shot chain-of-thought examples to enable strategic reasoning for AI agents.
arXiv Detail & Related papers (2023-05-30T16:09:19Z) - Co-Learning Empirical Games and World Models [23.800790782022222]
Empirical games drive world models toward a broader consideration of possible game dynamics.
World models guide empirical games to efficiently discover new strategies through planning.
A new algorithm, Dyna-PSRO, co-learns an empirical game and a world model.
arXiv Detail & Related papers (2023-05-23T16:37:21Z) - Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action games without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - Exploration Policies for On-the-Fly Controller Synthesis: A Reinforcement Learning Approach [0.0]
We propose a new method based on Reinforcement Learning (RL).
Our agents learn from scratch in a partially observable RL task and outperform existing approaches overall, in instances unseen during training.
arXiv Detail & Related papers (2022-10-07T20:28:25Z) - Game Theoretic Rating in N-player general-sum games with Equilibria [26.166859475522106]
We propose novel algorithms suitable for N-player, general-sum rating of strategies in normal-form games according to the payoff rating system.
This enables well-established solution concepts, such as equilibria, to be leveraged to efficiently rate strategies in games with complex strategic interactions.
arXiv Detail & Related papers (2022-10-05T12:33:03Z) - Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
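Several of the listed papers learn equilibrium strategies from gameplay alone, without gradients or an explicit game model. As a toy illustration of that gradient-free principle (not any listed paper's actual method), fictitious play approximates the mixed-strategy equilibrium of matching pennies using only best responses to empirical play frequencies:

```python
import numpy as np

# Row player's payoff in matching pennies (zero-sum: column player gets -A).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

row_counts = np.array([1.0, 0.0])  # arbitrary opening moves
col_counts = np.array([1.0, 0.0])

for _ in range(20000):
    # Each player best-responds to the opponent's empirical mixture;
    # no payoff gradients are ever computed.
    col_mix = col_counts / col_counts.sum()
    row_counts[np.argmax(A @ col_mix)] += 1       # row maximizes
    row_mix = row_counts / row_counts.sum()
    col_counts[np.argmin(row_mix @ A)] += 1       # column minimizes

row_mix = row_counts / row_counts.sum()
print(row_mix)  # close to the equilibrium mixture [0.5, 0.5]
```

By Robinson's theorem, the empirical frequencies of fictitious play converge to a Nash equilibrium in two-player zero-sum games, which is why this oracle-style procedure needs nothing beyond the ability to evaluate payoffs of played actions.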
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.