GameTalk: Training LLMs for Strategic Conversation
- URL: http://arxiv.org/abs/2601.16276v1
- Date: Thu, 22 Jan 2026 19:18:39 GMT
- Title: GameTalk: Training LLMs for Strategic Conversation
- Authors: Victor Conchello Vendrell, Max Ruiz Luyten, Mihaela van der Schaar
- Abstract summary: We introduce GameTalk, a framework for training LLMs to make strategic decisions via multi-turn interactions. Unlike prior work that focuses on single-turn objectives or static action prediction, we train LLMs to optimize a global objective across full conversations. We evaluate this approach on a suite of increasingly complex games, designed to stress different aspects of reasoning, coordination, and opponent modeling.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Strategic decision-making in multi-agent settings is a key challenge for large language models (LLMs), particularly when coordination and negotiation must unfold over extended conversations. While recent work has explored the use of LLMs in isolated decision tasks, little attention has been given to optimizing long-term objectives through dialogue. We introduce GameTalk, a framework for training LLMs to make strategic decisions via multi-turn interactions. Unlike prior work that focuses on single-turn objectives or static action prediction, we train LLMs to optimize a global objective across full conversations. We achieve this by adapting fine-tuning methods like GRPO, DPO, and STaR to incorporate reward signals that depend on the entire interaction. We evaluate this approach on a suite of increasingly complex games, designed to stress different aspects of reasoning, coordination, and opponent modeling. Our results show that GameTalk significantly outperforms untrained models, especially under reward shaping, with DPO consistently yielding the strongest gains. These findings position conversational fine-tuning as a promising path for LLMs to reason, negotiate, and act in interactive environments.
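The abstract describes adapting preference-based fine-tuning methods such as DPO to reward signals that depend on the entire conversation rather than a single turn. A minimal sketch of one way that could look in practice — the names `Conversation` and `make_preference_pairs` are hypothetical illustrations, not taken from the paper — ranking full-conversation rollouts by their global reward and pairing higher-reward trajectories with lower-reward ones as DPO-style (chosen, rejected) pairs:

```python
from dataclasses import dataclass


@dataclass
class Conversation:
    turns: list      # the model's messages across the whole dialogue
    reward: float    # global reward assigned to the entire interaction


def make_preference_pairs(rollouts):
    """Turn conversation-level rewards into DPO-style preference pairs.

    Conversations are ranked by their global reward; each conversation is
    paired with the next-best one, so a preference loss would push the
    model toward trajectories with better whole-dialogue outcomes.
    """
    ranked = sorted(rollouts, key=lambda c: c.reward, reverse=True)
    pairs = []
    for chosen, rejected in zip(ranked, ranked[1:]):
        if chosen.reward > rejected.reward:  # skip exact ties
            pairs.append((chosen, rejected))
    return pairs


# Toy negotiation rollouts with hypothetical global rewards.
rollouts = [
    Conversation(["offer 5", "accept"], reward=1.0),
    Conversation(["offer 1", "walk away"], reward=-1.0),
    Conversation(["offer 3", "counter 4", "accept"], reward=0.5),
]
pairs = make_preference_pairs(rollouts)
print(len(pairs))                 # 2
print(pairs[0][0].reward)         # 1.0 (best rollout is the first "chosen")
```

The actual framework presumably attaches these pairs to a token-level DPO objective; this sketch only shows the conversation-level pairing step the abstract emphasizes.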
Related papers
- LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation [17.584631586928815]
We propose a linguistically grounded game-theoretic paradigm for multi-agent dialogue generation. Our framework relies on linguistically informed reasoning with minimal task-specific coupling. We evaluate our framework in simulated courtroom proceedings and debates, with human expert assessments showing significant gains in communication efficiency.
arXiv Detail & Related papers (2026-01-08T02:30:43Z) - OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models [23.174462069020695]
We present OnGoal, an LLM chat interface that helps users better manage goal progress. OnGoal provides real-time feedback on goal alignment through LLM-assisted evaluation. Using OnGoal, participants spent less time and effort to achieve their goals.
arXiv Detail & Related papers (2025-08-28T17:58:29Z) - Planning without Search: Refining Frontier LLMs with Offline Goal-Conditioned RL [62.984693936073974]
Large language models (LLMs) excel in tasks like question answering and dialogue. Complex interactive tasks, such as negotiation and persuasion, additionally require long-horizon reasoning and planning. We propose a novel approach that uses goal-conditioned value functions to guide the reasoning of LLM agents.
arXiv Detail & Related papers (2025-05-23T16:51:54Z) - DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors [19.83349341267686]
Large language model (LLM) agents excel at reactive dialogue but struggle with proactive, goal-driven interactions. We introduce DialogXpert, which proposes a small, high-quality set of candidate actions per turn. By tracking the user's emotions, DialogXpert tailors each decision to advance the task while nurturing a genuine, empathetic connection.
arXiv Detail & Related papers (2025-05-23T12:12:40Z) - Playpen: An Environment for Exploring Learning Through Conversational Interaction [84.0413820245725]
We investigate whether Dialogue Games can also serve as a source of feedback signals for learning. We introduce Playpen, an environment for off- and online learning through Dialogue Game self-play. We find that imitation learning through SFT improves performance on unseen instances, but negatively impacts other skills.
arXiv Detail & Related papers (2025-04-11T14:49:33Z) - Are LLMs Effective Negotiators? Systematic Evaluation of the Multifaceted Capabilities of LLMs in Negotiation Dialogues [4.738985706520995]
This work aims to systematically analyze the multifaceted capabilities of LLMs across diverse dialogue scenarios.
Our analysis highlights GPT-4's superior performance in many tasks while identifying specific challenges.
arXiv Detail & Related papers (2024-02-21T06:11:03Z) - Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations [70.7884839812069]
Large language models (LLMs) have emerged as powerful and general solutions to many natural language tasks.
However, many of the most important applications of language generation are interactive, where an agent has to talk to a person to reach a desired outcome.
In this work, we explore a new method for adapting LLMs with RL for such goal-directed dialogue.
arXiv Detail & Related papers (2023-11-09T18:45:16Z) - Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents [121.46051697742608]
We introduce a new dialogue policy planning paradigm to strategize dialogue problems with a tunable language model plug-in named PPDPP.
Specifically, we develop a novel training framework to facilitate supervised fine-tuning over available human-annotated data.
PPDPP consistently and substantially outperforms existing approaches on three different proactive dialogue applications.
arXiv Detail & Related papers (2023-11-01T03:20:16Z) - Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation [52.930183136111864]
We propose using scorable negotiation to evaluate Large Language Models (LLMs).
To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities.
We provide procedures to create new games and increase games' difficulty to have an evolving benchmark.
arXiv Detail & Related papers (2023-09-29T13:33:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.