Related papers: ASTRA: A Negotiation Agent with Adaptive and Strategic Reasoning through Action in Dynamic Offer Optimization

URL: http://arxiv.org/abs/2503.07129v1
Date: Mon, 10 Mar 2025 09:57:50 GMT
Title: ASTRA: A Negotiation Agent with Adaptive and Strategic Reasoning through Action in Dynamic Offer Optimization
Authors: Deuksin Kwon, Jiwon Hae, Emma Clift, Daniel Shamsoddini, Jonathan Gratch, Gale M. Lucas
Abstract summary: Negotiation requires dynamically balancing self-interest and cooperation to maximize one's own utility. We introduce principle-driven negotiation agents, powered by ASTRA, a novel framework for turn-level offer optimization. ASTRA operates in three stages: (1) interpreting counterpart behavior, (2) optimizing counteroffers via a linear programming (LP) solver, and (3) selecting offers based on negotiation tactics and the partner's acceptance probability.
Score: 3.5844764276701726
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Negotiation requires dynamically balancing self-interest and cooperation to maximize one's own utility. Yet, existing agents struggle due to bounded rationality in human data, low adaptability to counterpart behavior, and limited strategic reasoning. To address this, we introduce principle-driven negotiation agents, powered by ASTRA, a novel framework for turn-level offer optimization grounded in two core principles: opponent modeling and Tit-for-Tat reciprocity. ASTRA operates in three stages: (1) interpreting counterpart behavior, (2) optimizing counteroffers via a linear programming (LP) solver, and (3) selecting offers based on negotiation tactics and the partner's acceptance probability. Through simulations and human evaluations, our agent effectively adapts to an opponent's shifting stance and achieves favorable outcomes through enhanced adaptability and strategic reasoning. Beyond improving negotiation performance, it also serves as a powerful coaching tool, offering interpretable strategic feedback and optimal offer recommendations.
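The three stages named in the abstract can be illustrated with a toy multi-issue split. This is a minimal sketch under stated assumptions, not the authors' implementation: the item counts, the preference weights, the reciprocity rule in `reciprocal_floor`, and the logistic `accept_prob` model are all hypothetical, and a brute-force enumeration stands in for the LP solver so that the example needs only the standard library.

```python
from itertools import product
import math

UNITS = 3  # hypothetical: 3 units of each issue are divided between the two parties


def utility(bundle, weights):
    """Linear additive utility of a bundle under per-issue weights."""
    return sum(w * q for w, q in zip(weights, bundle))


def partner_share(mine):
    """Whatever we do not take goes to the partner."""
    return tuple(UNITS - q for q in mine)


def reciprocal_floor(prev_floor, my_utils_from_partner_offers):
    """Stage 1 (assumed Tit-for-Tat rule): mirror the partner's last move by
    shifting the utility we guarantee them by the change in utility they offered us."""
    if len(my_utils_from_partner_offers) < 2:
        return prev_floor
    concession = my_utils_from_partner_offers[-1] - my_utils_from_partner_offers[-2]
    return prev_floor + concession


def feasible_offers(my_w, partner_w, floor):
    """Stage 2 stand-in: enumerate every integer split (instead of solving an LP),
    keep those that give the partner at least `floor`, best for us first."""
    offers = []
    for mine in product(range(UNITS + 1), repeat=len(my_w)):
        if utility(partner_share(mine), partner_w) >= floor:
            offers.append((utility(mine, my_w), mine))
    return sorted(offers, reverse=True)


def accept_prob(partner_utility, aspiration, sharpness=0.5):
    """Hypothetical logistic model of the partner's acceptance probability."""
    return 1.0 / (1.0 + math.exp(-sharpness * (partner_utility - aspiration)))


def select_offer(offers, partner_w, aspiration, top_k=5):
    """Stage 3: among the top-k candidates, pick the one with the highest
    expected own utility (own utility times acceptance probability)."""
    def expected(item):
        own, mine = item
        return own * accept_prob(utility(partner_share(mine), partner_w), aspiration)
    return max(offers[:top_k], key=expected)


MY_W, PARTNER_W = (5, 4, 3), (3, 4, 5)    # hypothetical preference weights
floor = reciprocal_floor(13, [12, 17])    # partner's last offer improved our utility by 5
offers = feasible_offers(MY_W, PARTNER_W, floor)
choice = select_offer(offers, PARTNER_W, aspiration=20)
print(floor, offers[0], choice)
```

Here `offers[0]` is the split that is best for the agent under the reciprocity floor, while `choice` may trade some of that utility for a higher estimated chance of acceptance. With real-valued allocations or many issues, an actual LP solver would replace the enumeration.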

Related papers

SAND: Boosting LLM Agents with Self-Taught Action Deliberation [53.732649189709285]
Large Language Model (LLM) agents are commonly tuned with supervised finetuning on ReAct-style expert trajectories or preference optimization over pairwise rollouts. We propose Self-taught ActioN Deliberation (SAND) framework, enabling LLM agents to explicitly deliberate over candidate actions before committing to one. SAND achieves an average 20% improvement over initial supervised finetuning and also outperforms state-of-the-art agent tuning approaches.
arXiv Detail & Related papers (2025-07-10T05:38:15Z)
Learning to Lead: Incentivizing Strategic Agents in the Dark [50.93875404941184]
We study an online learning version of the generalized principal-agent model. We develop the first provably sample-efficient algorithm for this challenging setting. We establish a near optimal $\tilde{O}(\sqrt{T})$ regret bound for learning the principal's optimal policy.
arXiv Detail & Related papers (2025-06-10T04:25:04Z)
LLM Agents for Bargaining with Utility-based Feedback [23.357706450282002]
We introduce a comprehensive framework centered on utility-based feedback. Our contributions are threefold: (1) BargainArena, a novel benchmark dataset; (2) human-aligned, economically-grounded evaluation metrics inspired by utility theory; and (3) a structured feedback mechanism enabling LLMs to iteratively refine their bargaining strategies.
arXiv Detail & Related papers (2025-05-29T02:07:27Z)
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning [69.55982246413046]
We propose explicit policy optimization (EPO) for strategic reasoning. EPO provides strategies in open-ended action space and can be plugged into arbitrary LLM agents to motivate goal-directed behavior. Experiments across social and physical domains demonstrate EPO's ability of long-term goal alignment.
arXiv Detail & Related papers (2025-02-18T03:15:55Z)
Reason4Rec: Large Language Models for Recommendation with Deliberative User Preference Alignment [69.11529841118671]
We propose a new Deliberative Recommendation task, which incorporates explicit reasoning about user preferences as an additional alignment goal. We then introduce the Reasoning-powered Recommender framework for deliberative user preference alignment.
arXiv Detail & Related papers (2025-02-04T07:17:54Z)
From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process. We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
Toward Optimal LLM Alignments Using Two-Player Games [86.39338084862324]
In this paper, we investigate alignment through the lens of two-agent games, involving iterative interactions between an adversarial and a defensive agent. We theoretically demonstrate that this iterative reinforcement learning optimization converges to a Nash Equilibrium for the game induced by the agents. Experimental results in safety scenarios demonstrate that learning in such a competitive environment not only fully trains agents but also leads to policies with enhanced generalization capabilities for both adversarial and defensive agents.
arXiv Detail & Related papers (2024-06-16T15:24:50Z)
Targeted Data Acquisition for Evolving Negotiation Agents [6.953246373478702]
Successful negotiators must learn how to balance optimizing for self-interest and cooperation. Current artificial negotiation agents often heavily depend on the quality of the static datasets they were trained on. We introduce a targeted data acquisition framework where we guide the exploration of a reinforcement learning agent.
arXiv Detail & Related papers (2021-06-14T19:45:59Z)
Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts [52.844741540236285]
This paper investigates model-based methods in multi-agent reinforcement learning (MARL). We propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy (AORPO).
arXiv Detail & Related papers (2021-05-07T16:20:22Z)
An Autonomous Negotiating Agent Framework with Reinforcement Learning Based Strategies and Adaptive Strategy Switching Mechanism [3.4376560669160394]
This work focuses on solving the problem of expert selection and adapting to the opponent's behaviour with our Autonomous Negotiating Agent Framework. Our framework has a reviewer component which enables self-enhancement capability by deciding to include new strategies or replace old ones with better strategies periodically.
arXiv Detail & Related papers (2021-02-06T14:38:03Z)
Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution. We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z)
Learnable Strategies for Bilateral Agent Negotiation over Multiple Issues [6.12762193927784]
We present a novel bilateral negotiation model that allows a self-interested agent to learn how to negotiate over multiple issues. The model relies upon interpretable strategy templates representing the tactics the agent should employ during the negotiation. It learns template parameters to maximize the average utility received over multiple negotiations, thus resulting in optimal bid acceptance and generation.
arXiv Detail & Related papers (2020-09-17T13:52:18Z)
Automated Configuration of Negotiation Strategies [0.0]
Bidding and acceptance strategies have a substantial impact on the outcome of negotiations in scenarios with linear additive and nonlinear utility functions. We develop a method leveraging automated algorithm configuration to find the best strategies for a specific set of negotiation settings. We show that our automatically configured agent outperforms all other agents, with a 5.1% increase in negotiation payoff compared to the next-best agent.
arXiv Detail & Related papers (2020-03-31T20:31:33Z)
A Deep Reinforcement Learning Approach to Concurrent Bilateral Negotiation [6.484413431061962]
We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed.
arXiv Detail & Related papers (2020-01-31T12:05:46Z)

This list is automatically generated from the titles and abstracts of the papers on this site.