Related papers: AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions

URL: http://arxiv.org/abs/2602.06008v1
Date: Thu, 05 Feb 2026 18:50:36 GMT
Title: AgenticPay: A Multi-Agent LLM Negotiation System for Buyer-Seller Transactions
Authors: Xianyang Liu, Shangding Gu, Dawn Song
Abstract summary: Large language model (LLM)-based agents are increasingly expected to negotiate, coordinate, and transact autonomously. We introduce AgenticPay, a benchmark and simulation framework for multi-agent buyer-seller negotiation driven by natural language.
Score: 49.49718899185783
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language model (LLM)-based agents are increasingly expected to negotiate, coordinate, and transact autonomously, yet existing benchmarks lack principled settings for evaluating language-mediated economic interaction among multiple agents. We introduce AgenticPay, a benchmark and simulation framework for multi-agent buyer-seller negotiation driven by natural language. AgenticPay models markets in which buyers and sellers possess private constraints and product-dependent valuations, and must reach agreements through multi-round linguistic negotiation rather than numeric bidding alone. The framework supports a diverse suite of over 110 tasks ranging from bilateral bargaining to many-to-many markets, with structured action extraction and metrics for feasibility, efficiency, and welfare. Benchmarking state-of-the-art proprietary and open-weight LLMs reveals substantial gaps in negotiation performance and highlights challenges in long-horizon strategic reasoning, establishing AgenticPay as a foundation for studying agentic commerce and language-based market interaction. Code and dataset are available at the link: https://github.com/SafeRL-Lab/AgenticPay.

Related papers

MARTI-MARS$^2$: Scaling Multi-Agent Self-Search via Reinforcement Learning for Code Generation [64.2621682259008]
We propose a Multi-Agent Reinforced Training and Inference Framework with Self-Search Scaling (MARTI-MARS2) to integrate policy learning with multi-agent tree search. We show that MARTI-MARS2 achieves 77.7%, outperforming strong baselines like GPT-5.1 on challenging code generation benchmarks.
arXiv Detail & Related papers (2026-02-08T07:28:44Z)
When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents [74.55061622246824]
Agent Market Arena (AMA) is the first lifelong, real-time benchmark for evaluating Large Language Model (LLM)-based trading agents. AMA integrates verified trading data, expert-checked news, and diverse agent architectures within a unified trading framework. It evaluates agents across GPT-4o, GPT-4.1, Claude-3.5-haiku, Claude-sonnet-4, and Gemini-2.0-flash.
arXiv Detail & Related papers (2025-10-13T17:54:09Z)
PartnerMAS: An LLM Hierarchical Multi-Agent Framework for Business Partner Selection on High-Dimensional Features [23.788838112113257]
We propose a hierarchical multi-agent framework that decomposes evaluation into three layers: a Planner Agent that designs strategies, Specialized Agents that perform role-specific assessments, and a Supervisor Agent that integrates their outputs. Across 140 cases, PartnerMAS consistently outperforms single-agent and debate-based multi-agent baselines, achieving up to 10-15% higher match rates.
arXiv Detail & Related papers (2025-09-28T19:39:03Z)
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents [59.825725526176655]
Large Language Models (LLMs) have shown remarkable capabilities as autonomous agents. Existing benchmarks either focus on single-agent tasks or are confined to narrow domains, failing to capture the dynamics of multi-agent coordination and competition. We introduce MultiAgentBench, a benchmark designed to evaluate LLM-based multi-agent systems across diverse, interactive scenarios.
arXiv Detail & Related papers (2025-03-03T05:18:50Z)
TradingAgents: Multi-Agents LLM Financial Trading Framework [4.293484524693143]
TradingAgents proposes a novel stock trading framework inspired by trading firms. It features LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance.
arXiv Detail & Related papers (2024-12-28T12:54:06Z)
COMMA: A Communicative Multimodal Multi-Agent Benchmark [15.329501174451677]
We introduce COMMA: a novel puzzle benchmark designed to evaluate the collaborative performance of multimodal multi-agent systems. Our findings reveal surprising weaknesses in state-of-the-art models, including strong proprietary models like GPT-4o and reasoning models like o4-mini. Many chain-of-thought reasoning models such as R1-Onevision and LLaVA-CoT struggle to outperform even a random baseline in agent-agent collaboration.
arXiv Detail & Related papers (2024-10-10T02:49:47Z)
Assistive Large Language Model Agents for Socially-Aware Negotiation Dialogues [47.977032883078664]
We develop assistive agents based on Large Language Models (LLMs) that aid interlocutors in business negotiations. A third LLM acts as a remediator agent to rewrite utterances violating norms for improving negotiation outcomes. We provide rich empirical evidence to demonstrate its effectiveness in negotiations across three different negotiation topics.
arXiv Detail & Related papers (2024-01-29T09:07:40Z)
Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation [52.930183136111864]
We propose using scorable negotiation to evaluate Large Language Models (LLMs). To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities. We provide procedures to create new games and increase games' difficulty to have an evolving benchmark.
arXiv Detail & Related papers (2023-09-29T13:33:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.