Related papers: TradeTrap: Are LLM-based Trading Agents Truly Reliable and Faithful?

URL: http://arxiv.org/abs/2512.02261v1
Date: Mon, 01 Dec 2025 23:06:42 GMT
Title: TradeTrap: Are LLM-based Trading Agents Truly Reliable and Faithful?
Authors: Lewen Yan, Jilin Mei, Tianyi Zhou, Lige Huang, Jie Zhang, Dongrui Liu, Jing Shao
Abstract summary: TradeTrap is a unified evaluation framework for systematically stress-testing both adaptive and procedural autonomous trading agents. It targets four core components of autonomous trading agents: market intelligence, strategy formulation, portfolio and ledger handling, and trade execution. Experiments show that small perturbations at a single component can propagate through the agent decision loop and induce extreme concentration, runaway exposure, and large portfolio drawdowns.
Score: 44.01987401527335
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: LLM-based trading agents are increasingly deployed in real-world financial markets to perform autonomous analysis and execution. However, their reliability and robustness under adversarial or faulty conditions remain largely unexamined, despite operating in high-risk, irreversible financial environments. We propose TradeTrap, a unified evaluation framework for systematically stress-testing both adaptive and procedural autonomous trading agents. TradeTrap targets four core components of autonomous trading agents: market intelligence, strategy formulation, portfolio and ledger handling, and trade execution, and evaluates their robustness under controlled system-level perturbations. All evaluations are conducted in a closed-loop historical backtesting setting on real US equity market data with identical initial conditions, enabling fair and reproducible comparisons across agents and attacks. Extensive experiments show that small perturbations at a single component can propagate through the agent decision loop and induce extreme concentration, runaway exposure, and large portfolio drawdowns across both agent types, demonstrating that current autonomous trading agents can be systematically misled at the system level. Our code is available at https://github.com/Yanlewen/TradeTrap.
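The closed-loop idea in the abstract can be illustrated with a toy sketch. This is not the TradeTrap code or its API: the agent rule, the `perturb` hook, the `spoof_one_quote` attack, and the price series are all hypothetical, chosen only to show how a fault in one component (here, the market data the agent sees) can change the portfolio outcome when both runs start from identical initial conditions.

```python
from typing import Callable, List

def momentum_agent(prev_seen: float, seen: float) -> float:
    """Hypothetical rule-based agent: hold 100% of equity if the quote rose, else 0%."""
    return 1.0 if seen > prev_seen else 0.0

def run_backtest(prices: List[float],
                 perturb: Callable[[int, float], float] = lambda t, p: p,
                 cash: float = 10_000.0) -> List[float]:
    """Replay prices; the agent decides on perturbed quotes, trades fill at true prices."""
    shares = 0.0
    equity_curve = []
    for t in range(1, len(prices)):
        true_price = prices[t]
        # The agent only ever observes the (possibly faulty) market feed.
        target = momentum_agent(perturb(t - 1, prices[t - 1]), perturb(t, true_price))
        equity = cash + shares * true_price
        equity_curve.append(equity)
        # Rebalance to the target weight at the true price.
        shares = target * equity / true_price
        cash = equity - shares * true_price
    return equity_curve

def max_drawdown(curve: List[float]) -> float:
    """Largest peak-to-trough loss, as a fraction of the running peak."""
    peak, worst = curve[0], 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def spoof_one_quote(t: int, price: float) -> float:
    """Assumed attack for illustration: inflate a single observed quote by 5%."""
    return price * 1.05 if t == 2 else price

prices = [100.0, 99.0, 98.0, 97.0, 96.0]   # made-up falling market
clean = run_backtest(prices)
attacked = run_backtest(prices, perturb=spoof_one_quote)
print(max_drawdown(clean), max_drawdown(attacked))
```

In this sketch the unperturbed agent stays in cash throughout the falling market, while a single inflated quote makes it buy just before a further decline, so the attacked run ends with lower equity and a nonzero drawdown.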

Related papers

Gaming the Judge: Unfaithful Chain-of-Thought Can Undermine Agent Evaluation [76.5533899503582]
Large language models (LLMs) are increasingly used as judges to evaluate agent performance. We show this paradigm implicitly assumes that the agent's chain-of-thought (CoT) reasoning faithfully reflects both its internal reasoning and the underlying environment state. We demonstrate that manipulated reasoning alone can inflate false positive rates of state-of-the-art VLM judges by up to 90% across 800 trajectories spanning diverse web tasks.
arXiv Detail & Related papers (2026-01-21T06:07:43Z)
Robust Reinforcement Learning in Finance: Modeling Market Impact with Elliptic Uncertainty Sets [57.179679246370114]
In financial applications, reinforcement learning (RL) agents are commonly trained on historical data, where their actions do not influence prices. During deployment, these agents trade in live markets where their own transactions can shift asset prices, a phenomenon known as market impact. Traditional robust RL approaches address this model misspecification by optimizing the worst-case performance over a set of uncertainties. We develop a novel class of elliptic uncertainty sets, enabling efficient and tractable robust policy evaluation.
arXiv Detail & Related papers (2025-10-22T18:22:25Z)
When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents [74.55061622246824]
Agent Market Arena (AMA) is the first lifelong, real-time benchmark for evaluating Large Language Model (LLM)-based trading agents. AMA integrates verified trading data, expert-checked news, and diverse agent architectures within a unified trading framework. It evaluates agents across GPT-4o, GPT-4.1, Claude-3.5-haiku, Claude-sonnet-4, and Gemini-2.0-flash.
arXiv Detail & Related papers (2025-10-13T17:54:09Z)
QuantAgents: Towards Multi-agent Financial System via Simulated Trading [40.636918662488505]
QuantAgents is a multi-agent system integrating simulated trading. QuantAgents comprises four agents: a simulated trading analyst, a risk control analyst, a market news analyst, and a manager. Our system incentivizes agents to receive feedback on two fronts: performance in real-world markets and predictive accuracy in simulated trading.
arXiv Detail & Related papers (2025-10-06T09:45:57Z)
TradingGroup: A Multi-Agent Trading System with Self-Reflection and Data-Synthesis [15.865159423176982]
TradingGroup is a multi-agent trading system designed to address limitations through a self-reflective architecture and an end-to-end data-synthesis pipeline. TradingGroup consists of specialized agents for news sentiment analysis, financial report interpretation, stock trend forecasting, and trading style adaptation, plus a trading decision-making agent. Specifically, we design self-reflection mechanisms for the stock forecasting, style, and decision-making agents to distill past successes and failures for similar reasoning in analogous future scenarios.
arXiv Detail & Related papers (2025-08-25T00:29:58Z)
Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents [69.58565132975504]
Large language models (LLMs) have demonstrated remarkable capabilities in natural language tasks. We present the Agent Trading Arena, a virtual zero-sum stock market in which LLM-based agents engage in competitive multi-agent trading.
arXiv Detail & Related papers (2025-02-25T08:41:01Z)
TradingAgents: Multi-Agents LLM Financial Trading Framework [4.293484524693143]
TradingAgents proposes a novel stock trading framework inspired by trading firms. It features LLM-powered agents in specialized roles such as fundamental analysts, sentiment analysts, technical analysts, and traders with varied risk profiles. By simulating a dynamic, collaborative trading environment, this framework aims to improve trading performance.
arXiv Detail & Related papers (2024-12-28T12:54:06Z)
When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments [55.19252983108372]
We have developed a multi-agent AI system called StockAgent, driven by LLMs. The StockAgent allows users to evaluate the impact of different external factors on investor trading. It avoids the test set leakage issue present in existing trading simulation systems based on AI Agents.
arXiv Detail & Related papers (2024-07-15T06:49:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.