Fugu-MT Paper Translation (Abstract): PredictionMarketBench: A SWE-bench-Style Framework for Backtesting Trading Agents on Prediction Markets

Paper summary: PredictionMarketBench: A SWE-bench-Style Framework for Backtesting Trading Agents on Prediction Markets

arxiv url: http://arxiv.org/abs/2602.00133v1
Date: Wed, 28 Jan 2026 06:41:12 GMT
Status: Translation complete
Last updated in system: 2026-02-03 19:28:32.968736
Title: PredictionMarketBench: A SWE-bench-Style Framework for Backtesting Trading Agents on Prediction Markets
Title (reference translation): PredictionMarketBench: A SWE-bench-Style Framework for Backtesting Trading Agents on Prediction Markets
Authors: Avi Arora, Ritesh Malpani
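Abstract summary: PredictionMarketBench is a SWE-bench-style benchmark for evaluating algorithmic and LLM-based trading agents on prediction markets. It standardizes (i) episode construction from raw exchange streams (orderbooks, trades, lifecycle, settlement), (ii) an execution-realistic simulator with maker/taker semantics and fee modeling, and (iii) a tool-based agent interface. The authors release four Kalshi-based episodes spanning cryptocurrency, weather, and sports. Baseline results show that naive trading agents can underperform due to transaction costs and settlement losses, while fee-aware algorithmic strategies remain competitive in volatile episodes.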
Reference score (independently computed attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Prediction markets offer a natural testbed for trading agents: contracts have binary payoffs, prices can be interpreted as probabilities, and realized performance depends critically on market microstructure, fees, and settlement risk. We introduce PredictionMarketBench, a SWE-bench-style benchmark for evaluating algorithmic and LLM-based trading agents on prediction markets via deterministic, event-driven replay of historical limit-order-book and trade data. PredictionMarketBench standardizes (i) episode construction from raw exchange streams (orderbooks, trades, lifecycle, settlement), (ii) an execution-realistic simulator with maker/taker semantics and fee modeling, and (iii) a tool-based agent interface that supports both classical strategies and tool-calling LLM agents with reproducible trajectories. We release four Kalshi-based episodes spanning cryptocurrency, weather, and sports. Baseline results show that naive trading agents can underperform due to transaction costs and settlement losses, while fee-aware algorithmic strategies remain competitive in volatile episodes.
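Abstract (reference translation): Prediction markets offer a natural testbed for trading agents: contracts have binary payoffs, prices can be interpreted as probabilities, and performance depends heavily on market microstructure, fees, and settlement risk. We introduce PredictionMarketBench, a SWE-bench-style benchmark for evaluating algorithmic and LLM-based trading agents on prediction markets. PredictionMarketBench standardizes (i) episode construction from raw exchange streams (orderbooks, trades, lifecycle, settlement), (ii) an execution-realistic simulator with maker/taker semantics and fee modeling, and (iii) a tool-based agent interface that supports both classical strategies and LLM agents with reproducible trajectories. We release four Kalshi-based episodes spanning cryptocurrency, weather, and sports. Baseline results show that naive trading agents underperform due to transaction costs and settlement losses, while fee-aware algorithmic strategies remain competitive in volatile episodes.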
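
Illustrative sketch (not from the paper): the abstract describes deterministic, event-driven replay with fee modeling and binary settlement. The Python below shows one way that idea could look in miniature. Every name here (BookEvent, replay, buy_once_below_40) is hypothetical, the flat per-contract taker fee is a simplification rather than the paper's or the exchange's fee model, and the 100-cent YES payoff is an assumed contract convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookEvent:
    """Hypothetical top-of-book snapshot for a YES contract (prices in cents)."""
    ts: int
    best_bid: int
    best_ask: int


def replay(events, decide, taker_fee_cents, settles_yes):
    """Replay events in timestamp order and return final PnL in cents.

    `decide(event, position)` returns how many contracts to buy at the ask.
    Each fill pays the ask plus an assumed flat taker fee per contract.
    """
    cash_cents = 0
    position = 0
    # Sorting by timestamp makes the result independent of input order.
    for ev in sorted(events, key=lambda e: e.ts):
        qty = decide(ev, position)
        if qty > 0:
            cash_cents -= qty * (ev.best_ask + taker_fee_cents)
            position += qty
    # Binary settlement: assumed 100 cents per contract on YES, 0 on NO.
    payoff = 100 if settles_yes else 0
    return cash_cents + position * payoff


def buy_once_below_40(ev, position):
    """Toy strategy: buy a single contract the first time the ask is under 40."""
    return 1 if position == 0 and ev.best_ask < 40 else 0


events = [BookEvent(2, 35, 38), BookEvent(1, 44, 46), BookEvent(3, 30, 33)]
pnl_yes = replay(events, buy_once_below_40, taker_fee_cents=2, settles_yes=True)
pnl_no = replay(events, buy_once_below_40, taker_fee_cents=2, settles_yes=False)
print(pnl_yes, pnl_no)  # 60 -40
```

In this toy run the agent buys one contract at 38 cents plus a 2-cent fee, so it ends at +60 cents if the market settles YES and -40 cents if it settles NO. That is the kind of fee and settlement exposure the abstract says can drag down naive agents.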

Related papers