TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models
- URL: http://arxiv.org/abs/2602.14089v1
- Date: Sun, 15 Feb 2026 10:39:43 GMT
- Title: TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models
- Authors: Zhizhao Luo, Zhaojing Luo, Meihui Zhang, Rui Mao
- Abstract summary: TabTracer is an agentic framework that coordinates multi-step tool calls over intermediate table states. It enforces step-level verification with typed operations and lightweight numeric and format checks. It reduces redundancy with budget-aware pruning, deduplication, and state hashing with a monotonicity gate to cut token cost.
- Score: 10.584052101655537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have emerged as powerful tools for natural language table reasoning, where existing methods fall into two main categories. Prompt-based approaches rely on language-only inference or one-pass program generation without step-level verification. Agent-based approaches use tools in a closed loop, but verification is often local and backtracking is limited, allowing errors to propagate and increasing cost. Moreover, they rely on chain- or beam-style trajectories that are typically combinatorially redundant, leading to high token costs. In this paper, we propose TabTracer, an agentic framework that coordinates multi-step tool calls over intermediate table states, with explicit state tracking for verification and rollback. First, it enforces step-level verification with typed operations and lightweight numeric and format checks to provide reliable rewards and suppress hallucinations. Second, execution-feedback Monte Carlo Tree Search maintains a search tree of candidate table states and uses backpropagated reflection scores to guide UCB1 selection and rollback via versioned snapshots. Third, it reduces redundancy with budget-aware pruning, deduplication, and state hashing with a monotonicity gate to cut token cost. Comprehensive evaluation on TabFact, WikiTQ, and CRT datasets shows that TabTracer outperforms state-of-the-art baselines by up to 6.7% in accuracy while reducing token consumption by 59-84%.
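The abstract's core loop — UCB1 selection over a tree of table states, backpropagated reflection scores, and state hashing for deduplication — can be sketched as follows. This is a minimal illustration of the general MCTS mechanics named in the abstract, not the TabTracer implementation; the names `TableState`, `ucb1_select`, and `backpropagate` are hypothetical.

```python
# Hypothetical sketch of UCB1 selection, reward backpropagation, and
# state-hash deduplication over a tree of intermediate table states.
import hashlib
import math
from dataclasses import dataclass, field

@dataclass
class TableState:
    """A node in the search tree, holding a versioned table snapshot."""
    rows: tuple                      # immutable snapshot of the table rows
    visits: int = 0
    total_reward: float = 0.0        # accumulated backpropagated scores
    children: list = field(default_factory=list)

    def state_hash(self) -> str:
        # Hash the snapshot so duplicate states can be detected and pruned.
        return hashlib.sha256(repr(self.rows).encode()).hexdigest()

def ucb1_select(parent: TableState, c: float = 1.41) -> TableState:
    """Pick the child maximizing mean reward plus an exploration bonus."""
    def score(child: TableState) -> float:
        if child.visits == 0:
            return float("inf")      # always expand unvisited states first
        exploit = child.total_reward / child.visits
        explore = c * math.sqrt(math.log(parent.visits) / child.visits)
        return exploit + explore
    return max(parent.children, key=score)

def backpropagate(path: list, reward: float) -> None:
    """Propagate a step-level verification reward up the selected path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward
```

Because each node stores an immutable snapshot, rollback amounts to resuming the search from an earlier node, and identical snapshots reached by different operation sequences collapse to one hash.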
Related papers
- TabAgent: A Framework for Replacing Agentic Generative Components with Tabular-Textual Classifiers [5.792704492773729]
TabAgent is a framework for replacing generative decision components in closed-set selection tasks with a compact textual-tabular classifier trained on execution traces. On the long-horizon AppWorld benchmark, TabAgent maintains task-level success while eliminating shortlist-time LLM calls, reducing latency by approximately 95% and inference cost by 85-91%.
arXiv Detail & Related papers (2026-02-18T13:01:17Z) - Col-Bandit: Zero-Shot Query-Time Pruning for Late-Interaction Retrieval [2.159285655678094]
Col-Bandit is a query-time pruning algorithm that reduces the computational burden of reranking by casting it as a finite-population Top-K identification problem. Unlike coarse-grained approaches that prune entire documents or tokens offline, Col-Bandit sparsifies the interaction matrix on the fly. Experiments show that Col-Bandit preserves ranking fidelity while reducing MaxSim FLOPs by up to 5×.
arXiv Detail & Related papers (2026-02-02T21:27:01Z) - Reasoning by Commented Code for Table Question Answering [2.497926557563177]
Table Question Answering (TableQA) poses a significant challenge for large language models. Existing methods, which depend on end-to-end answer generation or single-line program queries, exhibit limited numerical accuracy and reduced interpretability. This work introduces a commented, step-by-step code-generation framework that incorporates explicit reasoning into the Python program-generation process.
arXiv Detail & Related papers (2026-01-31T06:16:35Z) - TreePS-RAG: Tree-based Process Supervision for Reinforcement Learning in Agentic RAG [71.06073770344732]
Agentic retrieval-augmented generation (RAG) formulates question answering as a multi-step interaction between reasoning and information retrieval. We present TreePS-RAG, an online, tree-based RL framework for agentic RAG that enables step-wise credit assignment while retaining outcome-only rewards.
arXiv Detail & Related papers (2026-01-11T14:07:30Z) - Rethinking Table Pruning in TableQA: From Sequential Revisions to Gold Trajectory-Supervised Parallel Search [22.58777921256103]
Table Question Answering (TableQA) benefits significantly from table pruning. Existing table pruning methods rely on sequential revisions driven by unreliable critique signals. We propose TabTrim, a novel table pruning framework that transforms table pruning from sequential revisions into gold trajectory-supervised parallel search.
arXiv Detail & Related papers (2026-01-07T12:08:59Z) - TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework [62.66056331998838]
TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps. Our reward function evaluates knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps.
arXiv Detail & Related papers (2025-11-07T16:08:34Z) - TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning [77.01182934427095]
TaTToo is a novel table-grounded PRM framework that integrates tool-based verification to provide precise reward supervision. We train TaTToo with a dual-stage paradigm: cold-start supervised fine-tuning to capture tool-use reasoning patterns, followed by reinforcement learning to align our model with table-based verification.
arXiv Detail & Related papers (2025-10-07T17:59:41Z) - GraphRunner: A Multi-Stage Framework for Efficient and Accurate Graph-Based Retrieval [3.792463570467098]
GraphRunner is a novel graph-based retrieval framework that operates in three distinct stages: planning, verification, and execution. It significantly reduces reasoning errors and detects hallucinations before execution. Our evaluation using the GRBench dataset shows that GraphRunner consistently outperforms existing approaches.
arXiv Detail & Related papers (2025-07-11T18:10:01Z) - Multimodal Tabular Reasoning with Privileged Structured Information [67.40011423365712]
We introduce TabUlar Reasoning with Bridged infOrmation (Turbo). Turbo benefits from a structure-aware reasoning trace generator based on DeepSeek-R1. Turbo achieves state-of-the-art performance (+7.2% vs. previous SOTA) across multiple datasets.
arXiv Detail & Related papers (2025-06-04T15:46:30Z) - From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval [22.35942074715463]
Chain-of-Thought (CoT) prompting enables complex reasoning in large language models (LLMs). We propose State Machine Reasoning (SMR), a transition-based reasoning framework composed of discrete actions. Experiments on the BEIR and BRIGHT benchmarks show that SMR improves retrieval performance (nDCG@10) by 3.4% while reducing token usage by 74.4%.
arXiv Detail & Related papers (2025-05-29T04:04:25Z) - T^2Agent A Tool-augmented Multimodal Misinformation Detection Agent with Monte Carlo Tree Search [51.91311158085973]
Multimodal misinformation often arises from mixed forgery sources, requiring dynamic reasoning and adaptive verification. We propose T2Agent, a novel misinformation detection agent that incorporates a toolkit with Monte Carlo Tree Search. Extensive experiments show that T2Agent consistently outperforms existing baselines on challenging mixed-source multimodal misinformation benchmarks.
arXiv Detail & Related papers (2025-05-26T09:50:55Z) - Multilingual Autoregressive Entity Linking [49.35994386221958]
mGENRE is a sequence-to-sequence system for the Multilingual Entity Linking problem.
For a mention in a given language, mGENRE predicts the name of the target entity left-to-right, token-by-token.
We show the efficacy of our approach through extensive evaluation including experiments on three popular MEL benchmarks.
arXiv Detail & Related papers (2021-03-23T13:25:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site. This site does not guarantee the quality of the information and is not responsible for any consequences of its use.