Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation
- URL: http://arxiv.org/abs/2602.14083v1
- Date: Sun, 15 Feb 2026 10:24:45 GMT
- Title: Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation
- Authors: Weiming Zhang, Jihong Wang, Jiamu Zhou, Qingyao Li, Xinbei Ma, Congmin Zheng, Xingyu Lou, Weiwen Liu, Zhuosheng Zhang, Jun Wang, Yong Yu, Weinan Zhang,
- Abstract summary: Plan-MCTS is a framework that reformulates web navigation by shifting exploration to a semantic Plan Space.<n>Plan-MCTS achieves state-of-the-art performance, surpassing current approaches with higher task effectiveness and search efficiency.
- Score: 50.406803870992974
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have empowered autonomous agents to handle complex web navigation tasks. While recent studies integrate tree search to enhance long-horizon reasoning, applying these algorithms in web navigation faces two critical challenges: sparse valid paths that lead to inefficient exploration, and a noisy context that dilutes accurate state perception. To address this, we introduce Plan-MCTS, a framework that reformulates web navigation by shifting exploration to a semantic Plan Space. By decoupling strategic planning from execution grounding, it transforms sparse action space into a Dense Plan Tree for efficient exploration, and distills noisy contexts into an Abstracted Semantic History for precise state awareness. To ensure efficiency and robustness, Plan-MCTS incorporates a Dual-Gating Reward to strictly validate both physical executability and strategic alignment and Structural Refinement for on-policy repair of failed subplans. Extensive experiments on WebArena demonstrate that Plan-MCTS achieves state-of-the-art performance, surpassing current approaches with higher task effectiveness and search efficiency.
Related papers
- Branch-and-Browse: Efficient and Controllable Web Exploration with Tree-Structured Reasoning and Action Memory [69.49061918994882]
Branch-and-Browse is a fine-grained web agent framework that unifies structured reasoning-acting, contextual memory, and efficient execution.<n>On the WebArena benchmark, Branch-and-Browse achieves a task success rate of 35.8% and reduces execution time by up to 40.4% relative to state-of-the-art methods.
arXiv Detail & Related papers (2025-10-18T00:45:37Z) - DeepPlanner: Scaling Planning Capability for Deep Research Agents via Advantage Shaping [74.34061104176554]
We propose DeepPlanner, an end-to-end RL framework that effectively enhances the planning capabilities of deep research agents.<n>Our approach shapes token-level advantage with an entropy-based term to allocate larger updates to high entropy tokens, and selectively upweights sample-level advantages for planning-intensive rollouts.
arXiv Detail & Related papers (2025-10-14T20:47:05Z) - HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking [109.09735490692202]
We propose HyperTree Planning (HTP), a novel reasoning paradigm that constructs hypertree-structured planning outlines for effective planning.<n> Experiments demonstrate the effectiveness of HTP, achieving state-of-the-art accuracy on the TravelPlanner benchmark with Gemini-1.5-Pro, resulting in a 3.6 times performance improvement over o1-preview.
arXiv Detail & Related papers (2025-05-05T02:38:58Z) - AI2STOW: End-to-End Deep Reinforcement Learning to Construct Master Stowage Plans under Demand Uncertainty [0.0]
This article proposes AI2STOW, an end-to-end deep reinforcement learning model with feasibility projection and an action mask to create master plans under demand uncertainty.<n>Our experimental results demonstrate that AI2STOW outperforms baseline methods from reinforcement learning and programming in objective performance and computational efficiency.
arXiv Detail & Related papers (2025-04-06T12:45:25Z) - Adaptive Interactive Navigation of Quadruped Robots using Large Language Models [14.14967096139099]
We present a primitive tree for task planning with large language models (LLMs)<n>We adopt reinforcement learning to pre-train a comprehensive skill library containing versatile locomotion and interaction behaviors for motion planning.<n> integrated with the tree structure, the replanning mechanism allows for convenient node addition and pruning.
arXiv Detail & Related papers (2025-03-29T02:17:52Z) - SCoTT: Strategic Chain-of-Thought Tasking for Wireless-Aware Robot Navigation in Digital Twins [78.53885607559958]
We propose SCoTT, a wireless-aware path planning framework.<n>We show that SCoTT achieves path gains within 2% of DP-WA* while consistently generating shorter trajectories.<n>We also show the practical viability of our approach by deploying SCoTT as a ROS node within Gazebo simulations.
arXiv Detail & Related papers (2024-11-27T10:45:49Z) - WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration [42.8636989730348]
Existing LLM-based web agents rely on rigid, expert-designed policies specific to certain states and actions.
Humans excel by exploring unknowns, continuously adapting strategies, and resolving ambiguities through exploration.
We develop WebPilot, a multi-agent system with a dual optimization strategy that improves Monte Carlo Tree Search (MCTS) to better handle complex web environments.
arXiv Detail & Related papers (2024-08-28T17:49:29Z) - Diffusion-Reinforcement Learning Hierarchical Motion Planning in Multi-agent Adversarial Games [6.532258098619471]
We propose a hierarchical architecture that integrates a high-level diffusion model to plan global paths responsive to environment data.<n>We show that our approach outperforms baselines by 77.18% and 47.38% on detection and goal reaching rate.
arXiv Detail & Related papers (2024-03-16T03:53:55Z) - Path Planning based on 2D Object Bounding-box [8.082514573754954]
We present a path planning method that utilizes 2D bounding boxes of objects, developed through imitation learning in urban driving scenarios.
This is achieved by integrating high-definition (HD) map data with images captured by surrounding cameras.
We evaluate our model on the nuPlan planning task and observed that it performs competitively in comparison to existing vision-centric methods.
arXiv Detail & Related papers (2024-02-22T19:34:56Z) - Structurally guided task decomposition in spatial navigation tasks [7.21356271882087]
We extend an existing model of human task decomposition to explain a wide range of simple planning problems.
Our results suggest that our framework can correctly predict the navigation strategies of the majority of the participants in an online experiment.
arXiv Detail & Related papers (2023-10-03T17:27:30Z) - Latent Space Roadmap for Visual Action Planning of Deformable and Rigid
Object Manipulation [74.88956115580388]
Planning is performed in a low-dimensional latent state space that embeds images.
Our framework consists of two main components: a Visual Foresight Module (VFM) that generates a visual plan as a sequence of images, and an Action Proposal Network (APN) that predicts the actions between them.
arXiv Detail & Related papers (2020-03-19T18:43:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.