TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning
- URL: http://arxiv.org/abs/2504.08694v1
- Date: Fri, 11 Apr 2025 17:02:40 GMT
- Title: TP-RAG: Benchmarking Retrieval-Augmented Large Language Model Agents for Spatiotemporal-Aware Travel Planning
- Authors: Hang Ni, Fan Liu, Xinyu Ma, Lixin Su, Shuaiqiang Wang, Dawei Yin, Hui Xiong, Hao Liu,
- Abstract summary: This paper introduces TP-RAG, the first benchmark tailored retrieval-augmentedtemporalRAG-aware travel planning.<n>Our dataset includes 2,348 real-world travel queries, 85,575 fine-grain POIs, 18,784 annotated POIs.
- Score: 39.934634038758404
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have shown promise in automating travel planning, yet they often fall short in addressing nuanced spatiotemporal rationality. While existing benchmarks focus on basic plan validity, they neglect critical aspects such as route efficiency, POI appeal, and real-time adaptability. This paper introduces TP-RAG, the first benchmark tailored for retrieval-augmented, spatiotemporal-aware travel planning. Our dataset includes 2,348 real-world travel queries, 85,575 fine-grain annotated POIs, and 18,784 high-quality travel trajectory references sourced from online tourist documents, enabling dynamic and context-aware planning. Through extensive experiments, we reveal that integrating reference trajectories significantly improves spatial efficiency and POI rationality of the travel plan, while challenges persist in universality and robustness due to conflicting references and noisy data. To address these issues, we propose EvoRAG, an evolutionary framework that potently synergizes diverse retrieved trajectories with LLMs' intrinsic reasoning. EvoRAG achieves state-of-the-art performance, improving spatiotemporal compliance and reducing commonsense violation compared to ground-up and retrieval-augmented baselines. Our work underscores the potential of hybridizing Web knowledge with LLM-driven optimization, paving the way for more reliable and adaptive travel planning agents.
Related papers
- SynthTRIPs: A Knowledge-Grounded Framework for Benchmark Query Generation for Personalized Tourism Recommenders [6.910185679055651]
This paper introduces a novel SynthTRIPs framework for generating synthetic travel queries using Large Language Models (LLMs)
Our approach combines persona-based preferences (e.g., budget, travel style) with explicit sustainability filters (e.g., walkability, air quality) to produce realistic and diverse queries.
While our framework was developed and tested for personalized city trip recommendations, the methodology applies to other recommender system domains.
arXiv Detail & Related papers (2025-04-12T16:48:35Z) - TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning [7.841787597078323]
TripCraft establishes a new benchmark for LLM driven personalized travel planning, offering a more realistic, constraint aware framework for itinerary generation.<n>Our parameter informed setting significantly enhances meal scheduling, improving the Temporal Meal Score from 61% to 80% in a 7 day scenario.
arXiv Detail & Related papers (2025-02-27T20:33:28Z) - A Real-time Spatio-Temporal Trajectory Planner for Autonomous Vehicles with Semantic Graph Optimization [8.221371036055167]
We propose a semantic-temporal trajectory planning method based on graph optimization.<n>It can effectively handle complex urban public road scenarios and perform in real time.<n>We will release our codes to accommodate benchmarking for the research community.
arXiv Detail & Related papers (2025-02-25T12:27:06Z) - DeepRAG: Thinking to Retrieval Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP)<n>By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step.<n> Experiments show that DeepRAG improves retrieval efficiency while improving answer accuracy by 21.99%, demonstrating its effectiveness in optimizing retrieval-augmented reasoning.
arXiv Detail & Related papers (2025-02-03T08:22:45Z) - ChinaTravel: A Real-World Benchmark for Language Agents in Chinese Travel Planning [50.7898120693695]
We introduce ChinaTravel, a benchmark specifically designed for authentic Chinese travel planning scenarios.<n>We collect the travel requirements from questionnaires and propose a compositionally generalizable domain-specific language.<n> Empirical studies reveal the potential of neuro-symbolic agents in travel planning, achieving a constraint satisfaction rate of 27.9%.<n>We identify key challenges in real-world travel planning deployments, including open language reasoning and unseen concept composition.
arXiv Detail & Related papers (2024-12-18T10:10:12Z) - P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task [94.08478298711789]
Embodied Everyday Task is a popular task in the embodied AI community.
Natural language instructions often lack explicit task planning.
Extensive training is required to equip models with knowledge of the task environment.
arXiv Detail & Related papers (2024-09-17T15:29:34Z) - LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning [91.95362946266577]
Path planning is a fundamental scientific problem in robotics and autonomous navigation.<n>Traditional algorithms like A* and its variants are capable of ensuring path validity but suffer from significant computational and memory inefficiencies as the state space grows.<n>We propose a new LLM based route planning method that synergistically combines the precise pathfinding capabilities of A* with the global reasoning capability of LLMs.<n>This hybrid approach aims to enhance pathfinding efficiency in terms of time and space complexity while maintaining the integrity of path validity, especially in large-scale scenarios.
arXiv Detail & Related papers (2024-06-20T01:24:30Z) - FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model.
We use Dual Semantic-relation Reconstruction (DSR) to model latent topics.
We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
arXiv Detail & Related papers (2024-05-28T09:06:38Z) - LLM-SAP: Large Language Models Situational Awareness Based Planning [0.0]
We employ a multi-agent reasoning framework to develop a methodology that anticipates and actively mitigates potential risks.
Our approach diverges from traditional automata theory by incorporating the complexity of human-centric interactions into the planning process.
arXiv Detail & Related papers (2023-12-26T17:19:09Z) - Route to Time and Time to Route: Travel Time Estimation from Sparse
Trajectories [7.602975042011819]
This paper aims to resolve the problem of travel time estimation (TTE) and route recovery in sparse scenarios.
We formulate this problem as an inexact supervision problem in which the training data has coarsely grained labels.
We propose an EM algorithm to alternatively estimate the travel time of inferred route through weak supervision in E step and retrieve the route based on estimated travel time in M step for sparse trajectories.
arXiv Detail & Related papers (2022-06-21T14:16:58Z) - Adversarial Neural Trip Recommendation [35.70265509185104]
We propose an Adversarial Neural Trip Recommendation framework to tackle the above challenges.
First of all, we devise a novel attention-based encoder-decoder trip generator that can learn the correlations among POIs.
Another novelty of ANT relies on an adversarial learning strategy integrating with reinforcement learning to guide the trip generator to produce high-quality trips.
arXiv Detail & Related papers (2021-09-24T03:57:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.