Fugu-MT paper translation (abstract): TRACE: Tourism Recommendation with Accountable Citation Evidence

Paper abstract: TRACE: Tourism Recommendation with Accountable Citation Evidence

arxiv url: http://arxiv.org/abs/2605.07677v1
Date: Fri, 08 May 2026 12:47:39 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.049069
Title: TRACE: Tourism Recommendation with Accountable Citation Evidence
Authors: Zixu Zhao, Sijin Wang, Yu Hou, Yuanyuan Xu, Yufan Sheng, Xike Xie, Wenjie Zhang, Won-Yong Shin, Xin Cao,
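Abstract summary: In TRACE, each item is a multi-turn tourism recommendation dialogue with review-span citations and explicit rejection turns. The benchmark contains 10,000 dialogues over 2,400 POIs and 34,208 reviews across eight U.S. cities, paired with 14 retrieval, planning, and LLM baselines. LLM Zero-Shot leads in closed-set Recall@1 and rejection recovery but cites less densely than retrievers.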
Reference score (independently computed attention score): 29.826237475668805
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Tourism is a high-stakes setting for conversational recommender systems (CRS): a plausible-sounding suggestion can waste real money and trip time once a traveler acts on it. Existing CRS benchmarks primarily evaluate systems with a single Recall@k score over entity mentions, and tourism-specific resources add spatial or knowledge-graph context, yet none of them couple multi-turn recommendation with verbatim review-span evidence and rejection recovery. This leaves an evaluation gap for tourism recommendation that is simultaneously trustworthy, verifiable, and adaptive: recommend the right point of interest (POI) for multi-aspect preferences (such as cuisine, price, atmosphere, walking distance), justify each suggestion with verifiable evidence from prior visitors so the traveler can act without trial and error, and recover when the first recommendation is rejected mid-dialogue. We introduce TRACE, where each item is a multi-turn tourism recommendation dialogue with review-span citations and explicit rejection turns: 10,000 dialogues over 2,400 Yelp POIs and 34,208 reviews across eight U.S. cities, paired with 14 retrieval, planning, and LLM baselines, along with 25 metrics organized under Accuracy, Grounding, and Recovery. Across these baselines, TRACE reveals the Three-Competency Gap: LLM Zero-Shot leads in closed-set Recall@1 and rejection recovery but cites less densely than retrievers; non-LLM retrievers achieve surface-verbatim grounding but with low accuracy; Multi-Review Synthesis fails at recovery. The Grounding Score agrees with human citation precision (Spearman rho=+0.80, p<10^-20), and paired t-tests reproduce the per-baseline ranking (p<0.01 on the dominant contrasts). TRACE reframes accountable tourism recommendation as a joint target (right POI, verifiable evidence, adaptive repair) rather than a single-axis leaderboard.
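The abstract reports Recall@1 without defining it on this page. As a rough illustration only, the sketch below computes the common single-target form of Recall@k (the fraction of dialogues whose target POI appears among the top k ranked candidates). The function name, the POI identifiers, and the one-target-per-dialogue assumption are all hypothetical; TRACE's actual closed-set protocol may differ.

```python
from typing import Sequence


def recall_at_k(ranked: Sequence[Sequence[str]], targets: Sequence[str], k: int) -> float:
    """Fraction of dialogues whose target POI is among the top-k candidates.

    Assumes exactly one ground-truth POI per dialogue (an assumption made
    for this sketch, not something stated in the abstract).
    """
    if len(ranked) != len(targets):
        raise ValueError("ranked and targets must have the same length")
    if k < 1:
        raise ValueError("k must be at least 1")
    if not targets:
        return 0.0
    hits = sum(1 for cands, target in zip(ranked, targets) if target in cands[:k])
    return hits / len(targets)


# Toy data with made-up POI identifiers: three dialogues, three candidates each.
ranked = [
    ["poi_a", "poi_b", "poi_c"],  # target ranked first
    ["poi_d", "poi_e", "poi_f"],  # target ranked second
    ["poi_g", "poi_h", "poi_i"],  # target not retrieved
]
targets = ["poi_a", "poi_e", "poi_z"]

r1 = recall_at_k(ranked, targets, k=1)
r3 = recall_at_k(ranked, targets, k=3)
print(f"Recall@1 = {r1:.3f}, Recall@3 = {r3:.3f}")
```

On this toy data only the first dialogue counts as a hit at k=1, while the first two count at k=3. A single accuracy number of this kind says nothing about citation quality or rejection recovery, which is the gap the abstract describes.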


Related papers