Fugu-MT Paper Translation (Summary): Bridging the Cold-Start Gap: LLM-Powered Synthetic Data Generation for Natural Language Search at Airbnb

Paper summary: Bridging the Cold-Start Gap: LLM-Powered Synthetic Data Generation for Natural Language Search at Airbnb

arxiv url: http://arxiv.org/abs/2605.21812v1
Date: Wed, 20 May 2026 23:18:49 GMT
Status: Translation complete
System update date: 2026-05-22 16:35:42.024753
Title: Bridging the Cold-Start Gap: LLM-Powered Synthetic Data Generation for Natural Language Search at Airbnb
Title (reference translation): Bridging the Cold-Start Gap: Synthetic Data Generation with LLMs for Natural Language Search at Airbnb
Authors: Wendy Ran Wei, Hao Li, Weiwei Guo, Xiaowei Liu, Xueyin Chen, Dillon Davis, Malay Haldar, Soumyadip Banerjee, Kedar Bellare, Huiji Gao, Stephanie Moyerman, Sanjeev Katariya
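Abstract summary: The paper presents a framework for generating synthetic queries and labels using large language models (LLMs). For query generation, it combines contrastive listing pairs from booking sessions with seed queries from user research to balance realism and diversity. For label generation, it introduces contrastive generation, which produces topicality labels by construction, and Virtual Judge (VJ) labeling for broader coverage.
Reference score (attention score computed independently by the site): 13.678824332628311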
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deploying natural language search systems presents a critical cold-start challenge: no real user queries to learn linguistic patterns, and no relevance labels to train ranking models. We present a framework for generating synthetic queries and labels using large language models (LLMs), powering model training and evaluation for Airbnb's natural language search. For query generation, we combine contrastive listing pairs from booking sessions with seed queries from user research to balance realism and diversity, enabling a cold-to-warm start transition as real user data becomes available. For label generation, we introduce contrastive generation that produces topicality labels by construction, and Virtual Judge (VJ) labeling for broader coverage. We compare our approach against a no-seed contrastive baseline and an InPars-style baseline. For query length, the InPars baseline produces verbose queries with KL divergence of 12.03 vs. real users; our seed-guided approach achieves 0.66, a 7.5x improvement. For attribute type distributions, our approach achieves the lowest KL divergence (0.04), outperforming even seed queries (0.09). Experiments show our approach produces harder evaluation examples than the no-seed baseline (79% vs. 97% pairwise accuracy), providing discriminative signal for model improvement. We deploy production pipelines generating synthetic examples daily for embedding-based retrieval and ranking evaluation.
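Abstract (reference translation): Deploying a natural language search system poses a critical cold-start challenge: there are no real user queries from which to learn linguistic patterns, and no relevance labels with which to train ranking models. This paper proposes a framework that uses large language models (LLMs) to generate synthetic queries and labels for model training and evaluation in Airbnb's natural language search. For query generation, contrastive listing pairs from booking sessions are combined with seed queries from user research to balance realism and diversity, which enables a cold-to-warm start transition as real user data becomes available. For label generation, the paper introduces contrastive generation, which produces topicality labels by construction, and Virtual Judge (VJ) labeling for broader coverage. The approach is compared against a no-seed contrastive baseline and an InPars-style baseline. For query length, the InPars baseline produces verbose queries with a KL divergence of 12.03 relative to real users, whereas the seed-guided approach achieves 0.66. For attribute type distributions, the approach achieves the lowest KL divergence (0.04), better than even the seed queries (0.09). Experiments show that the approach yields harder evaluation examples than the no-seed baseline (79% vs. 97% pairwise accuracy), providing a discriminative signal for model improvement. Production pipelines are deployed that generate synthetic examples daily for embedding-based retrieval and ranking evaluation.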
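
The abstract reports two kinds of numbers: KL divergence between synthetic and real query distributions, and pairwise accuracy on evaluation examples. The sketch below shows one way such metrics could be computed; it is not the authors' implementation. The function names, whitespace tokenization, the `max_len` cap, the smoothing constant `eps`, and the KL direction (first argument as the reference distribution) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def length_distribution(queries, max_len=30):
    """Histogram of query lengths in whitespace tokens.

    Lengths above max_len are folded into the last bin (an assumption).
    """
    counts = Counter(min(len(q.split()), max_len) for q in queries)
    return [counts.get(i, 0) for i in range(max_len + 1)]


def kl_divergence(p_counts, q_counts, eps=1e-6):
    """KL(P || Q) in nats between two histograms over the same bins.

    Additive smoothing with eps avoids division by zero for empty bins;
    the value of eps is an illustrative choice.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must have the same number of bins")
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl


def pairwise_accuracy(score_pairs):
    """Fraction of (positive_score, negative_score) pairs ranked correctly.

    A pair counts as correct when the positive item scores strictly higher.
    """
    if not score_pairs:
        raise ValueError("score_pairs must not be empty")
    correct = sum(1 for pos, neg in score_pairs if pos > neg)
    return correct / len(score_pairs)


if __name__ == "__main__":
    # Toy, made-up queries purely to exercise the functions.
    real = ["beach house with pool", "cabin near lake", "pet friendly loft"]
    synthetic = ["a spacious beach house that has a private pool and a view"]
    p = length_distribution(real)
    q = length_distribution(synthetic)
    print("KL(real || synthetic):", kl_divergence(p, q))
    print("pairwise accuracy:", pairwise_accuracy([(0.9, 0.1), (0.2, 0.8)]))
```

Under these assumptions, a lower KL value means the synthetic length histogram is closer to the real one, and a lower pairwise accuracy on a fixed model means the evaluation pairs are harder for that model to separate.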
