Fugu-MT Paper Translation (Abstract): STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media

Paper overview: STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media

arxiv url: http://arxiv.org/abs/2605.25162v1
Date: Sun, 24 May 2026 16:44:15 GMT
Status: Translation complete
Last updated in system: 2026-05-26 19:50:18.925526
Title: STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media
Title (reference translation): STREAM: A Data-Centric Framework for Mining High-Value Task-Oriented Dialogues from Streaming Media
Authors: Liang Xue, Haoyu Liu, Cheng Wang, Pengyu Chen, Haozhuo Zheng, Yang Liu,
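Abstract summary: We propose Stream, a data-centric framework that synthesizes high-value service dialogues at scale. Stream mines authentic interaction signals from noisy streams and synthesizes conversations by integrating role-grounded persona construction with Conversational Blueprint construction. Based on Stream, we release StreamDial, a large-scale dataset covering Automotive, Restaurant, and Hotel.
Reference score (independently computed attention score): 20.15263583458415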
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models for vertical domains are bottlenecked by the scarcity of complex, domain-specific task-oriented dialogues. Existing data acquisition pipelines face a persistent trilemma: expert annotation is expensive, real-world service conversations are constrained by privacy and commercial restrictions, and static corpora quickly become temporally stale. We propose Stream, a data-centric framework that leverages publicly available streaming media (live streams and short videos) to synthesize high-value service dialogues at scale. Stream mines authentic interaction signals from noisy streams and synthesizes conversations by integrating role-grounded persona construction with Conversational Blueprint construction; it further adopts retrieval-augmented generation (RAG) to support knowledge-aware responses. Based on Stream, we release StreamDial, a large-scale multi-domain dataset covering Automotive, Restaurant, and Hotel. StreamDial contains 87,498 dialogue sessions and 1,497,320 turns in total, with an average of 17.11 turns per session and a comparable scale across domains. Each session is organized as a structured quadruplet $\langle P_u, P_a, B, H \rangle$ that pairs dialogue history with explicit user/agent personas and a Conversational Blueprint, capturing realistic service behaviors such as requirement mining, constraint conflicts, negotiation, and recovery. Evaluations with automatic judges and downstream tasks show that StreamDial improves intrinsic dialogue quality over strong baselines, and models trained with StreamDial improve Dialogue State Tracking across backbones; we further report a completed human-evaluation set and encouraging multilingual transfer on Qwen3-8B under a controlled training budget. The data is released in https://github.com/hitxueliang/DialogDataSetBySTREAM.
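Abstract (reference translation): Large language models for vertical domains are bottlenecked by a shortage of complex, domain-specific task-oriented dialogues. Expert annotation is expensive, real-world service conversations are restricted by privacy and commercial constraints, and static corpora quickly go stale. We propose Stream, a data-centric framework that uses publicly available streaming media (live streams and short videos) to synthesize high-value service dialogues at scale. Stream mines authentic interaction signals from noisy streams and synthesizes conversations by integrating role-grounded persona construction with Conversational Blueprint construction. Based on Stream, we release StreamDial, a large-scale multi-domain dataset covering Automotive, Restaurant, and Hotel. StreamDial contains 87,498 dialogue sessions and 1,497,320 turns, averaging 17.11 turns per session, at a comparable scale across domains. Each session is organized as a structured quadruplet $\langle P_u, P_a, B, H \rangle$ that pairs the dialogue history with explicit user/agent personas and a Conversational Blueprint, capturing realistic service behaviors such as requirement mining, constraint conflicts, negotiation, and recovery. Evaluations with automatic judges and downstream tasks show that StreamDial improves intrinsic dialogue quality over strong baselines and that models trained with StreamDial improve Dialogue State Tracking across backbones; we further report a completed human-evaluation set and encouraging multilingual transfer on Qwen3-8B. The data is available at https://github.com/hitxueliang/DialogDataSetBySTREAM.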
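
As a rough illustration of the session structure described in the abstract, the quadruplet $\langle P_u, P_a, B, H \rangle$ might be modeled as below. This is only a sketch: the class names, field names, and the choice to represent the blueprint as an ordered list of stage labels are assumptions made for illustration, not the schema of the released StreamDial data. The last lines check that the reported totals are consistent with the reported average of 17.11 turns per session.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Persona:
    """Hypothetical persona record; field names are illustrative assumptions."""
    role: str
    traits: list[str] = field(default_factory=list)


@dataclass
class Session:
    """One session as the quadruplet <P_u, P_a, B, H> from the abstract."""
    user_persona: Persona            # P_u
    agent_persona: Persona           # P_a
    blueprint: list[str]             # B: assumed to be ordered stage labels
    history: list[tuple[str, str]]   # H: (speaker, utterance) pairs


def average_turns(sessions: list[Session]) -> float:
    """Mean number of turns per session; 0.0 for an empty collection."""
    if not sessions:
        return 0.0
    return sum(len(s.history) for s in sessions) / len(sessions)


# Totals as reported in the abstract.
REPORTED_SESSIONS = 87_498
REPORTED_TURNS = 1_497_320

# Average turns per session implied by the reported totals, to two decimals.
reported_average = round(REPORTED_TURNS / REPORTED_SESSIONS, 2)
```

Keeping the personas and blueprint as separate fields next to the history mirrors the abstract's point that each dialogue is paired with explicit conditioning information rather than stored as bare text.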


Related papers