Fugu-MT Paper Translation (Abstract): Impact of enriched meaning representations for language generation in dialogue tasks: A comprehensive exploration of the relevance of tasks, corpora and metrics

Paper summary: Impact of enriched meaning representations for language generation in dialogue tasks: A comprehensive exploration of the relevance of tasks, corpora and metrics

arxiv url: http://arxiv.org/abs/2603.29518v1
Date: Tue, 31 Mar 2026 10:03:56 GMT
Status: Translation complete
Last updated in system: 2026-04-01 15:25:03.477058
Title: Impact of enriched meaning representations for language generation in dialogue tasks: A comprehensive exploration of the relevance of tasks, corpora and metrics
Authors: Alain Vázquez, Maria Inés Torres
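Abstract summary: This study presents a comparative analysis of how meaning representations affect generation quality across domains, corpus characteristics, and the metrics used to evaluate the generated sentences. The proposed enriched inputs are effective for complex tasks and for small datasets with high variability in MRs and sentences.
Reference score (attention score computed independently by this site): 0.0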
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Conversational systems should generate diverse language forms to interact fluently and accurately with users. In this context, Natural Language Generation (NLG) engines convert Meaning Representations (MRs) into sentences, directly influencing user perception. These MRs usually encode the communicative function (e.g., inform, request, confirm) via DAs and enumerate the semantic content with slot-value pairs. In this work, our objective is to analyse whether providing a task demonstrator to the generator enhances the generations of a fine-tuned model. This demonstrator is an MR-sentence pair extracted from the original dataset that enriches the input at training and inference time. The analysis involves five metrics that focus on different linguistic aspects, and four datasets that differ in multiple features, such as domain, size, lexicon, MR variability, and acquisition process. To the best of our knowledge, this is the first study on dialogue NLG implementing a comparative analysis of the impact of MRs on generation quality across domains, corpus characteristics, and the metrics used to evaluate these generations. Our key insight is that the proposed enriched inputs are effective for complex tasks and small datasets with high variability in MRs and sentences. They are also beneficial in zero-shot settings for any domain. Moreover, the analysis of the metrics shows that semantic metrics capture generation quality more accurately than lexical metrics. In addition, among these semantic metrics, those trained with human ratings can detect omissions and other subtle semantic issues that embedding-based metrics often miss. Finally, the evolution of the metric scores and the excellent results for Slot Accuracy and Dialogue Act Accuracy demonstrate that the generative models present fast adaptability to different tasks and robustness at semantic and communicative intention levels.
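
The abstract describes an MR as a dialogue act plus slot-value pairs, and an enriched input as that MR accompanied by a demonstrator, that is, an MR-sentence pair taken from the same dataset. The following is a minimal sketch of that idea in Python. The linearization format, the field separators, and the names `MeaningRepresentation`, `linearize`, `enriched_input`, and `naive_slot_accuracy` are all illustrative assumptions and are not taken from the paper. The slot check in particular is a simple string match, not necessarily how the authors compute Slot Accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeaningRepresentation:
    """Hypothetical MR container: a dialogue act plus slot-value pairs."""
    dialogue_act: str   # communicative function, e.g. "inform" or "request"
    slots: tuple        # tuple of (slot, value) string pairs


def linearize(mr):
    """Flatten an MR into one string (the format is an assumption)."""
    pairs = ", ".join(f"{slot}={value}" for slot, value in mr.slots)
    return f"{mr.dialogue_act}({pairs})"


def enriched_input(target, demo_mr, demo_sentence):
    """Prepend a demonstrator MR-sentence pair to the target MR.

    The same layout would be used at training and inference time.
    """
    return (
        f"example MR: {linearize(demo_mr)} | "
        f"example sentence: {demo_sentence} | "
        f"MR: {linearize(target)}"
    )


def naive_slot_accuracy(mr, sentence):
    """Fraction of slot values that appear verbatim in the sentence.

    A rough string-match stand-in, labeled as an assumption.
    """
    if not mr.slots:
        return 1.0
    text = sentence.lower()
    hits = sum(value.lower() in text for _, value in mr.slots)
    return hits / len(mr.slots)
```

A generator fine-tuned on strings built by `enriched_input` would see one worked example of the task alongside each target MR, which is the intuition behind the demonstrator described in the abstract.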

Related papers