Fugu-MT Paper Translation (Abstract): Summarization is Not Dead Yet

Paper overview: Summarization is Not Dead Yet

arxiv url: http://arxiv.org/abs/2606.08000v1
Date: Sat, 06 Jun 2026 06:38:35 GMT
Status: Translation complete
Last updated in system: 2026-06-09 14:42:05.609859
Title: Summarization is Not Dead Yet
Title (reference translation): Summarization Is Not Over Yet
Authors: Dongqi Liu, Chenxi Whitehouse, Zheng Zhao, Zhuchen Cao, Jian Li, Yabiao Wang,
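Abstract summary: Advances in large language models (LLMs) have fueled claims that model-generated summaries rival or even surpass human-written references. We re-examine this narrative through a multi-track evaluation covering five diverse datasets and five state-of-the-art LLMs. Our findings reveal a more nuanced landscape in which human reference summaries continue to demonstrate advantages in informativeness and faithfulness.
Reference score (independently computed attention measure): 28.302567995407532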
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The progress of large language models (LLMs) has fueled claims that model-generated summaries rival or even surpass human-written references, raising questions about whether summarization remains an open research problem. We re-examine this narrative through a multi-track evaluation covering five diverse datasets and five state-of-the-art LLMs, combining controlled human assessment, bias-mitigated LLM-as-Judge protocols, factuality verification against external knowledge, and corpus-level linguistic analysis. Our findings reveal a more nuanced landscape in which human reference summaries continue to demonstrate advantages in informativeness and faithfulness, whereas LLM outputs are preferred mainly for surface-level coherence and fluency. Factuality verification indicates that human references remain more reliable, particularly for claims involving reasoning or synthesis, and linguistic analysis uncovers a pattern of stylistic homogeneity across different models. These observations suggest that current LLMs have raised the floor of summarization quality, but the ceiling of their performance remains below human capabilities.
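Abstract (reference translation): Advances in large language models (LLMs) have accelerated claims that model-generated summaries rival or even surpass human-written references, raising the question of whether summarization remains an open research problem. We re-examine this narrative through a multi-track evaluation covering five diverse datasets and five state-of-the-art LLMs, combining controlled human assessment, bias-mitigated LLM-as-Judge protocols, factuality verification against external knowledge, and corpus-level linguistic analysis. Our findings show that LLM outputs are preferred mainly for surface-level coherence and fluency, whereas human reference summaries continue to demonstrate advantages in informativeness and faithfulness. Factuality verification indicates that human references remain more reliable, particularly for claims involving reasoning or synthesis, and linguistic analysis reveals a pattern of stylistic homogeneity across different models. These observations suggest that current LLMs have raised the floor of summarization quality, but the ceiling of their performance remains below human capabilities.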

Related papers