Fugu-MT Paper Translation (Abstract): LeWiDi-2025 at NLPerspectives: The Third Edition of the Learning with Disagreements Shared Task

Paper overview: LeWiDi-2025 at NLPerspectives: The Third Edition of the Learning with Disagreements Shared Task

arxiv url: http://arxiv.org/abs/2510.08460v1
Date: Thu, 09 Oct 2025 17:04:28 GMT
Status: Translation complete
Last updated in system: 2025-10-10 17:54:15.224751
Title: LeWiDi-2025 at NLPerspectives: The Third Edition of the Learning with Disagreements Shared Task
Authors: Elisa Leonardelli, Silvia Casola, Siyao Peng, Giulia Rizzi, Valerio Basile, Elisabetta Fersini, Diego Frassinelli, Hyewon Jang, Maja Pavlovic, Barbara Plank, Massimo Poesio
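Abstract summary: The LEWIDI series of shared tasks on Learning With Disagreements was established to promote this approach to training and evaluating AI models. The third edition of the task builds on this goal by extending the LEWIDI benchmark to four datasets spanning paraphrase identification, irony detection, sarcasm detection, and natural language inference.
Reference score (independently computed attention score): 38.500623751317896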
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Many researchers have reached the conclusion that AI models should be trained to be aware of the possibility of variation and disagreement in human judgments, and evaluated as per their ability to recognize such variation. The LEWIDI series of shared tasks on Learning With Disagreements was established to promote this approach to training and evaluating AI models, by making suitable datasets more accessible and by developing evaluation methods. The third edition of the task builds on this goal by extending the LEWIDI benchmark to four datasets spanning paraphrase identification, irony detection, sarcasm detection, and natural language inference, with labeling schemes that include not only categorical judgments as in previous editions, but ordinal judgments as well. Another novelty is that we adopt two complementary paradigms to evaluate disagreement-aware systems: the soft-label approach, in which models predict population-level distributions of judgments, and the perspectivist approach, in which models predict the interpretations of individual annotators. Crucially, we moved beyond standard metrics such as cross-entropy, and tested new evaluation metrics for the two paradigms. The task attracted diverse participation, and the results provide insights into the strengths and limitations of methods to modeling variation. Together, these contributions strengthen LEWIDI as a framework and provide new resources, benchmarks, and findings to support the development of disagreement-aware technologies.
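Illustrative sketch (not from the paper): the two evaluation paradigms named in the abstract can be made concrete with a toy example. The code below assumes a single item annotated by four hypothetical annotators with a binary irony label. It builds a soft label (the population-level distribution of judgments), scores a predicted distribution with cross-entropy, which the abstract names only as a standard metric the task moved beyond, and scores per-annotator predictions with a simple match rate. The function names, the annotator IDs, the data, and the match-rate measure are all assumptions made for illustration; the task's own metrics are not specified in this abstract.

```python
import math
from collections import Counter


def soft_label(annotations, classes):
    """Turn a list of individual labels into a distribution over classes."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[c] / total for c in classes]


def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))


def annotator_match_rate(gold_by_annotator, predicted_by_annotator):
    """Fraction of annotators whose individual label was predicted correctly.

    Hypothetical stand-in for a perspectivist metric, not the task's metric.
    """
    matches = sum(
        predicted_by_annotator.get(annotator) == label
        for annotator, label in gold_by_annotator.items()
    )
    return matches / len(gold_by_annotator)


# Toy data (invented): four annotators judge one item.
classes = ["ironic", "not_ironic"]
gold = {"ann1": "ironic", "ann2": "ironic", "ann3": "not_ironic", "ann4": "ironic"}

# Soft-label paradigm: compare distributions.
target = soft_label(list(gold.values()), classes)
model_distribution = [0.6, 0.4]
soft_score = cross_entropy(target, model_distribution)

# Perspectivist paradigm: compare per-annotator predictions.
model_per_annotator = {a: "ironic" for a in gold}
perspectivist_score = annotator_match_rate(gold, model_per_annotator)

print(target, round(soft_score, 4), perspectivist_score)
```

The point of the contrast is that a model can match the label distribution well while still failing to say which annotator gave which label, so the two paradigms measure different things.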


Related papers