Fugu-MT Paper Translation (Abstract): ArgBench: Benchmarking LLMs on Computational Argumentation Tasks

Paper overview: ArgBench: Benchmarking LLMs on Computational Argumentation Tasks

arxiv url: http://arxiv.org/abs/2604.17366v1
Date: Sun, 19 Apr 2026 10:23:41 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.489008
Title: ArgBench: Benchmarking LLMs on Computational Argumentation Tasks
Authors: Yamen Ajjour, Carlotta Quensel, Nedim Lipka, Henning Wachsmuth
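Abstract summary: Argumentation skills are an essential toolkit for large language models (LLMs). The paper creates the first benchmark for a standardized evaluation of LLM-based approaches to computational argumentation.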
Reference score (independently computed attention score): 25.924152913253902
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Argumentation skills are an essential toolkit for large language models (LLMs). These skills are crucial in various use cases, including self-reflection, debating collaboratively for diverse answers, and countering hate speech. In this paper, we create the first benchmark for a standardized evaluation of LLM-based approaches to computational argumentation, encompassing 33 datasets from previous work in unified form. Using the benchmark, we evaluate the generalizability of five LLM families across 46 computational argumentation tasks that cover mining arguments, assessing perspectives, assessing argument quality, reasoning about arguments, and generating arguments. On the benchmark, we conduct an extensive systematic analysis of the contribution of few-shot examples, reasoning steps, model size, and training skills to the performance of LLMs on the computational argumentation tasks in the benchmark.
