Fugu-MT Paper Translation (Abstract): RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

Paper summary: RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

arxiv url: http://arxiv.org/abs/2603.09723v1
Date: Tue, 10 Mar 2026 14:30:55 GMT
Status: Translation complete
Last updated in system: 2026-03-11 15:25:24.395859
Title: RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation
Title (reference translation): RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation
Authors: Sihong Wu, Yiling Ma, Yilun Zhao, Tiansheng Hu, Owen Jiang, Manasi Patwardhan, Arman Cohan
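Abstract summary: Many AI-generated reviews are superficial and insufficiently actionable, leaving authors without concrete, implementable guidance; this is the gap the work addresses. The paper proposes RbtAct, which targets actionable review feedback generation and places existing peer-review rebuttals at the center of learning.
Reference score (independently computed attention measure): 47.274230235946625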
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are increasingly used across the scientific workflow, including to draft peer-review reports. However, many AI-generated reviews are superficial and insufficiently actionable, leaving authors without concrete, implementable guidance and motivating the gap this work addresses. We propose RbtAct, which targets actionable review feedback generation and places existing peer review rebuttal at the center of learning. Rebuttals show which reviewer comments led to concrete revisions or specific plans, and which were only defended. Building on this insight, we leverage rebuttal as implicit supervision to directly optimize a feedback generator for actionability. To support this objective, we propose a new task called perspective-conditioned segment-level review feedback generation, in which the model is required to produce a single focused comment based on the complete paper and a specified perspective such as experiments and writing. We also build a large dataset named RMR-75K that maps review segments to the rebuttal segments that address them, with perspective labels and impact categories that order author uptake. We then train the Llama-3.1-8B-Instruct model with supervised fine-tuning on review segments followed by preference optimization using rebuttal derived pairs. Experiments with human experts and LLM-as-a-judge show consistent gains in actionability and specificity over strong baselines while maintaining grounding and relevance.
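Abstract (reference translation): Large language models (LLMs) are increasingly used throughout the scientific workflow, including for drafting peer-review reports. However, many AI-generated reviews are superficial and insufficiently actionable, leaving authors without concrete, implementable guidance; this is the gap that motivates the work. We propose RbtAct, which targets actionable review feedback generation and places existing peer-review rebuttals at the center of learning. Rebuttals show which reviewer comments led to concrete revisions or specific plans, and which were only defended against. Building on this insight, we use rebuttals as implicit supervision to directly optimize a feedback generator for actionability. To support this objective, we propose a new task called perspective-conditioned segment-level review feedback generation, in which the model must produce a single focused comment given the complete paper and a specified perspective such as experiments or writing. We also build a large dataset named RMR-75K that maps review segments to the rebuttal segments that address them, with perspective labels and impact categories that order author uptake. We then train the Llama-3.1-8B-Instruct model with supervised fine-tuning on review segments, followed by preference optimization using rebuttal-derived pairs. Experiments with human experts and LLM-as-a-judge show consistent gains in actionability and specificity over strong baselines while maintaining grounding and relevance.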
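The abstract says that impact categories order author uptake and that preference optimization uses rebuttal-derived pairs, but it does not say how the pairs are formed. The following is a minimal sketch of one plausible construction, not the authors' method. The three category labels, the `ReviewSegment` fields, the `build_preference_pairs` helper, and the rule of pairing only segments from the same paper and perspective are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import combinations

# Hypothetical impact categories, ranked from least to most author uptake.
# The abstract only states that impact categories "order author uptake";
# these labels are assumptions based on its wording.
IMPACT_RANK = {"defended": 0, "planned": 1, "revised": 2}


@dataclass(frozen=True)
class ReviewSegment:
    paper_id: str
    perspective: str  # e.g. "experiments" or "writing"
    text: str
    impact: str       # one of the keys in IMPACT_RANK


def build_preference_pairs(segments):
    """Pair segments that share a paper and perspective (an assumed rule).

    Within each pair, the segment with higher author uptake becomes
    "chosen" and the other "rejected". Ties carry no signal and are skipped.
    """
    groups = {}
    for seg in segments:
        groups.setdefault((seg.paper_id, seg.perspective), []).append(seg)

    pairs = []
    for (paper_id, perspective), group in groups.items():
        for a, b in combinations(group, 2):
            rank_a, rank_b = IMPACT_RANK[a.impact], IMPACT_RANK[b.impact]
            if rank_a == rank_b:
                continue  # equal uptake: no preference signal
            chosen, rejected = (a, b) if rank_a > rank_b else (b, a)
            pairs.append({
                "paper_id": paper_id,
                "perspective": perspective,
                "chosen": chosen.text,
                "rejected": rejected.text,
            })
    return pairs
```

Records with `chosen` and `rejected` fields match the shape that common preference-optimization trainers expect. The prompt (the complete paper plus the perspective) would still need to be attached to each pair.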

Related papers