Fugu-MT Paper Translation (Abstract): Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

Paper abstract: Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

arxiv url: http://arxiv.org/abs/2510.01932v2
Date: Sat, 04 Oct 2025 07:24:46 GMT
Status: Translation complete
System update date: 2025-10-07 12:09:05.153228
Title: Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning
Title (reference translation): Veri-R1: Toward Accurate and Faithful Claim Verification via Online Reinforcement Learning
Authors: Qi He, Cheng Qian, Xiusi Chen, Bingxiang He, Yi R. Fung, Heng Ji
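Abstract summary: Claim verification with large language models (LLMs) has recently attracted attention because of their strong reasoning capabilities and transparent verification processes. The authors introduce Veri-R1, an online reinforcement learning framework that enables an LLM to interact with a search engine and to receive reward signals that explicitly shape its planning, retrieval, and reasoning behaviors. Empirical results show that Veri-R1 improves joint accuracy by up to 30%, doubles the evidence score, and often surpasses larger models.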
Reference score (independently computed attention metric): 53.05161493434908
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Claim verification with large language models (LLMs) has recently attracted growing attention, due to their strong reasoning capabilities and transparent verification processes compared to traditional answer-only judgments. However, existing approaches to online claim verification, which requires iterative evidence retrieval and reasoning, still mainly rely on prompt engineering or pre-designed reasoning workflows, without unified training to improve the necessary skills. Therefore, we introduce Veri-R1, an online reinforcement learning (RL) framework that enables an LLM to interact with a search engine and to receive reward signals that explicitly shape its planning, retrieval, and reasoning behaviors. This dynamic interaction of the LLM with retrieval systems more accurately reflects real-world verification scenarios and fosters comprehensive verification skills. Empirical results show that Veri-R1 improves joint accuracy by up to 30% and doubles the evidence score, often surpassing its larger-scale model counterparts. Ablation studies further reveal the impact of reward components and the link between output logits and label accuracy. Our results highlight the effectiveness of online RL for precise and faithful claim verification, providing an important foundation for future research. We release our code to support community progress in LLM-empowered claim verification.
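Abstract (reference translation): Claim verification with large language models (LLMs) has recently attracted attention because of their strong reasoning capabilities and transparent verification processes compared with traditional answer-only judgments. However, existing approaches to online claim verification, which requires iterative evidence retrieval and reasoning, rely on prompt engineering or pre-designed reasoning workflows, without unified training to improve the necessary skills. The authors therefore introduce Veri-R1, an online reinforcement learning (RL) framework that enables an LLM to interact with a search engine and to receive reward signals that explicitly shape its planning, retrieval, and reasoning behaviors. This dynamic interaction between the LLM and retrieval systems reflects real-world verification scenarios more accurately and fosters comprehensive verification skills. Empirical results show that Veri-R1 improves joint accuracy by up to 30%, doubles the evidence score, and often surpasses larger models. Ablation studies reveal the impact of the reward components and the link between output logits and label accuracy. The results highlight the effectiveness of online RL for precise and faithful claim verification and provide an important foundation for future research. The code is released to support community progress in LLM-empowered claim verification.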

Related papers