Fugu-MT Paper Translation (Abstract): From Articles to Premises: Building PrimeFacts, an Extraction Methodology and Resource for Fact-Checking Evidence

Paper overview: From Articles to Premises: Building PrimeFacts, an Extraction Methodology and Resource for Fact-Checking Evidence

arxiv url: http://arxiv.org/abs/2605.06006v1
Date: Thu, 07 May 2026 10:58:29 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:11.709802
Title: From Articles to Premises: Building PrimeFacts, an Extraction Methodology and Resource for Fact-Checking Evidence
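Title (reference translation): Building PrimeFacts: An Extraction Method and Resource for Fact-Checking Evidence
Authors: Premtim Sahitaj, Jawan Kolanowski, Ariana Sahitaj, Veronika Solopova, Max Upravitelev, Daniel Röder, Iffat Maab, Junichi Yamagishi, Sebastian Möller, Vera Schmitt
Abstract summary: PrimeFacts is a methodology for extracting fine-grained evidence from full fact-checking articles. We compile 13,106 PolitiFact articles with claims, verdicts, and all referenced sources. Our framework rewrites anchor sentences into stand-alone, context-independent premises.
Reference score (independently computed attention measure): 27.242475349674155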
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fact-checking articles encode rich supporting evidence and reasoning, yet this evidence remains largely inaccessible to automated verification systems due to unstructured presentation. We introduce PrimeFacts, a methodology and resource for extracting fine-grained evidence from full fact-checking articles. We compile 13,106 PolitiFact articles with claims, verdicts, and all referenced sources, and we identify 49,718 in-article hyperlinks as natural anchors to pinpoint key evidence. Our framework leverages large language models (LLMs) to rewrite these anchor sentences into stand-alone, context-independent premises and investigates the extraction of additional implicit evidence. In evaluations on cross-article evidence retrieval and claim verification, the extracted premises substantially improve performance. Decontextualized evidence yields higher retrievability, achieving up to a 30 percent relative gain in Mean Reciprocal Rank over verbatim sentences, and using the evidence for verdict prediction raises Macro-F1 by 10-20 points over the baseline. These gains are consistent across different verdict granularities (2-class vs. 5-class) and model architectures. A qualitative analysis indicates that the decontextualized premises remain faithful to the original sources. Our work highlights the promise of reusing fact-checkers' evidence for automation and provides a large-scale resource of structured evidence from real-world fact-checks.
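Abstract (reference translation): Fact-checking articles encode rich supporting evidence and reasoning, but because of their unstructured presentation this evidence is largely inaccessible to automated verification systems. We introduce PrimeFacts, a methodology and resource for extracting fine-grained evidence from full fact-checking articles. We compile 13,106 PolitiFact articles with claims, verdicts, and all referenced sources, and identify 49,718 in-article hyperlinks as natural anchors for pinpointing key evidence. Our framework uses large language models (LLMs) to rewrite these anchor sentences into stand-alone, context-independent premises, and investigates the extraction of additional implicit evidence. In evaluations of cross-article evidence retrieval and claim verification, the extracted premises substantially improve performance. Decontextualized evidence is more retrievable, achieving up to a 30 percent relative gain in Mean Reciprocal Rank over verbatim sentences, and using the evidence for verdict prediction raises Macro-F1 by 10-20 points over the baseline. These gains are consistent across verdict granularities (2-class vs. 5-class) and model architectures. A qualitative analysis indicates that the decontextualized premises remain faithful to the original sources. Our work highlights the promise of reusing fact-checkers' evidence for automation and provides a large-scale resource of structured evidence from real-world fact-checks.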
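
For readers unfamiliar with the retrieval metric named in the abstract, the following is a minimal sketch of how Mean Reciprocal Rank and a relative gain between two systems could be computed. It is not the authors' evaluation code: the function names and the toy rank lists are hypothetical and chosen only for illustration, and the numbers are not results from the paper.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over queries.

    Each entry is the 1-based rank of the first relevant item for one query,
    or None when no relevant item was retrieved (contributes 0).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r if r is not None else 0.0 for r in ranks) / len(ranks)


def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, e.g. 0.30 for +30%."""
    return (new - baseline) / baseline


# Toy, made-up ranks for four queries (assumption, not data from the paper).
verbatim_ranks = [2, 4, None, 1]   # retrieval with verbatim sentences
premise_ranks = [1, 2, 4, 1]       # retrieval with decontextualized premises

mrr_verbatim = mean_reciprocal_rank(verbatim_ranks)
mrr_premise = mean_reciprocal_rank(premise_ranks)
gain = relative_gain(mrr_premise, mrr_verbatim)
print(f"MRR verbatim={mrr_verbatim:.4f}, premise={mrr_premise:.4f}, gain={gain:.1%}")
```

A relative gain is measured against the baseline score, so "up to a 30 percent relative gain" means the ratio of the two MRR values is at most 1.3, not that MRR rose by 0.30 in absolute terms.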

Related papers