Fugu-MT Paper Translation (Abstract): Evergreen: Efficient Claim Verification for Semantic Aggregates

Paper overview: Evergreen: Efficient Claim Verification for Semantic Aggregates

arxiv url: http://arxiv.org/abs/2604.26180v1
Date: Tue, 28 Apr 2026 23:55:26 GMT
Status: Translation complete
Last updated in system: 2026-04-30 15:59:36.201596
Title: Evergreen: Efficient Claim Verification for Semantic Aggregates
Title (reference translation): Evergreen: Efficient Claim Verification for Semantic Aggregates
Authors: Alexander W. Lee, Benjamin Han, Shayak Sen, Sam Yeom, Ugur Cetintemel, Anupam Datta
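Abstract summary: We present Evergreen, a system that recasts claim verification as a semantic query processing task. Evergreen compiles each claim into a declarative semantic verification query and executes it on the same engine that produced the aggregate. With a strong LLM, it achieves excellent verification quality (F1 = 1.00) while reducing cost by 3.2x and latency by 4.0x compared to unoptimized verification.
Reference score (independently computed attention score): 41.757837386071074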
License: http://creativecommons.org/licenses/by/4.0/
Abstract: With recent semantic query processing engines, semantic aggregation has become a primitive operator, enabling the reduction of a relation into a natural language aggregate using an LLM. However, the resulting semantic aggregate may contain claims that are not grounded in the underlying relation. Verifying such claims is challenging: they often involve quantifiers, groupings, and comparisons over relations that far exceed LLM context windows and require a costly combination of semantic and symbolic processing. We present Evergreen, a system that recasts claim verification as a semantic query processing task with tailored optimizations and provenance capture. Evergreen compiles each claim into a declarative semantic verification query and executes it on the same engine that produced the aggregate. To reduce cost and latency, Evergreen avoids unnecessary LLM calls through verification-aware optimizations (early stopping, relevance sorting, and estimation with confidence sequences) and general-purpose optimizations for semantic queries (operator fusion, similarity filtering, and prompt caching). Each verdict is accompanied by citations that identify a minimal set of tuples justifying the result, with semantics based on semiring provenance for first-order logic. On a benchmark of real-world restaurant review datasets reflecting production-inspired workloads, Evergreen achieves excellent verification quality (F1 = 1.00) with a strong LLM while reducing cost by 3.2x and latency by 4.0x compared to unoptimized verification. Even with a significantly weaker LLM, Evergreen outperforms a strong LLM-as-a-judge baseline in F1 at 48x lower cost and 2.3x lower latency. Relative to a retrieval-augmented agent, Evergreen compares favorably in F1 and latency with similar cost when both use a strong LLM; yet, with a much weaker LLM, it achieves the same F1 at 63x lower cost and 4.2x lower latency.
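Abstract (reference translation): With recent semantic query processing engines, semantic aggregation has become a primitive operator that uses an LLM to reduce a relation to a natural language aggregate. However, the resulting semantic aggregate may contain claims that are not grounded in the underlying relation. Such claims often involve quantifiers, groupings, and comparisons over relations that far exceed LLM context windows, and they require a costly combination of semantic and symbolic processing. We present Evergreen, a system that recasts claim verification as a semantic query processing task. Evergreen compiles each claim into a declarative semantic verification query and executes it on the same engine that produced the aggregate. To reduce cost and latency, Evergreen avoids unnecessary LLM calls through verification-aware optimizations (early stopping, relevance sorting, and estimation with confidence sequences) and general-purpose optimizations for semantic queries (operator fusion, similarity filtering, and prompt caching). Each verdict is accompanied by citations that identify a minimal set of tuples justifying the result, with semantics based on semiring provenance for first-order logic. On a benchmark of real-world restaurant review datasets reflecting production-inspired workloads, Evergreen achieves excellent verification quality (F1 = 1.00) with a strong LLM while reducing cost by 3.2x and latency by 4.0x compared to unoptimized verification. Even with a significantly weaker LLM, Evergreen outperforms a strong LLM-as-a-judge baseline in F1 at 48x lower cost and 2.3x lower latency. Relative to a retrieval-augmented agent, Evergreen compares favorably in F1 and latency at similar cost when both use a strong LLM; with a much weaker LLM, it achieves the same F1 at 63x lower cost and 4.2x lower latency.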
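
Illustrative sketch (not from the paper): the abstract names early stopping and relevance sorting as ways to avoid unnecessary LLM calls. The Python below is a minimal sketch of that general idea for quantified claims, under stated assumptions. The function names `verify_exists` and `verify_forall`, the sample reviews, and the keyword-based `mentions_hair` and `relevance` stand-ins (which replace a per-row LLM call and a similarity score) are all hypothetical and are not Evergreen's API or algorithm.

```python
from typing import Callable, List, Optional, Tuple

Row = str
Verdict = Tuple[bool, List[Row], int]  # (verdict, cited rows, predicate calls)


def verify_exists(
    rows: List[Row],
    predicate: Callable[[Row], bool],
    relevance: Optional[Callable[[Row], float]] = None,
) -> Verdict:
    """Check "some row satisfies predicate", stopping at the first witness.

    `predicate` stands in for an expensive per-row LLM call (assumption).
    If `relevance` is given, rows are checked most-relevant first, so a
    witness tends to be found after fewer calls.
    """
    ordered = sorted(rows, key=relevance, reverse=True) if relevance else list(rows)
    calls = 0
    for row in ordered:
        calls += 1
        if predicate(row):
            # One witness is enough to justify an existential claim.
            return True, [row], calls
    return False, [], calls


def verify_forall(rows: List[Row], predicate: Callable[[Row], bool]) -> Verdict:
    """Check "every row satisfies predicate", stopping at the first counterexample."""
    found, counterexample, calls = verify_exists(rows, lambda r: not predicate(r))
    # This sketch cites only a counterexample; it does not model the
    # semiring-provenance citations described in the abstract.
    return (not found), counterexample, calls


# Hypothetical sample data and stand-ins for the LLM and similarity model.
reviews = [
    "Great pasta, slow service",
    "Found a hair in my soup",
    "Lovely patio",
    "Cold fries",
]
mentions_hair = lambda r: "hair" in r.lower()
relevance = lambda r: r.lower().count("hair") + r.lower().count("soup")

unsorted_result = verify_exists(reviews, mentions_hair)
sorted_result = verify_exists(reviews, mentions_hair, relevance)
forall_result = verify_forall(reviews, lambda r: len(r) > 5)
```

In this toy run the unsorted scan needs two predicate calls to find the witness, while the relevance-sorted scan needs one; a universal claim that holds still has to check every row, which is the case where an estimation-based shortcut such as the confidence sequences the abstract mentions would matter.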

Related papers