Fugu-MT Paper Translation (Abstract): Rethinking Retrieval: From Traditional Retrieval Augmented Generation to Agentic and Non-Vector Reasoning Systems in the Financial Domain for Large Language Models

Paper overview: Rethinking Retrieval: From Traditional Retrieval Augmented Generation to Agentic and Non-Vector Reasoning Systems in the Financial Domain for Large Language Models

arxiv url: http://arxiv.org/abs/2511.18177v1
Date: Sat, 22 Nov 2025 20:06:25 GMT
Status: Translation complete
System update date: 2025-11-25 18:34:24.673237
Title: Rethinking Retrieval: From Traditional Retrieval Augmented Generation to Agentic and Non-Vector Reasoning Systems in the Financial Domain for Large Language Models
Title (reference translation): Rethinking Retrieval: From Traditional Retrieval Augmented Generation to Agentic and Non-Vector Reasoning Systems in the Financial Domain for Large Language Models
Authors: Elias Lumer, Matt Melich, Olivia Zino, Elena Kim, Sara Dieter, Pradeep Honaganahalli Basavaraju, Vamse Kumar Subbiah, James A. Burke, Roberto Hernandez
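Abstract summary: This paper presents the first systematic evaluation comparing vector-based agentic RAG, using hybrid search and metadata filtering, against hierarchical node-based systems. It measures retrieval metrics (MRR, Recall@5), LLM-as-a-judge pairwise comparisons, latency, and preprocessing costs. The results show that applying advanced RAG techniques to financial Q&A systems improves retrieval accuracy and answer quality, with cost-performance tradeoffs to be considered in production.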
Reference score (independently calculated attention score): 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advancements in Retrieval-Augmented Generation (RAG) have enabled Large Language Models to answer financial questions using external knowledge bases of U.S. SEC filings, earnings reports, and regulatory documents. However, existing work lacks a systematic comparison of vector-based and non-vector RAG architectures for financial documents, and the empirical impact of advanced RAG techniques on retrieval accuracy, answer quality, latency, and cost remains unclear. We present the first systematic evaluation comparing vector-based agentic RAG using hybrid search and metadata filtering against hierarchical node-based systems that traverse document structure without embeddings. We evaluate two enhancement techniques applied to the vector-based architecture: (i) cross-encoder reranking for retrieval precision, and (ii) small-to-big chunk retrieval for context completeness. Across 1,200 SEC 10-K, 10-Q, and 8-K filings on a 150-question benchmark, we measure retrieval metrics (MRR, Recall@5), answer quality through LLM-as-a-judge pairwise comparisons, latency, and preprocessing costs. Vector-based agentic RAG achieves a 68% win rate over hierarchical node-based systems with comparable latency (5.2 versus 5.98 seconds). Cross-encoder reranking achieves a 59% absolute improvement in MRR@5 at optimal parameters (10, 5). Small-to-big retrieval achieves a 65% win rate over baseline chunking with only 0.2 seconds of additional latency. Our findings reveal that applying advanced RAG techniques to financial Q&A systems improves retrieval accuracy and answer quality, with cost-performance tradeoffs that must be considered in production.
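Abstract (reference translation): Recent advances in Retrieval-Augmented Generation (RAG) have enabled large language models to answer financial questions using external knowledge bases of U.S. Securities and Exchange Commission (SEC) filings, earnings reports, and regulatory documents. However, existing work lacks a systematic comparison of vector-based and non-vector RAG architectures, and the empirical impact of advanced RAG techniques on retrieval accuracy, answer quality, latency, and cost remains unclear. This paper presents the first systematic evaluation comparing vector-based agentic RAG, using hybrid search and metadata filtering, against hierarchical node-based systems that traverse document structure without embeddings. Two enhancement techniques applied to the vector-based architecture are evaluated: (i) cross-encoder reranking for retrieval precision, and (ii) small-to-big chunk retrieval for context completeness. Across 1,200 SEC 10-K, 10-Q, and 8-K filings and a 150-question benchmark, the study measures retrieval metrics (MRR, Recall@5), answer quality via LLM-as-a-judge pairwise comparisons, latency, and preprocessing costs. Vector-based agentic RAG achieves a 68% win rate over hierarchical node-based systems with comparable latency (5.2 versus 5.98 seconds). Cross-encoder reranking achieves a 59% absolute improvement in MRR@5 at optimal parameters (10, 5). Small-to-big retrieval achieves a 65% win rate over baseline chunking with an additional 0.2 seconds of latency. The results show that applying advanced RAG techniques to financial Q&A systems improves retrieval accuracy and answer quality, with cost-performance tradeoffs to be considered in production.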
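
The retrieval metrics named in the abstract (MRR and Recall@5) can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so the code below assumes the standard ones: reciprocal rank of the first relevant chunk within the top k, and the fraction of relevant chunks found in the top k. The function names, chunk IDs, and sample rankings are all hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Set, Tuple


def reciprocal_rank_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Return 1/rank of the first relevant item in the top k, or 0.0 if none appears."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Return the fraction of relevant items that appear in the top k."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def mean(values: Sequence[float]) -> float:
    """Average a list of per-query scores; empty input yields 0.0."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical benchmark: (ranked chunk IDs from the retriever, gold relevant chunk IDs).
runs: List[Tuple[List[str], Set[str]]] = [
    (["c3", "c1", "c7", "c2", "c9"], {"c1"}),              # first hit at rank 2
    (["c4", "c5", "c6", "c8", "c0"], {"c2"}),              # no hit in the top 5
    (["c2", "c9", "c1", "c3", "c5"], {"c2", "c5", "c8"}),  # first hit at rank 1, 2 of 3 found
]

K = 5
mrr_at_5 = mean([reciprocal_rank_at_k(ranked, gold, K) for ranked, gold in runs])
mean_recall_at_5 = mean([recall_at_k(ranked, gold, K) for ranked, gold in runs])

print(f"MRR@{K} = {mrr_at_5:.3f}")
print(f"Recall@{K} = {mean_recall_at_5:.3f}")
```

Under these definitions, an "absolute improvement" in MRR@5 would be the difference between two such averages, for example between a reranked and a non-reranked run over the same questions.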

Related papers