Fugu-MT Paper Translation (Abstract): Budget-Constrained Online Retrieval-Augmented Generation: The Chunk-as-a-Service Model

Paper summary: Budget-Constrained Online Retrieval-Augmented Generation: The Chunk-as-a-Service Model

arxiv url: http://arxiv.org/abs/2604.26981v1
Date: Tue, 28 Apr 2026 14:42:51 GMT
Status: Translation complete
Last updated in system: 2026-05-01 16:31:53.694081
Title: Budget-Constrained Online Retrieval-Augmented Generation: The Chunk-as-a-Service Model
Title (reference translation): Budget-Constrained Online Retrieval-Augmented Generation: The Chunk-as-a-Service Model
Authors: Shawqi Al-Maliki, Ammar Gharaibeh, Mohamed Rahouti, Mohammad Ruhul Amin, Mohamed Abdallah, Junaid Qadir, Ala Al-Fuqaha
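Abstract summary: Chunk-as-a-Service (CaaS) is a transparent and cost-effective alternative to RAG-as-a-Service (RaaS). CaaS has two variants: Open-Budget CaaS (OB-CaaS) and Limited-Budget CaaS (LB-CaaS).
Reference score (independently computed attention level): 4.573553791705522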
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have revolutionized the field of natural language processing. However, they exhibit some limitations, including a lack of reliability and transparency: they may hallucinate and fail to provide sources that support the generated output. Retrieval-Augmented Generation (RAG) was introduced to address such limitations in LLMs. One popular implementation, RAG-as-a-Service (RaaS), has shortcomings that hinder its adoption and accessibility. For instance, RaaS pricing is based on the number of submitted prompts, without considering whether the prompts are enriched by relevant chunks (i.e., text segments retrieved from a vector database) or the quality of the utilized chunks (i.e., their degree of relevance). This results in an opaque and less cost-effective payment model. We propose Chunk-as-a-Service (CaaS) as a transparent and cost-effective alternative. CaaS includes two variants: Open-Budget CaaS (OB-CaaS) and Limited-Budget CaaS (LB-CaaS), the latter enabled by our "Utility-Cost Online Selection Algorithm (UCOSA)". UCOSA further extends the cost-effectiveness and the accessibility of the OB-CaaS variant by enriching, in an online manner, a subset of the submitted prompts based on budget constraints and the utility-cost tradeoff. Our experiments demonstrate the efficacy of the proposed UCOSA compared to both offline and relevance-greedy selection baselines. In terms of the performance metric, defined as the number of enriched prompts (NEP) multiplied by the Average Relevance (AR), UCOSA outperforms random selection by approximately 52% and achieves around 75% of the performance of offline selection methods. Additionally, in terms of budget utilization, LB-CaaS and OB-CaaS achieve higher performance-to-budget ratios of 140% and 86%, respectively, compared to RaaS, indicating their superior efficiency.
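Abstract (reference translation): Large language models (LLMs) have revolutionized the field of natural language processing. However, they have several limitations, including a lack of reliability and a lack of transparency. Retrieval-augmented generation (RAG) was introduced to address these limitations of LLMs. RAG-as-a-Service (RaaS) has shortcomings that hinder its adoption and accessibility. For example, RaaS pricing is based on the number of submitted prompts and does not consider whether the prompts are enriched by relevant chunks, i.e., text segments retrieved from a vector database, or the quality of the chunks used (i.e., their degree of relevance). The result is an opaque and less cost-effective payment model. We propose Chunk-as-a-Service (CaaS) as a transparent and cost-effective alternative. CaaS has two variants: Open-Budget CaaS (OB-CaaS) and Limited-Budget CaaS (LB-CaaS). UCOSA further extends the cost-effectiveness and accessibility of OB-CaaS by enriching, in an online manner, a subset of the submitted prompts based on budget constraints and the utility-cost tradeoff. Our experiments demonstrate the efficacy of UCOSA compared with both offline and relevance-greedy selection baselines. In terms of the performance metric, the number of enriched prompts (NEP) multiplied by the Average Relevance (AR), UCOSA outperforms random selection by approximately 52% and achieves around 75% of the performance of offline selection methods. In addition, in terms of budget utilization, LB-CaaS and OB-CaaS achieve higher performance-to-budget ratios of 140% and 86%, respectively, compared with RaaS, demonstrating their efficiency.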
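
The abstract defines its performance metric as NEP multiplied by AR, which is the same as the summed relevance of the enriched prompts. The following is a minimal sketch of that arithmetic, not the paper's implementation. The `Prompt` fields, the relevance scale, the cost units, and the `online_threshold_select` rule are all hypothetical; in particular, the threshold rule is a simple stand-in for an online, budget-constrained selector and is not UCOSA, whose details the abstract does not give. Dividing performance by the allotted budget is likewise only one possible reading of "performance-to-budget ratio".

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Prompt:
    relevance: float  # relevance of the retrieved chunks (assumed scale: 0 to 1)
    cost: float       # cost of enriching this prompt (assumed, arbitrary units)


def performance(enriched: List[Prompt]) -> float:
    """NEP x AR: number of enriched prompts times their average relevance."""
    nep = len(enriched)
    if nep == 0:
        return 0.0
    ar = sum(p.relevance for p in enriched) / nep
    return nep * ar


def performance_to_budget(enriched: List[Prompt], budget: float) -> float:
    """Assumed definition: performance divided by the allotted budget."""
    return performance(enriched) / budget


def online_threshold_select(stream: Iterable[Prompt], budget: float,
                            min_ratio: float) -> List[Prompt]:
    """Hypothetical online rule (NOT UCOSA): decide on each prompt as it
    arrives, enriching it only if it fits the remaining budget and its
    relevance per unit cost is at least min_ratio."""
    chosen: List[Prompt] = []
    remaining = budget
    for p in stream:
        fits = p.cost <= remaining
        worth_it = p.relevance >= min_ratio * p.cost  # avoids dividing by cost
        if fits and worth_it:
            chosen.append(p)
            remaining -= p.cost
    return chosen


if __name__ == "__main__":
    stream = [Prompt(0.9, 1.0), Prompt(0.2, 1.0), Prompt(0.8, 2.0), Prompt(0.7, 1.0)]
    chosen = online_threshold_select(stream, budget=3.0, min_ratio=0.5)
    print(len(chosen), performance(chosen), performance_to_budget(chosen, 3.0))
```

Because each decision is made on arrival and never revisited, a rule like this can skip a prompt that an offline method with full knowledge of the stream would have selected, which is the kind of gap the abstract's comparison against offline selection measures.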

Related papers