Fugu-MT Paper Translation (Abstract): RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture

Paper overview: RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture

arxiv url: http://arxiv.org/abs/2401.08406v1
Date: Tue, 16 Jan 2024 14:44:47 GMT
Status: Translation complete
Last updated in system: 2024-01-17 13:50:48.557954
Title: RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
Title (reference translation): RAG vs. Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
Authors: Aman Gupta, Anup Shirgaonkar, Angels de Luis Balaguer, Bruno Silva, Daniel Holstein, Dawei Li, Jennifer Marsman, Leonardo O. Nunes, Mahsa Rouzbahman, Morris Sharp, Nick Mecklenburg, Rafael Padilha, Ranveer Chandra, Renato Luiz de Freitas Cunha, Roberto de M. Estevão Filho, Ryan Tsang, Sara Malvar, Swati Sharma, Todd Hendry, Vijay Aski, Vijetha Vijayendran, Vinamra Benara
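Abstract summary: We propose a pipeline for fine-tuning and RAG and present the tradeoffs of both for popular large language models. The results suggest that the dataset generation pipeline is effective.
Reference score (independently computed attention score): 4.248519117752213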
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: There are two common ways in which developers are incorporating proprietary and domain-specific data when building applications of Large Language Models (LLMs): Retrieval-Augmented Generation (RAG) and Fine-Tuning. RAG augments the prompt with the external data, while fine-tuning incorporates the additional knowledge into the model itself. However, the pros and cons of both approaches are not well understood. In this paper, we propose a pipeline for fine-tuning and RAG, and present the tradeoffs of both for multiple popular LLMs, including Llama2-13B, GPT-3.5, and GPT-4. Our pipeline consists of multiple stages, including extracting information from PDFs, generating questions and answers, using them for fine-tuning, and leveraging GPT-4 for evaluating the results. We propose metrics to assess the performance of different stages of the RAG and fine-tuning pipeline. We conduct an in-depth study on an agricultural dataset. Agriculture as an industry has not seen much penetration of AI, and we study a potentially disruptive application: what if we could provide location-specific insights to a farmer? Our results show the effectiveness of our dataset generation pipeline in capturing geographic-specific knowledge, and the quantitative and qualitative benefits of RAG and fine-tuning. We see an accuracy increase of over 6 p.p. when fine-tuning the model and this is cumulative with RAG, which increases accuracy by 5 p.p. further. In one particular experiment, we also demonstrate that the fine-tuned model leverages information from across geographies to answer specific questions, increasing answer similarity from 47% to 72%. Overall, the results point to how systems built using LLMs can be adapted to respond and incorporate knowledge across a dimension that is critical for a specific industry, paving the way for further applications of LLMs in other industrial domains.
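Abstract (reference translation): There are two common ways in which developers incorporate proprietary and domain-specific data when building applications of large language models (LLMs). RAG augments the prompt with external data, while fine-tuning incorporates the additional knowledge into the model itself. However, the pros and cons of both approaches are not well understood. In this paper, we propose a pipeline for fine-tuning and RAG and present the tradeoffs for multiple LLMs, including Llama2-13B, GPT-3.5, and GPT-4. Our pipeline consists of multiple stages, including extracting information from PDFs, generating questions and answers, using them for fine-tuning, and using GPT-4 to evaluate the results. We propose metrics to assess the performance of the different stages of the RAG and fine-tuning pipeline. We conduct an in-depth study on an agricultural dataset. Agriculture as an industry has not seen much penetration of AI, and we study a potentially disruptive application. Our results show the effectiveness of the dataset generation pipeline in capturing geography-specific knowledge, and the quantitative and qualitative benefits of RAG and fine-tuning. Fine-tuning the model raises accuracy by over 6 percentage points, and this is cumulative with RAG, which raises accuracy by a further 5 percentage points. In one particular experiment, we also demonstrate that the fine-tuned model leverages information from across geographies to answer specific questions, with answer similarity increasing from 47% to 72%. Overall, systems built using LLMs can be adapted to respond to and incorporate knowledge along a dimension that is critical for a specific industry, paving the way for further applications of LLMs in other industrial domains.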
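The abstract's statement that RAG augments the prompt with external data, leaving the model itself unchanged, can be illustrated with a minimal sketch. This is not the authors' pipeline: the keyword-overlap retriever, the function names (`tokenize`, `retrieve`, `build_rag_prompt`), the prompt layout, and the agricultural snippets in `DOCS` are all assumptions made up for illustration. A real system would more likely use embedding-based retrieval and send the resulting prompt to an LLM.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most word tokens with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(question: str, documents: list[str], k: int = 1) -> str:
    """Prepend retrieved context to the question; no model weights are touched."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical, made-up snippets standing in for text extracted from PDFs.
DOCS = [
    "In region A, winter wheat is usually planted in early October.",
    "In region B, rice is transplanted at the start of the monsoon.",
    "Drip irrigation reduces water use in orchards.",
]

if __name__ == "__main__":
    print(build_rag_prompt("When is winter wheat planted in region A?", DOCS))
```

Fine-tuning, by contrast, would place such knowledge in the model weights through training on generated question-answer pairs, so the prompt would carry only the question.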

Related papers

OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain [62.89809156574998]
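We introduce OmniEval, an omnidirectional and automatic RAG benchmark in the financial domain. Our benchmark is characterized by a multi-dimensional evaluation framework. Experiments demonstrate the comprehensiveness of OmniEval, which includes extensive test datasets.
Paper reference translation (metadata) (2024-12-17T15:38:42Z)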
Know Your RAG: Dataset Taxonomy and Generation Strategies for Evaluating RAG Systems [18.62773754004561]
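We show that using public question-and-answer (Q&A) datasets to assess retrieval performance can lead to non-optimal system design. We propose solutions based on characterizing RAG datasets through labels and through label-targeted data generation.
Paper reference translation (metadata) (2024-11-29T13:57:07Z)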
Does RAG Introduce Unfairness in LLMs? Evaluating Fairness in Retrieval-Augmented Generation Systems [18.926129063000264]
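Retrieval-Augmented Generation (RAG) has recently attracted attention for its improved ability to integrate external knowledge sources. This paper proposes a fairness evaluation framework tailored to RAG methods.
Paper reference translation (metadata) (2024-09-29T22:04:26Z)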
RAGProbe: An Automated Approach for Evaluating RAG Applications [1.38012307221604]
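Retrieval Augmented Generation (RAG) is increasingly used when building generative AI applications. This paper proposes a technique for generating variations of question-answer pairs that trigger failures in RAG pipelines.
Paper reference translation (metadata) (2024-09-24T23:33:07Z)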
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices [91.71951459594074]
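Large language models (LLMs) with extended context windows have significantly improved tasks such as information extraction, question answering, and complex planning scenarios. Existing methods typically use the Self-Instruct framework to generate instruction-tuning data for improving long-context capability. This paper proposes a multi-agent interactive multi-hop generation framework that incorporates a quality verification agent, a single-hop question generation agent, a multiple question sampling strategy, and a multi-hop question merger agent. The results suggest that our synthetic high-quality long-context instruction data improve model performance significantly, even beyond models trained on larger amounts of human data.
Paper reference translation (metadata) (2024-09-03T13:30:00Z)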
RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs [60.38044044203333]
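Large language models (LLMs) typically use the top-k contexts from a retriever in retrieval-augmented generation (RAG). This paper proposes RankRAG, a novel instruction fine-tuning framework that tunes a single LLM for both context ranking and answer generation in RAG. Examples named include GPT-4-0613, GPT-4-turbo-2024-0409, and ChatQA-1.5, an open-source model with state-of-the-art performance on RAG benchmarks.
Paper reference translation (metadata) (2024-07-02T17:59:17Z)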
Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation [64.7982176398485]
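Retrieval-augmented generation (RAG) has proven effective at mitigating the hallucination problem of large language models (LLMs). This paper proposes DPA-RAG, a general framework designed to align diverse knowledge preferences within RAG systems.
Paper reference translation (metadata) (2024-06-26T18:26:53Z)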
Fine-Tuning or Fine-Failing? Debunking Performance Myths in Large Language Models [0.8399688944263842]
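Large language models (LLMs) can understand and generate human-like text from input queries. This study extends this concept to the integration of LLMs within Retrieval-Augmented Generation (RAG) pipelines. It evaluates the impact of fine-tuning on the ability of LLMs to extract data and understand context.
Paper reference translation (metadata) (2024-06-17T04:35:17Z)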
Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases [9.478012553728538]
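We propose an end-to-end system design that uses retrieval-augmented generation (RAG) to improve the factual accuracy of large language models (LLMs). Our system integrates a RAG pipeline with upstream dataset processing and downstream performance evaluation. Our experiments demonstrate the system's effectiveness in generating more accurate answers to domain-specific and time-sensitive questions.
Paper reference translation (metadata) (2024-03-15T16:30:14Z)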
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
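Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs). This paper builds a large-scale, comprehensive benchmark and evaluates all components of RAG systems across various RAG application scenarios.
Paper reference translation (metadata) (2024-01-30T14:25:32Z)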
How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
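Large language models (LLMs) have shown impressive general intelligence and human-like capabilities. We comprehensively survey this research direction from the perspective of the entire pipeline in real-world recommender systems.
Paper reference translation (metadata) (2023-06-09T11:31:50Z)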
LLMs for Knowledge Graph Construction and Reasoning: Recent Capabilities and Future Opportunities [66.36633042421387]
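An evaluation of large language models (LLMs) for knowledge graph (KG) construction and reasoning. We propose AutoKG, a multi-agent-based approach that uses LLMs and external sources for KG construction and reasoning.
Paper reference translation (metadata) (2023-05-22T15:56:44Z)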

The related papers list is generated automatically from the titles and abstracts of papers on this site.
