Fugu-MT Paper Translation (Abstract): Adaptation of Large Language Models

Paper summary: Adaptation of Large Language Models

arxiv url: http://arxiv.org/abs/2504.03931v1
Date: Fri, 04 Apr 2025 20:57:41 GMT
Status: Translation complete
Last updated in system: 2025-04-17 07:24:08.939666
Title: Adaptation of Large Language Models
Title (reference translation): Adaptation of Large Language Models
Authors: Zixuan Ke, Yifei Ming, Shafiq Joty
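Abstract summary: This tutorial on the adaptation of LLMs is designed to address the growing demand for models that go beyond the static capabilities of generic LLMs. It first examines parametric knowledge adaptation, which focuses on updating the parametric knowledge within LLMs. The second kind of adaptation is semi-parametric knowledge adaptation, where the goal is to update LLM parameters to better leverage external knowledge or tools.
Reference score (independently computed attention score): 39.59753447841243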
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: This tutorial on adaptation of LLMs is designed to address the growing demand for models that go beyond the static capabilities of generic LLMs by providing an overview of dynamic, domain-specific, and task-adaptive LLM adaptation techniques. While general LLMs have demonstrated strong generalization across a variety of tasks, they often struggle to perform well in specialized domains such as finance, healthcare, and code generation for underrepresented languages. Additionally, their static nature limits their ability to evolve with the changing world, and they are often extremely large in size, making them impractical and costly to deploy at scale. As a result, the adaptation of LLMs has drawn much attention since the birth of LLMs and is of core importance, both for industry, which focuses on serving its targeted users, and academia, which can greatly benefit from small but powerful LLMs. To address this gap, this tutorial aims to provide an overview of the LLM adaptation techniques. We start with an introduction to LLM adaptation, from both the data perspective and the model perspective. We then emphasize how the evaluation metrics and benchmarks are different from other techniques. After establishing the problems, we explore various adaptation techniques. We categorize adaptation techniques into two main families. The first is parametric knowledge adaptation, which focuses on updating the parametric knowledge within LLMs. Additionally, we will discuss real-time adaptation techniques, including model editing, which allows LLMs to be updated dynamically in production environments. The second kind of adaptation is semi-parametric knowledge adaptation, where the goal is to update LLM parameters to better leverage external knowledge or tools through techniques like retrieval-augmented generation (RAG) and agent-based systems.
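Abstract (reference translation): This tutorial on the adaptation of LLMs is designed to address the growing demand for models that go beyond the static capabilities of generic LLMs by providing an overview of dynamic, domain-specific, and task-adaptive LLM adaptation techniques. General LLMs have shown strong generalization across a variety of tasks, but they often perform poorly in specialized domains such as finance, healthcare, and code generation for underrepresented languages. In addition, their static nature limits their ability to evolve with a changing world, and they are often very large, which makes them impractical and costly to deploy at scale. As a result, the adaptation of LLMs has drawn attention since the birth of LLMs and is of core importance both for industry, which focuses on serving its targeted users, and for academia, which benefits from small but powerful LLMs. To address this gap, this tutorial provides an overview of LLM adaptation techniques. It begins with an introduction to LLM adaptation from both the data perspective and the model perspective. It then emphasizes how the evaluation metrics and benchmarks differ from those of other techniques. After establishing the problems, it explores various adaptation techniques, which are categorized into two main families. The first is parametric knowledge adaptation, which focuses on updating the parametric knowledge within LLMs. The tutorial also discusses real-time adaptation techniques, including model editing, which allows LLMs to be updated dynamically in production environments. The second is semi-parametric knowledge adaptation, where the goal is to update LLM parameters to better leverage external knowledge or tools through techniques such as retrieval-augmented generation (RAG) and agent-based systems.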

Related papers