Fugu-MT Paper Translation (Abstract): LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization

Paper abstract: LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization

arxiv url: http://arxiv.org/abs/2602.03690v1
Date: Tue, 03 Feb 2026 16:08:33 GMT
Status: Translation complete
Last updated in system: 2026-02-04 18:37:15.564791
Title: LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization
Title (reference translation): LLM-Inspired Pretrain-Then-Finetune for Small-Data, Large-Scale Optimization
Authors: Zishi Zhang, Jinhui Han, Ming Hu, Yijie Peng
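Abstract summary: We consider small-data, large-scale decision problems in which a firm must make many operational decisions simultaneously. To address this challenge, we propose a pretrain-then-finetune approach built on a designed Transformer model.
Reference score (independently computed attention score): 7.8639568562295965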
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider small-data, large-scale decision problems in which a firm must make many operational decisions simultaneously (e.g., across a large product portfolio) while observing only a few, potentially noisy, data points per instance. Inspired by the success of large language models (LLMs), we propose a pretrain-then-finetune approach built on a designed Transformer model to address this challenge. The model is first pretrained on large-scale, domain-informed synthetic data that encode managerial knowledge and structural features of the decision environment, and is then fine-tuned on real observations. This new pipeline offers two complementary advantages: pretraining injects domain knowledge into the learning process and enables the training of high-capacity models using abundant synthetic data, while finetuning adapts the pretrained model to the operational environment and improves alignment with the true data-generating regime. While we have leveraged the Transformer's state-of-the-art representational capacity, particularly its attention mechanism, to efficiently extract cross-task structure, our approach is not an off-the-shelf application. Instead, it relies on problem-specific architectural design and a tailored training procedure to match the decision setting. Theoretically, we develop the first comprehensive error analysis regarding Transformer learning in relevant contexts, establishing nonasymptotic guarantees that validate the method's effectiveness. Critically, our analysis reveals how pretraining and fine-tuning jointly determine performance, with the dominant contribution governed by whichever is more favorable. In particular, finetuning exhibits an economies-of-scale effect, whereby transfer learning becomes increasingly effective as the number of instances grows.
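Abstract (reference translation): We study small-data, large-scale decision problems in which a firm must make many operational decisions at once (e.g., across a large product portfolio) while observing only a few, possibly noisy, data points per instance. Inspired by the success of large language models (LLMs), we propose a pretrain-then-finetune approach built on a designed Transformer model to address this problem. The model is first pretrained on large-scale, domain-informed synthetic data that encode managerial knowledge and structural features of the decision environment, and is then fine-tuned on real observations. Pretraining injects domain knowledge into the learning process and makes it possible to train high-capacity models on abundant synthetic data, while fine-tuning adapts the pretrained model to the operational environment and improves alignment with the true data-generating regime. We use the Transformer's state-of-the-art representational capacity, in particular its attention mechanism, to extract cross-task structure efficiently, but our approach is not an off-the-shelf application. Instead, it relies on problem-specific architectural design and a training procedure tailored to the decision setting. Theoretically, we develop the first comprehensive error analysis of Transformer learning in relevant contexts and establish nonasymptotic guarantees that validate the method's effectiveness. Importantly, our analysis shows how pretraining and fine-tuning jointly determine performance. In particular, fine-tuning exhibits an economies-of-scale effect: transfer learning becomes increasingly effective as the number of instances grows.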
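The two-stage pipeline described in the abstract can be illustrated with a deliberately tiny sketch. It is not the paper's Transformer architecture or training procedure: the model here is a two-parameter linear shrinkage rule, and every name, distribution, and hyperparameter (`make_instances`, `pretrain`, `finetune`, the Gaussian prior, holding out the last observation as a training target) is an assumption made only for illustration. Each instance has three noisy observations of a hidden mean; the model is first fitted on plentiful synthetic instances drawn from an assumed prior, then adjusted with a few gradient steps on a smaller set of "real" instances whose distribution differs from that prior.

```python
import numpy as np


def make_instances(rng, k, n, mu, tau, sigma):
    """Draw k instances, each with n noisy observations of a hidden mean."""
    theta = rng.normal(mu, tau, size=k)  # hidden per-instance means
    return theta[:, None] + rng.normal(0.0, sigma, size=(k, n))


def split(obs):
    """Input: mean of the first n-1 observations. Target: the held-out last one."""
    return obs[:, :-1].mean(axis=1), obs[:, -1]


def mse(coef, x, y):
    """Mean squared error of the linear rule a * x + b."""
    return float(np.mean((coef[0] * x + coef[1] - y) ** 2))


def pretrain(x, y):
    """Stage 1: ordinary least squares on abundant synthetic data."""
    design = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def finetune(coef, x, y, lr=0.05, steps=20):
    """Stage 2: a few gradient steps on real data, starting from the pretrained rule."""
    a, b = float(coef[0]), float(coef[1])
    for _ in range(steps):
        resid = a * x + b - y
        grad_a = 2.0 * np.mean(resid * x)
        grad_b = 2.0 * np.mean(resid)
        a, b = a - lr * grad_a, b - lr * grad_b
    return np.array([a, b])


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic data come from an assumed (misspecified) prior.
    synth = make_instances(rng, k=20000, n=3, mu=0.5, tau=1.5, sigma=1.0)
    # "Real" data follow a different, hypothetical regime.
    real = make_instances(rng, k=500, n=3, mu=1.0, tau=1.0, sigma=1.0)
    coef0 = pretrain(*split(synth))
    x_real, y_real = split(real)
    coef1 = finetune(coef0, x_real, y_real)
    return coef0, coef1, mse(coef0, x_real, y_real), mse(coef1, x_real, y_real)
```

Calling `run_demo()` returns the pretrained coefficients, the fine-tuned coefficients, and the mean squared error on the real instances before and after fine-tuning. Using a held-out observation as the target is one simple way to train without access to the hidden means; whether the paper does something similar is not stated in the abstract.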

Related papers