Fugu-MT Paper Translation (Abstract): Large Language Models: A Mathematical Formulation

Paper overview: Large Language Models: A Mathematical Formulation

arxiv url: http://arxiv.org/abs/2601.22170v1
Date: Wed, 21 Jan 2026 21:22:49 GMT
Status: Retrieving information
System update date: 2026-02-08 13:03:29.240038
Title: Large Language Models: A Mathematical Formulation
Title (reference translation): Large Language Models: A Mathematical Formulation
Authors: Ricardo Baptista, Andrew Stuart, Son Tran
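Abstract summary: Large language models (LLMs) process and predict sequences containing text to answer questions. The paper provides a mathematical framework for LLMs by describing the encoding of text sequences into sequences of tokens. It explains how these models are learned from data and shows how they are deployed for a variety of tasks.
Reference score (independently computed attention metric): 9.837462698662947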
License:
Abstract: Large language models (LLMs) process and predict sequences containing text to answer questions, and address tasks including document summarization, providing recommendations, writing software and solving quantitative problems. We provide a mathematical framework for LLMs by describing the encoding of text sequences into sequences of tokens, defining the architecture for next-token prediction models, explaining how these models are learned from data, and demonstrating how they are deployed to address a variety of tasks. The mathematical sophistication required to understand this material is not high, and relies on straightforward ideas from information theory, probability and optimization. Nonetheless, the combination of ideas resting on these different components from the mathematical sciences yields a complex algorithmic structure; and this algorithmic structure has demonstrated remarkable empirical successes. The mathematical framework established here provides a platform from which it is possible to formulate and address questions concerning the accuracy, efficiency and robustness of the algorithms that constitute LLMs. The framework also suggests directions for development of modified and new methodologies.
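Abstract (reference translation): Large language models (LLMs) process sequences containing text to answer questions, and address tasks such as document summarization, providing recommendations, writing software, and solving quantitative problems. We provide a mathematical framework for LLMs by describing the encoding of text sequences into sequences of tokens, defining the architecture of next-token prediction models, explaining how these models are learned from data, and showing how they are deployed for a variety of tasks. The mathematical sophistication needed to understand this material is not high; it relies on straightforward ideas from information theory, probability, and optimization. Nevertheless, combining these different components from the mathematical sciences yields a complex algorithmic structure, and this structure has demonstrated remarkable empirical success. The mathematical framework established here provides a platform for formulating and addressing questions about the accuracy, efficiency, and robustness of the algorithms that constitute LLMs. The framework also suggests directions for developing modified and new methodologies.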
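
Note: the abstract refers to next-token prediction models learned from data but gives no formulas. As a rough illustration only, the standard autoregressive formulation can be sketched as below. The notation is an assumption made here and is not taken from the paper: x_t denotes the t-th token, theta the model parameters, N the number of training sequences, and T_n the length of the n-th sequence.

```latex
% Assumed notation (not from the paper): tokens x_1, ..., x_T, parameters \theta.
% Chain rule: the probability of a token sequence factors into next-token conditionals.
\[
  p_\theta(x_1, \dots, x_T) \;=\; \prod_{t=1}^{T} p_\theta\bigl(x_t \mid x_1, \dots, x_{t-1}\bigr)
\]
% Learning from data: minimize the average negative log-likelihood
% over N training sequences x^{(1)}, ..., x^{(N)} with lengths T_1, ..., T_N.
\[
  \hat{\theta} \;\in\; \arg\min_{\theta} \;
  -\frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n}
  \log p_\theta\bigl(x^{(n)}_t \mid x^{(n)}_1, \dots, x^{(n)}_{t-1}\bigr)
\]
```

The first line is the chain rule of probability, which holds for any sequence distribution. The second is the usual maximum-likelihood (cross-entropy) objective; the paper itself may use a different parameterization or normalization.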

Related papers