Fugu-MT Paper Translation (Abstract): Fast Byte Latent Transformer

Paper overview: Fast Byte Latent Transformer

arxiv url: http://arxiv.org/abs/2605.08044v1
Date: Fri, 08 May 2026 17:35:27 GMT
Status: Translation complete
Last updated in system: 2026-05-11 19:43:39.244156
Title: Fast Byte Latent Transformer
Authors: Julie Kallini, Artidoro Pagnoni, Tomasz Limisiewicz, Gargi Ghosh, Luke Zettlemoyer, Christopher Potts, Xiaochuang Han, Srinivasan Iyer
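Abstract summary: We introduce BLT Diffusion (BLT-D), a new model trained with a block-wise diffusion objective alongside the next-byte prediction loss. We also propose two extensions inspired by speculative decoding that trade some of this speed for higher generation quality. All methods may achieve an estimated memory-bandwidth cost over 50% lower than BLT on generation tasks.
Reference score (independently computed attention metric): 73.03308456251764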
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent byte-level language models (LMs) match the performance of token-level models without relying on subword vocabularies, yet their utility is limited by slow, byte-by-byte autoregressive generation. We address this bottleneck in the Byte Latent Transformer (BLT) through new training and generation techniques. First, we introduce BLT Diffusion (BLT-D), a new model and our fastest BLT variant, trained with an auxiliary block-wise diffusion objective alongside the standard next-byte prediction loss. This enables an inference procedure that generates multiple bytes in parallel per decoding step, substantially reducing the number of forward passes required to generate a sequence. Second, we propose two extensions inspired by speculative decoding that trade some of this speed for higher generation quality: BLT Self-speculation (BLT-S), in which BLT's local decoder continues generating past its normal patch boundaries to draft bytes, which are then verified with a single full-model forward pass; and BLT Diffusion+Verification (BLT-DV), which augments BLT-D with an autoregressive verification step after diffusion-based generation. All methods may achieve an estimated memory-bandwidth cost over 50% lower than BLT on generation tasks. Each approach offers its own unique advantages, together removing key barriers to the practical use of byte-level LMs.
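The draft-then-verify idea that the abstract attributes to speculative decoding can be illustrated with a toy sketch. This is not the paper's implementation: `draft_next`, `target_next`, and `draft_and_verify` are hypothetical names, the two "models" are hand-written stand-in functions over bytes, and verification is simulated with a loop where a real model would score all drafted positions in one forward pass.

```python
from typing import Callable, Tuple

NextByte = Callable[[bytes], int]  # maps a byte prefix to the next byte value


def target_next(p: bytes) -> int:
    """Toy stand-in for the full model: predicts previous byte + 1."""
    return (p[-1] + 1) % 256 if p else 0


def draft_next(p: bytes) -> int:
    """Toy stand-in for a cheap drafter: wrong whenever len(p) is a multiple of 5."""
    return 0 if len(p) % 5 == 0 else target_next(p)


def draft_and_verify(prefix: bytes, draft: NextByte, target: NextByte,
                     n_new: int, k: int = 4) -> Tuple[bytes, int]:
    """Greedy draft-then-verify decoding.

    Returns the generated bytes and the number of verification passes.
    The output always equals plain greedy decoding with `target`.
    """
    out = bytearray(prefix)
    goal = len(prefix) + n_new
    passes = 0
    while len(out) < goal:
        # 1) The drafter proposes up to k bytes autoregressively.
        block = bytearray()
        for _ in range(min(k, goal - len(out))):
            block.append(draft(bytes(out + block)))
        # 2) Verification (assumption: one batched pass in a real system;
        #    here we simply query the target at each drafted position).
        passes += 1
        n_keep = 0
        for i in range(len(block)):
            expected = target(bytes(out + block[:i]))
            n_keep = i + 1
            if expected != block[i]:
                block[i] = expected  # keep the target's byte, drop the rest
                break
        out += block[:n_keep]
    return bytes(out), passes


if __name__ == "__main__":
    text, passes = draft_and_verify(b"a", draft_next, target_next, n_new=12)
    print(text, passes)
```

Because every kept byte is either confirmed or replaced by the target's own prediction, the drafter only affects how many verification passes are needed, never the output itself.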

Related papers