Fugu-MT Paper Translation (Abstract): INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper overview: INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

arxiv url: http://arxiv.org/abs/2510.25602v1
Date: Wed, 29 Oct 2025 15:11:53 GMT
Status: Translation complete
System update date: 2025-10-30 15:50:45.781619
Title: INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
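Title (reference translation): INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats
Authors: Mengzhao Chen, Meng Wu, Hui Jin, Zhihang Yuan, Jing Liu, Chaoyi Zhang, Yunshui Li, Jie Huang, Jin Ma, Zeyue Xue, Zhiheng Liu, Xingyan Bin, Ping Luo
Abstract summary: Modern AI hardware, such as Nvidia's Blackwell architecture, is increasingly embracing low-precision floating-point (FP) formats. This paper systematically investigates the trade-offs between FP and integer (INT) formats. While FP excels in coarse-grained quantization, the comparison at fine-grained (block-wise) levels is more nuanced.
Reference score (independently computed attention metric): 51.72056104795248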
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Modern AI hardware, such as Nvidia's Blackwell architecture, is increasingly embracing low-precision floating-point (FP) formats to handle the pervasive activation outliers in Large Language Models (LLMs). Despite this industry trend, a unified comparison of FP and integer (INT) quantization across varying granularities has been missing, leaving algorithm and hardware co-design without clear guidance. This paper fills that gap by systematically investigating the trade-offs between FP and INT formats. We reveal a critical performance crossover: while FP excels in coarse-grained quantization, the comparison at fine-grained (block-wise) levels is more nuanced. Our comprehensive comparison demonstrates that for popular 8-bit fine-grained formats (e.g., MX with block size 32), MXINT8 is superior to its FP counterpart in both algorithmic accuracy and hardware efficiency. However, for 4-bit formats, FP (e.g., MXFP4, NVFP4) often holds an accuracy advantage, though we show that NVINT4 can surpass NVFP4 when outlier-mitigation techniques like Hadamard rotation are applied. We also introduce a symmetric clipping method that resolves gradient bias in fine-grained low-bit INT training, enabling nearly lossless performance for MXINT8 training. These findings challenge the current hardware trajectory, demonstrating that a one-size-fits-all FP approach is suboptimal and advocating that fine-grained INT formats, particularly MXINT8, offer a better balance of accuracy, power, and efficiency for future AI accelerators.
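Abstract (reference translation): Modern AI hardware, such as Nvidia's Blackwell architecture, is increasingly embracing low-precision floating-point (FP) formats to handle the pervasive activation outliers in Large Language Models (LLMs). Despite this industry trend, a unified comparison of FP and integer (INT) quantization across varying granularities has been missing, leaving algorithm and hardware co-design without clear guidance. This paper fills that gap by systematically investigating the trade-offs between FP and INT formats. While FP excels in coarse-grained quantization, the comparison at fine-grained (block-wise) levels is more nuanced. Our comprehensive comparison shows that for popular 8-bit fine-grained formats (e.g., MX with block size 32), MXINT8 is superior to its FP counterpart in both algorithmic accuracy and hardware efficiency. However, for 4-bit formats, FP (e.g., MXFP4, NVFP4) often holds an accuracy advantage. We also propose a symmetric clipping method that resolves gradient bias in fine-grained low-bit INT training, enabling nearly lossless performance for MXINT8 training. These findings challenge the current hardware trajectory, showing that a one-size-fits-all FP approach is suboptimal and advocating that fine-grained INT formats, particularly MXINT8, offer a better balance of accuracy, power, and efficiency for future AI accelerators.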
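
Illustrative note: the abstract's contrast between coarse-grained and fine-grained (block-wise) quantization can be sketched in a few lines of Python. This is a simplified sketch under stated assumptions, not the paper's implementation: each block shares one power-of-two scale, elements are rounded to integers in the symmetric range [-127, 127], and the helper names `quantize_dequantize` and `mse` are hypothetical. Details of the actual MXINT8 encoding are not modeled.

```python
import math


def quantize_dequantize(values, block_size=32, qmax=127):
    """Quantize to symmetric INT8 per block, then map back to floats.

    Assumption: one shared power-of-two scale per block, chosen as the
    smallest power of two that keeps every element within [-qmax, qmax].
    """
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        amax = max(abs(x) for x in block)
        if amax == 0.0:
            out.extend(block)  # an all-zero block needs no scaling
            continue
        scale = 2.0 ** math.ceil(math.log2(amax / qmax))
        for x in block:
            q = round(x / scale)
            q = max(-qmax, min(qmax, q))  # symmetric clipping range
            out.append(q * scale)
    return out


def mse(reference, approx):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(reference, approx)) / len(reference)


# 64 small values plus a single large outlier in the first block.
vals = [0.01 * ((i % 7) - 3) for i in range(64)]
vals[5] = 100.0

coarse = quantize_dequantize(vals, block_size=64)  # one scale for everything
fine = quantize_dequantize(vals, block_size=32)    # one scale per 32 elements

print("coarse MSE:", mse(vals, coarse))
print("fine MSE:  ", mse(vals, fine))
```

With a single scale, the outlier forces a large step size on every element, so the small values collapse to zero. With blocks of 32, only the block containing the outlier pays that cost, which is why integer formats become more competitive at fine granularity.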

Related papers