Fugu-MT Paper Translation (Abstract): ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

Paper summary: ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity

arxiv url: http://arxiv.org/abs/2605.03667v1
Date: Tue, 05 May 2026 12:04:51 GMT
Status: Translation complete
Last updated in system: 2026-05-06 19:35:43.920793
Title: ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity
Title (reference translation): ELAS: Efficient Pre-Training of Low-Rank Large Language Models via 2:4 Activation Sparsity
Authors: Jiaxi Li, Lu Yin, Li Shen, Jinjin Xu, Yuhui Liu, Wenwu Wang, Shiwei Liu, Xilu Wang
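Abstract summary: ELAS: efficient pre-training of low-rank LLMs via 2:4 activation sparsity. This paper proposes a method for efficiently pre-training low-rank LLMs using 2:4 activation sparsity.
Reference score (independently computed attention score): 30.15914091924631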
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have achieved remarkable capabilities, but their immense computational demands during training remain a critical bottleneck for widespread adoption. Low-rank training has received attention in recent years due to its ability to significantly reduce training memory usage. Meanwhile, applying 2:4 structured sparsity to weights and activations to leverage NVIDIA GPU support for 2:4 structured sparse format has become a promising direction. However, existing low-rank methods often leave activation matrices in full-rank, which dominates memory consumption and limits throughput during large-batch training. Furthermore, directly applying sparsity to weights often leads to non-negligible performance degradation. To achieve efficient pre-training of LLMs, this paper proposes ELAS: Efficient pre-training of Low-rank LLMs via 2:4 Activation Sparsity, a novel framework for low-rank models via 2:4 activation sparsity. ELAS applies squared ReLU activation functions to the feed-forward networks in low-rank models and implements 2:4 structured sparsity on the activations after the squared ReLU operation. We evaluated ELAS through pre-training experiments on LLaMA models ranging from 60M to 1B parameters. The results demonstrate that ELAS maintains performance with minimal degradation after applying 2:4 activation sparsity, while achieving training and inference acceleration. Moreover, ELAS reduces activation memory overhead, particularly with large batch sizes. Code is available at ELAS Repo.
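Abstract (reference translation): Large language models (LLMs) have achieved remarkable capabilities, but the enormous computational demands of training remain a major bottleneck to their widespread adoption. Low-rank training has attracted attention in recent years because it can substantially reduce training memory usage. Meanwhile, applying 2:4 structured sparsity to weights and activations, in order to exploit NVIDIA GPU support for the 2:4 structured sparse format, has become a promising direction. However, existing low-rank methods often leave the activation matrices at full rank, which dominates memory consumption and limits throughput in large-batch training. In addition, applying sparsity directly to the weights often causes non-negligible performance degradation. This paper proposes ELAS (Efficient pre-training of Low-rank LLMs via 2:4 Activation Sparsity), a new framework for low-rank models based on 2:4 activation sparsity. ELAS applies the squared ReLU activation function to the feed-forward networks of low-rank models and enforces 2:4 structured sparsity on the activations after the squared ReLU operation. ELAS was evaluated in pre-training experiments on LLaMA models ranging from 60M to 1B parameters. The results show that ELAS maintains performance with minimal degradation after 2:4 activation sparsity is applied, while accelerating both training and inference. ELAS also reduces activation memory overhead, particularly at large batch sizes. Code is available at ELAS Repo.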
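
The abstract describes two steps inside the feed-forward network: a squared ReLU activation, then 2:4 structured sparsity on the result (at most two non-zero values in every group of four consecutive elements). The NumPy sketch below illustrates that idea only and is not the authors' implementation. The function names, the magnitude-based choice of which two values to keep, and the two-factor low-rank layout (`a_in @ b_in`, `a_out @ b_out`) are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def squared_relu(x):
    """Squared ReLU: max(x, 0) ** 2, applied elementwise."""
    return np.maximum(x, 0.0) ** 2


def sparsify_2_4(x):
    """Zero the two smallest-magnitude entries in each group of four.

    Groups are formed from consecutive elements along the last axis.
    Assumption: selection is by magnitude; the abstract does not say
    how ELAS picks the surviving entries.
    """
    if x.shape[-1] % 4 != 0:
        raise ValueError("last dimension must be a multiple of 4")
    groups = x.reshape(-1, 4)
    # Column indices of the two smallest magnitudes in every group.
    drop = np.argsort(np.abs(groups), axis=1, kind="stable")[:, :2]
    mask = np.ones(groups.shape, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return (groups * mask).reshape(x.shape)


def low_rank_ffn(x, a_in, b_in, a_out, b_out):
    """Hypothetical low-rank feed-forward block with sparse activations.

    Each projection is factored into two thin matrices (an assumed
    layout): x -> (x @ a_in) @ b_in -> squared ReLU -> 2:4 -> output.
    """
    hidden = squared_relu((x @ a_in) @ b_in)
    hidden = sparsify_2_4(hidden)
    return (hidden @ a_out) @ b_out


if __name__ == "__main__":
    x = np.array([[-1.0, 2.0, 3.0, 0.5, 1.0, -2.0, 0.1, 4.0]])
    print(sparsify_2_4(squared_relu(x)))
```

Because squared ReLU already maps every negative input to zero, many groups may need little or no additional pruning to satisfy the 2:4 pattern, which fits the abstract's claim of minimal degradation. This sketch does not model the speed or memory gains, which depend on GPU support for the 2:4 sparse format.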

Related papers