Fugu-MT Paper Translation (Abstract): Exploring the Limits of Pruning: Task-Specific Neurons, Model Collapse, and Recovery in Task-Specific Large Language Models

Paper abstract: Exploring the Limits of Pruning: Task-Specific Neurons, Model Collapse, and Recovery in Task-Specific Large Language Models

arxiv url: http://arxiv.org/abs/2604.27115v1
Date: Wed, 29 Apr 2026 19:08:15 GMT
Status: Translation complete
Last updated in system: 2026-05-01 16:31:53.765916
Title: Exploring the Limits of Pruning: Task-Specific Neurons, Model Collapse, and Recovery in Task-Specific Large Language Models
Authors: M. K. Khalidi Siam, Md. Tausif-Ul-Islam, Md. Reshad Romim Khan, Mohammed Ali Hossain, Mushfiqul Amin, Labib Hasan Khan, Niloy Farhan, Farig Sadeque
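Abstract summary: We provide empirical evidence for the existence and importance of task-specific neurons in language models. We propose an activation-based selectivity metric that identifies neurons with low contribution to the target task and prunes them while preserving target-task accuracy. We also observe consistent reductions in parameters and runtime VRAM usage, along with improved inference throughput, as pruning increases.
Reference score (independently computed attention score): 0.259202171398714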
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neuron pruning is widely used to reduce the computational cost and parameter footprint of large language models, yet it remains unclear whether neurons in task-specific models contribute uniformly to task performance. In this work, we provide empirical evidence for the existence and importance of task-specific neurons through a systematic pruning study on language models specialized for mathematical reasoning and code generation. We introduce an activation-based selectivity metric to identify neurons with low contribution to the target task and prune them while preserving target-task accuracy, and compare selective pruning with random pruning. Selective pruning consistently outperforms random pruning, indicating that activation-based selectivity provides a systematic advantage. Reverse pruning experiments further show that removing a small subset of highly task-specific neurons (~10%) causes complete performance collapse, suggesting that task-specific neurons exist and that critical task information is concentrated in a small portion of the network. In contrast, selective pruning of less critical neurons (~30-35%) reduces accuracy but still preserves significant performance. We also observe consistent reductions in parameters and runtime VRAM usage, along with improved inference throughput, as pruning increases. Experiments on both 1.5B and 7B models reveal a robustness threshold around 15-20% pruning, beyond which accuracy loss and generation failures increase sharply. Fine-tuning substantially recovers performance across pruning levels, particularly for aggressively pruned models. These findings provide empirical evidence of neuron specialization in task-specific language models and offer insights into pruning robustness, model redundancy, and post-pruning recoverability.
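The abstract names three pruning modes (selective, random, and reverse) but does not define the selectivity metric itself. The sketch below illustrates how such a comparison could be set up. Everything in it is an assumption for illustration only: the score (a neuron's share of mean absolute activation that comes from target-task inputs), the function names `selectivity_scores`, `keep_mask`, and `prune_mlp`, and the weight layout are hypothetical and are not taken from the paper.

```python
import random
from statistics import fmean


def selectivity_scores(target_acts, general_acts, eps=1e-8):
    """Hypothetical selectivity score per neuron (not the paper's definition).

    Each argument is a list of samples; each sample is a list holding one
    activation value per neuron. The score is the fraction of a neuron's mean
    absolute activation that comes from target-task inputs, in [0, 1).
    """
    n_neurons = len(target_acts[0])
    scores = []
    for j in range(n_neurons):
        t = fmean(abs(row[j]) for row in target_acts)
        g = fmean(abs(row[j]) for row in general_acts)
        scores.append(t / (t + g + eps))
    return scores


def keep_mask(scores, fraction, mode="selective", seed=0):
    """Return a list of booleans: True means the neuron is kept.

    selective: drop the lowest-scoring neurons (low target-task contribution)
    reverse:   drop the highest-scoring neurons (most task-specific)
    random:    drop a uniformly random subset of the same size
    """
    n = len(scores)
    k = int(round(fraction * n))
    order = sorted(range(n), key=lambda j: scores[j])  # ascending by score
    if mode == "selective":
        drop = order[:k]
    elif mode == "reverse":
        drop = order[n - k:]
    elif mode == "random":
        drop = random.Random(seed).sample(range(n), k)
    else:
        raise ValueError(f"unknown mode: {mode}")
    dropped = set(drop)
    return [j not in dropped for j in range(n)]


def prune_mlp(w_in, w_out, keep):
    """Remove pruned neurons from a two-layer MLP block.

    Assumed layout: w_in is hidden x d_model (one row per neuron),
    w_out is d_model x hidden (one column per neuron).
    """
    new_in = [row for row, k in zip(w_in, keep) if k]
    new_out = [[v for v, k in zip(row, keep) if k] for row in w_out]
    return new_in, new_out
```

Under these assumptions, calling `keep_mask(scores, 0.2)` would correspond to selective pruning at the 20% level, while `keep_mask(scores, 0.1, mode="reverse")` would mimic the reverse-pruning setting in which the most task-specific neurons are removed first.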
