Fugu-MT Paper Translation (Abstract): Breaking Expert Knowledge Limits: Self-Pruning for Large Language Models

Paper overview: Breaking Expert Knowledge Limits: Self-Pruning for Large Language Models

arxiv url: http://arxiv.org/abs/2511.15390v1
Date: Wed, 19 Nov 2025 12:38:21 GMT
Status: Translation complete
Last updated in system: 2025-11-20 15:51:28.800981
Title: Breaking Expert Knowledge Limits: Self-Pruning for Large Language Models
Authors: Haidong Kang, Lihong Lin, Enneng Yang, Hongning Dai, Hao Wang,
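Abstract summary: Large language models (LLMs) have achieved remarkable performance on a wide range of tasks, but their massive size hinders real-world deployment. Existing pruning methods rely heavily on manually designed pruning algorithms, which incurs huge labor costs and requires expert knowledge. The authors propose a new pruning method called AutoPrune, which overcomes expert knowledge limits by leveraging LLMs to design optimal pruning algorithms for themselves without any expert knowledge.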
Reference score (independently computed attention score): 21.22854931342453
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) have achieved remarkable performance on a wide range of tasks, but their massive size hinders real-world deployment. Existing pruning methods tailored for LLMs (e.g., Wanda) rely heavily on manually designed pruning algorithms, which incurs huge labor costs and requires expert knowledge. Furthermore, we are the first to identify the serious outlier value issue behind the dramatic performance degradation that uniform sparsity causes under high pruning ratios, raising an additional concern about how to design adaptive pruning sparsity suited to LLMs. Can LLMs prune themselves? In this work, we answer affirmatively by proposing a novel pruning method called AutoPrune, which first overcomes expert knowledge limits by leveraging LLMs to automatically design optimal pruning algorithms for themselves without any expert knowledge. Specifically, to mitigate the black-box nature of LLMs, we propose a Graph-driven Chain-of-Thought (GCoT) to optimize prompts, significantly enhancing the reasoning process in learning the pruning algorithm and enabling us to generate pruning algorithms with superior performance and interpretability in the next generation. Finally, grounded in insights into the outlier value issue, we introduce Skew-aware Dynamic Sparsity Allocation (SDSA) to overcome it, mitigating performance degradation under high pruning ratios. We conduct extensive experiments on mainstream LLM benchmarks, demonstrating the superiority of AutoPrune, which consistently outperforms state-of-the-art competitors. The code is available at: https://anonymous.4open.science/r/AutoPrune.
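
For readers unfamiliar with the baseline the abstract mentions, the following is a minimal sketch of a hand-designed pruning rule in the style of Wanda combined with uniform sparsity. It assumes the commonly described Wanda score, the weight magnitude multiplied by the L2 norm of the matching input feature over a calibration batch, and it removes the same fraction of weights from every output row. The function names `wanda_scores` and `prune_uniform` and the toy matrices are hypothetical and are not taken from the AutoPrune code. The sketch does not implement GCoT or SDSA, whose details the abstract does not give.

```python
# Illustrative sketch only: a Wanda-style score with a uniform sparsity ratio.
# Function names and toy data are hypothetical, not from the AutoPrune release.
import math


def wanda_scores(weights, activations):
    """Score each weight as |W[i][j]| * ||X[:, j]||_2 over the calibration inputs."""
    n_in = len(weights[0])
    # L2 norm of each input feature (column j) across all calibration samples.
    norms = [math.sqrt(sum(x[j] ** 2 for x in activations)) for j in range(n_in)]
    return [[abs(w) * norms[j] for j, w in enumerate(row)] for row in weights]


def prune_uniform(weights, activations, sparsity):
    """Zero the lowest-scoring fraction `sparsity` of weights in every output row."""
    scores = wanda_scores(weights, activations)
    pruned = []
    for row, row_scores in zip(weights, scores):
        k = int(len(row) * sparsity)  # same number of weights removed from each row
        drop = set(sorted(range(len(row)), key=lambda j: row_scores[j])[:k])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned


# Toy layer: 2 output rows, 4 input features, 2 calibration samples.
W = [[0.5, -2.0, 0.1, 1.0],
     [3.0, 0.2, -0.4, 0.05]]
X = [[1.0, 0.1, 4.0, 0.5],
     [2.0, 0.2, 3.0, 0.5]]

print(prune_uniform(W, X, sparsity=0.5))
# → [[0.5, 0.0, 0.0, 1.0], [3.0, 0.0, -0.4, 0.0]]
```

In the first row the largest-magnitude weight (-2.0) is dropped because its input feature has a small activation norm, which is the kind of hand-crafted design decision the abstract says AutoPrune aims to automate. The fixed `sparsity` argument applied to every row corresponds to the uniform sparsity that the abstract identifies as problematic at high pruning ratios.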


Related papers