Fugu-MT Paper Translation (Summary): Hardwired-Neurons Language Processing Units as General-Purpose Cognitive Substrates

Paper summary: Hardwired-Neurons Language Processing Units as General-Purpose Cognitive Substrates

arxiv url: http://arxiv.org/abs/2508.16151v1
Date: Fri, 22 Aug 2025 07:20:19 GMT
Status: Translation complete
Last updated in system: 2025-08-25 16:42:36.287766
Title: Hardwired-Neurons Language Processing Units as General-Purpose Cognitive Substrates
Title (reference translation): Hardwired-Neuron Language Processing Units as General-Purpose Cognitive Substrates
Authors: Yang Liu, Yi Chen, Yongwei Zhao, Yifan Hao, Zifu Zheng, Weihao Kong, Zhangmai Li, Dongchen Jiang, Ruiyang Xia, Zhihong Ma, Zisheng Liu, Zhaoyong Wan, Yunqi Lu, Ximing Liu, Hongrui Guo, Zhihao Yang, Zhe Wang, Tianrui Ma, Mo Zou, Rui Zhang, Ling Li, Xing Hu, Zidong Du, Zhiwei Xu, Qi Guo, Tianshi Chen, Yunji Chen,
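Abstract summary: The Hardwired-Neurons Language Processing Unit (HNLPU) uses Metal-Embedding, which embeds weight parameters into the 3D topology of metal wires. HNLPU achieved 8.57x cost-effectiveness and a 230x carbon footprint reduction compared to H100 clusters.
Reference score (independently computed attention score): 38.25739111656049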
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The rapid advancement of Large Language Models (LLMs) has established language as a core general-purpose cognitive substrate, driving the demand for specialized Language Processing Units (LPUs) tailored for LLM inference. To overcome the growing energy consumption of LLM inference systems, this paper proposes a Hardwired-Neurons Language Processing Unit (HNLPU), which physically hardwires LLM weight parameters into the computational fabric, achieving several orders of magnitude computational efficiency improvement by extreme specialization. However, a significant challenge still lies in the scale of modern LLMs. An ideal estimation on hardwiring gpt-oss 120 B requires fabricating at least 6 billion dollars of photomask sets, rendering the straightforward solution economically impractical. Addressing this challenge, we propose the novel Metal-Embedding methodology. Instead of embedding weights in a 2D grid of silicon device cells, Metal-Embedding embeds weight parameters into the 3D topology of metal wires. This brings two benefits: (1) a 15x increase in density, and (2) 60 out of 70 layers of photomasks are made homogeneous across chips, including all EUV photomasks. In total, Metal-Embedding reduced the photomask cost by 112x, bringing the Non-Recurring Engineering (NRE) cost of HNLPU into an economically viable range. Experimental results show that HNLPU achieved 249,960 tokens/s (5,555x/85x of GPU/WSE), 36 tokens/J (1,047x/283x of GPU/WSE), 13,232 mm² total die area (29% inscribed rectangular area in a 300 mm wafer), $184M estimated NRE at 5 nm technology. Analysis shows that HNLPU achieved 8.57x cost-effectiveness and 230x carbon footprint reduction compared to H100 clusters, under an annual weight updating assumption.
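Abstract (reference translation): The rapid advancement of large language models (LLMs) has established language as a core general-purpose cognitive substrate, driving demand for specialized Language Processing Units (LPUs) suited to LLM inference. This paper proposes the Hardwired-Neurons Language Processing Unit (HNLPU), which physically hardwires LLM weight parameters into the computational fabric and achieves a computational efficiency improvement of several orders of magnitude through extreme specialization. However, a significant challenge remains in the scale of modern LLMs. An ideal estimate for hardwiring gpt-oss 120B requires fabricating at least 6 billion dollars of photomask sets. To address this challenge, we propose the novel Metal-Embedding methodology. Instead of embedding weights in a 2D grid of silicon device cells, Metal-Embedding embeds weight parameters into the 3D topology of metal wires. This yields (1) a 15x increase in density, and (2) 60 of the 70 photomask layers being made homogeneous across chips, including all EUV photomasks. Metal-Embedding reduced the photomask cost by 112x, bringing the Non-Recurring Engineering (NRE) cost of HNLPU into an economically viable range. HNLPU achieved 249,960 tokens/s (5,555x/85x of GPU/WSE), 36 tokens/J (1,047x/283x of GPU/WSE), a total die area of 13,232 mm² (29% of the inscribed rectangular area of a 300 mm wafer), and an estimated NRE of $184M at 5 nm technology. HNLPU achieved 8.57x cost-effectiveness and a 230x carbon footprint reduction compared to H100 clusters.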
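
The headline figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is a back-of-envelope reading and is not taken from the paper. It makes three assumptions: the "inscribed rectangular area" is the largest square that fits in a 300 mm circle, the GPU and WSE baselines can be recovered by dividing the HNLPU figures by the quoted ratios, and the throughput and energy-efficiency figures describe the same operating point. All variable and function names are illustrative.

```python
import math

# Figures quoted in the abstract.
HNLPU_TOKENS_PER_S = 249_960
HNLPU_TOKENS_PER_J = 36
SPEEDUP_VS_GPU, SPEEDUP_VS_WSE = 5_555, 85
EFFICIENCY_VS_GPU, EFFICIENCY_VS_WSE = 1_047, 283
TOTAL_DIE_AREA_MM2 = 13_232
WAFER_DIAMETER_MM = 300


def inscribed_square_area(diameter_mm: float) -> float:
    """Area of the largest rectangle (a square) inscribed in a circle."""
    side = diameter_mm / math.sqrt(2)
    return side * side


def implied_baseline(hnlpu_value: float, ratio: float) -> float:
    """Baseline value implied by an 'N x of baseline' ratio (assumption)."""
    return hnlpu_value / ratio


# Share of the inscribed square occupied by the total die area.
area_fraction = TOTAL_DIE_AREA_MM2 / inscribed_square_area(WAFER_DIAMETER_MM)

# (tokens/s) / (tokens/J) = J/s = W, assuming one shared operating point.
implied_power_w = HNLPU_TOKENS_PER_S / HNLPU_TOKENS_PER_J

# Baseline throughput and efficiency implied by the quoted ratios.
gpu_tokens_per_s = implied_baseline(HNLPU_TOKENS_PER_S, SPEEDUP_VS_GPU)
wse_tokens_per_s = implied_baseline(HNLPU_TOKENS_PER_S, SPEEDUP_VS_WSE)
gpu_tokens_per_j = implied_baseline(HNLPU_TOKENS_PER_J, EFFICIENCY_VS_GPU)
wse_tokens_per_j = implied_baseline(HNLPU_TOKENS_PER_J, EFFICIENCY_VS_WSE)

print(f"Die area share of inscribed square: {area_fraction:.1%}")
print(f"Implied HNLPU power: {implied_power_w:.0f} W")
print(f"Implied GPU baseline: {gpu_tokens_per_s:.1f} tokens/s, "
      f"{gpu_tokens_per_j:.4f} tokens/J")
print(f"Implied WSE baseline: {wse_tokens_per_s:.1f} tokens/s, "
      f"{wse_tokens_per_j:.4f} tokens/J")
```

Under these assumptions the die-area share comes out close to the 29% stated in the abstract, and the implied power is a little under 7 kW. The baseline values are only what the quoted ratios imply and may not match the configurations actually measured in the paper.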
