Fugu-MT paper translation (abstract): Switchable Activation Networks

Paper summary: Switchable Activation Networks

arxiv url: http://arxiv.org/abs/2603.06601v1
Date: Tue, 17 Feb 2026 08:14:10 GMT
Status: translation complete
Last updated in system: 2026-03-15 16:38:22.415137
Title: Switchable Activation Networks
Title (reference translation): Switching Activation Networks
Authors: Laha Ale, Ning Zhang, Scott A. King, Pingzhi Fan
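Abstract summary: SWAN (Switchable Activation Networks) is a framework that equips each neural unit with a deterministic, input-dependent binary gate. This dynamic control mechanism allocates computation adaptively, reducing redundancy while preserving accuracy. By reframing efficiency as a problem of learned activation control, SWAN unifies the strengths of sparsity, pruning, and adaptive inference.
Reference score (independently computed attention measure): 25.854474675239842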
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep neural networks, and more recently large-scale generative models such as large language models (LLMs) and large vision-action models (LVAs), achieve remarkable performance across diverse domains, yet their prohibitive computational cost hinders deployment in resource-constrained environments. Existing efficiency techniques offer only partial remedies: dropout improves regularization during training but leaves inference unchanged, while pruning and low-rank factorization compress models post hoc into static forms with limited adaptability. Here we introduce SWAN (Switchable Activation Networks), a framework that equips each neural unit with a deterministic, input-dependent binary gate, enabling the network to learn when a unit should be active or inactive. This dynamic control mechanism allocates computation adaptively, reducing redundancy while preserving accuracy. Unlike traditional pruning, SWAN does not simply shrink networks after training; instead, it learns structured, context-dependent activation patterns that support both efficient dynamic inference and conversion into compact dense models for deployment. By reframing efficiency as a problem of learned activation control, SWAN unifies the strengths of sparsity, pruning, and adaptive inference within a single paradigm. Beyond computational gains, this perspective suggests a more general principle of neural computation, where activation is not fixed but context-dependent, pointing toward sustainable AI, edge intelligence, and future architectures inspired by the adaptability of biological brains.
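Abstract (reference translation): Deep neural networks, and more recently large-scale generative models such as large language models (LLMs) and large vision-action models (LVAs), achieve remarkable performance across a wide range of domains, but their prohibitive computational cost hinders deployment in resource-constrained environments. Dropout improves regularization during training but leaves inference unchanged, while pruning and low-rank factorization compress models after the fact into static forms with limited adaptability. Here we introduce SWAN (Switchable Activation Networks), a framework that equips each neural unit with a deterministic, input-dependent binary gate so that the network can learn when a unit should be active or inactive. This dynamic control mechanism allocates computation adaptively, reducing redundancy while preserving accuracy. Unlike traditional pruning, SWAN does not simply shrink the network after training; it learns structured, context-dependent activation patterns that support both efficient dynamic inference and conversion into compact dense models for deployment. By reframing efficiency as a problem of learned activation control, SWAN unifies the strengths of sparsity, pruning, and adaptive inference within a single paradigm. Beyond computational gains, this perspective suggests a more general principle of neural computation in which activation is not fixed but context-dependent, pointing toward sustainable AI, edge intelligence, and future architectures inspired by the adaptability of biological brains.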
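The gating idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the gates are computed or trained, so everything specific here is an assumption made for illustration: the gate is a hard threshold on a linear score of the input, the names `gate_weights`, `gate_bias`, `gated_layer`, and `ever_active` are hypothetical, and training is omitted entirely. The sketch only shows what "deterministic, input-dependent binary gate per unit" could look like at inference time, plus one naive way to find units that a compact dense model might keep.

```python
# Illustrative sketch only; not the authors' implementation.
# ASSUMPTION: each unit's gate is 1 when a linear score of the input is
# positive, else 0. The abstract does not specify the gate's form.

def dot(row, x):
    """Inner product of two equal-length lists."""
    return sum(w * xi for w, xi in zip(row, x))

def gates(gate_weights, gate_bias, x):
    """Deterministic binary gate per unit, recomputed for every input x."""
    return [1 if dot(row, x) + c > 0 else 0
            for row, c in zip(gate_weights, gate_bias)]

def gated_layer(weights, bias, gate_weights, gate_bias, x):
    """ReLU layer that skips the computation of units whose gate is off."""
    g = gates(gate_weights, gate_bias, x)
    out = []
    for row, b, gi in zip(weights, bias, g):
        # An inactive unit contributes 0 and its dot product is never computed.
        out.append(max(0.0, dot(row, x) + b) if gi else 0.0)
    return out, g

def ever_active(gate_weights, gate_bias, inputs):
    """Indices of units whose gate fired on at least one calibration input.

    ASSUMPTION: a naive stand-in for the 'conversion into compact dense
    models' mentioned in the abstract; the paper's procedure may differ.
    """
    used = [0] * len(gate_weights)
    for x in inputs:
        for i, gi in enumerate(gates(gate_weights, gate_bias, x)):
            used[i] |= gi
    return [i for i, u in enumerate(used) if u]

# Toy layer with three units and two inputs (hand-picked values).
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.0]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
gate_bias = [0.0, 0.0, -10.0]

out, g = gated_layer(weights, bias, gate_weights, gate_bias, [2.0, -1.0])
kept = ever_active(gate_weights, gate_bias, [[2.0, -1.0], [-1.0, 3.0]])
print(out, g, kept)
```

In this toy setup the third unit's gate never fires on the calibration inputs, so it is the only unit a static, compacted layer would drop. A hard threshold is not differentiable, so a real training procedure would need some additional mechanism that this sketch does not attempt to reproduce.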


Related papers