Fugu-MT Paper Translation (Abstract): DynamicGate MLP Conditional Computation via Learned Structural Dropout and Input Dependent Gating for Functional Plasticity

Paper overview: DynamicGate MLP Conditional Computation via Learned Structural Dropout and Input Dependent Gating for Functional Plasticity

arxiv url: http://arxiv.org/abs/2603.16367v1
Date: Tue, 17 Mar 2026 10:54:16 GMT
Status: Translation complete
Last updated in system: 2026-03-18 17:42:07.230687
Title: DynamicGate MLP Conditional Computation via Learned Structural Dropout and Input Dependent Gating for Functional Plasticity
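Title (reference translation): DynamicGate MLP conditional computation and input-dependent gating for functional plasticity
Authors: Yong Il Choi
Abstract summary: Dropout is a representative regularization technique that deactivates hidden units during training to mitigate overfitting. Standard inference executes the full network with dense computation, so its goal and mechanism differ from conditional computation. This paper organizes DynamicGate-MLP into a single framework that simultaneously satisfies both the regularization view and the conditional-computation view.
Reference score (independently computed attention measure): 0.0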
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Dropout is a representative regularization technique that stochastically deactivates hidden units during training to mitigate overfitting. In contrast, standard inference executes the full network with dense computation, so its goal and mechanism differ from conditional computation, where the executed operations depend on the input. This paper organizes DynamicGate-MLP into a single framework that simultaneously satisfies both the regularization view and the conditional-computation view. Instead of a random mask, the proposed model learns gates that decide whether to use each unit (or block), suppressing unnecessary computation while implementing sample-dependent execution that concentrates computation on the parts needed for each input. To this end, we define continuous gate probabilities and, at inference time, generate a discrete execution mask from them to select an execution path. Training controls the compute budget via a penalty on expected gate usage and uses a Straight-Through Estimator (STE) to optimize the discrete mask. We evaluate DynamicGate-MLP on MNIST, CIFAR-10, Tiny-ImageNet, Speech Commands, and PBMC3k, and compare it with various MLP baselines and MoE-style variants. Compute efficiency is compared under a consistent criterion using gate activation ratios and a layer-weighted relative MAC metric, rather than wall-clock latency that depends on hardware and backend kernels.
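The gating mechanism described in the abstract (continuous gate probabilities, a discrete execution mask, a penalty on expected gate usage, and a straight-through estimator) can be illustrated with a minimal pure-Python sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption for illustration: the sigmoid parameterization, the 0.5 threshold, the penalty weight `lam`, and all function names are hypothetical.

```python
import math

def gate_probability(logit):
    """Continuous gate probability in (0, 1). The sigmoid is an assumption."""
    return 1.0 / (1.0 + math.exp(-logit))

def hard_mask(probs, threshold=0.5):
    """Discrete execution mask derived from the probabilities.
    The fixed threshold is a hypothetical choice."""
    return [1.0 if p >= threshold else 0.0 for p in probs]

def gated_forward(hidden, logits, threshold=0.5):
    """Multiply each hidden activation by its hard gate; gated-off units output 0."""
    probs = [gate_probability(z) for z in logits]
    mask = hard_mask(probs, threshold)
    out = [h * m for h, m in zip(hidden, mask)]
    return out, mask, probs

def ste_logit_grads(hidden, probs, upstream):
    """Straight-through backward pass: the non-differentiable threshold is
    treated as the identity, so d(out_i)/d(logit_i) = hidden_i * p_i * (1 - p_i)."""
    return [g * h * p * (1.0 - p) for g, h, p in zip(upstream, hidden, probs)]

def usage_penalty(probs, lam=0.01):
    """Budget term: lam times the mean expected gate usage (assumed form)."""
    return lam * sum(probs) / len(probs)

# Example with made-up numbers: four hidden units and their gate logits.
hidden = [0.8, -1.2, 0.5, 2.0]
logits = [2.0, -3.0, 0.3, -0.5]
out, mask, probs = gated_forward(hidden, logits)
grads = ste_logit_grads(hidden, probs, upstream=[1.0] * len(hidden))
penalty = usage_penalty(probs)
```

In this sketch the forward pass uses the hard mask while the backward pass uses the sigmoid's slope, which is the usual way a straight-through estimator lets gradients reach gate parameters. In an input-dependent setting the logits would themselves be computed from the input by a small learned function, which is omitted here.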
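Abstract (reference translation): Dropout is a representative regularization technique that stochastically deactivates hidden units during training to mitigate overfitting. In contrast, standard inference executes the full network with dense computation, so its purpose and mechanism differ from conditional computation, in which the executed operations depend on the input. This paper organizes DynamicGate-MLP into a single framework that simultaneously satisfies both the regularization view and the conditional-computation view. Instead of a random mask, the proposed model learns gates that decide whether to use each unit (or block), suppressing unnecessary computation while implementing sample-dependent execution that concentrates on the parts needed for each input. To this end, continuous gate probabilities are defined, and at inference time a discrete execution mask is generated to select an execution path. Training controls the compute budget through a penalty on expected gate usage and uses a Straight-Through Estimator (STE) to optimize the discrete mask. DynamicGate-MLP is evaluated on MNIST, CIFAR-10, Tiny-ImageNet, Speech Commands, and PBMC3k, and compared with various MLP baselines and MoE-style variants. Compute efficiency is compared under a consistent criterion using gate activation ratios and a layer-weighted relative MAC metric, rather than wall-clock latency, which depends on hardware and backend kernels.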
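The abstract names two hardware-independent efficiency measures, gate activation ratios and a layer-weighted relative MAC metric, without defining them. The sketch below shows one plausible reading, assumed here rather than taken from the paper: each dense layer's multiply-accumulate count (input width times output width) is scaled by the fraction of its gates that are active, and the total is divided by the ungated total. The function names and the layer sizes are hypothetical.

```python
def dense_macs(in_dim, out_dim):
    """Multiply-accumulate count of one dense layer for a single input."""
    return in_dim * out_dim

def gate_activation_ratio(mask):
    """Fraction of gates that are on in a 0/1 execution mask."""
    return sum(mask) / len(mask)

def relative_macs(layer_dims, activation_ratios):
    """Gated MACs divided by dense MACs, weighting each layer by its size.

    Simplification (an assumption): gating a layer's output units scales only
    that layer's MACs; savings in the following layer are not counted.
    """
    dense = [dense_macs(i, o) for i, o in layer_dims]
    gated = [m * r for m, r in zip(dense, activation_ratios)]
    return sum(gated) / sum(dense)

# Example with made-up sizes: a 784-256-128 MLP with 50% and 25% of gates active.
layer_dims = [(784, 256), (256, 128)]
ratios = [0.5, 0.25]
score = relative_macs(layer_dims, ratios)
```

Weighting by layer size means that a large first layer dominates the score, so the result differs from a plain average of the per-layer activation ratios.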

Related papers