Fugu-MT paper translation (abstract): The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

Paper overview: The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

arxiv url: http://arxiv.org/abs/2006.14450v1
Date: Thu, 25 Jun 2020 14:41:53 GMT
Status: translation complete
Last updated in system: 2022-11-17 03:39:58.809894
Title: The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models
Authors: Chao Ma, Lei Wu, Weinan E
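Abstract summary: The gradient descent (GD) algorithm for training two-layer neural network models is studied. Two phases are identified in the dynamic behavior of GD in the under-parametrized regime. The quenching-activation process appears to provide a clear mechanism for "implicit regularization".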
Reference score (independently computed attention score): 12.865834066050427
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A numerical and phenomenological study of the gradient descent (GD) algorithm for training two-layer neural network models is carried out for different parameter regimes when the target function can be accurately approximated by a relatively small number of neurons. It is found that for Xavier-like initialization, there are two distinctive phases in the dynamic behavior of GD in the under-parametrized regime: An early phase in which the GD dynamics follows closely that of the corresponding random feature model and the neurons are effectively quenched, followed by a late phase in which the neurons are divided into two groups: a group of a few "activated" neurons that dominate the dynamics and a group of background (or "quenched") neurons that support the continued activation and deactivation process. This neural network-like behavior is continued into the mildly over-parametrized regime, where it undergoes a transition to a random feature-like behavior. The quenching-activation process seems to provide a clear mechanism for "implicit regularization". This is qualitatively different from the dynamics associated with the "mean-field" scaling where all neurons participate equally and there does not appear to be qualitative changes when the network parameters are changed.
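As a rough illustration of the setting the abstract describes, the following NumPy sketch trains a two-layer ReLU network with full-batch gradient descent under either a Xavier-like or a mean-field scaling. The single-neuron teacher, the data distribution, the width, the learning rate, and the `train_two_layer` helper are all assumptions made for illustration; none of them is taken from the paper.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def train_two_layer(m, scaling="xavier", n=64, d=5, steps=500, lr=0.1, seed=0):
    """Full-batch GD on f(x) = c * sum_k a_k * relu(w_k . x) with squared loss.

    Hypothetical setup: inputs on the unit sphere, single-neuron ReLU teacher.
    Returns the loss history and a per-neuron strength |a_k| * ||w_k||.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    w_star = np.zeros(d)
    w_star[0] = 1.0
    y = relu(X @ w_star)  # target representable by one neuron

    W = rng.normal(size=(m, d)) / np.sqrt(d)
    if scaling == "xavier":
        # Xavier-like: output weights of size ~ 1/sqrt(m), no extra prefactor.
        a = rng.normal(size=m) / np.sqrt(m)
        c, step = 1.0, lr
    elif scaling == "mean_field":
        # Mean-field: O(1) weights and a 1/m prefactor; the step is
        # multiplied by m (an assumption) so per-neuron updates stay O(1).
        a = rng.normal(size=m)
        c, step = 1.0 / m, lr * m
    else:
        raise ValueError(f"unknown scaling: {scaling}")

    losses = np.empty(steps)
    for t in range(steps):
        pre = X @ W.T                      # (n, m) pre-activations
        h = relu(pre)
        resid = c * (h @ a) - y            # (n,) residuals
        losses[t] = 0.5 * np.mean(resid ** 2)
        grad_a = c * (h.T @ resid) / n     # (m,)
        # dL/dw_k = (c / n) * sum_i resid_i * a_k * 1[pre_ik > 0] * x_i
        grad_W = c * ((resid[:, None] * (pre > 0) * a).T @ X) / n  # (m, d)
        a -= step * grad_a
        W -= step * grad_W

    strength = np.abs(a) * np.linalg.norm(W, axis=1)
    return losses, strength


if __name__ == "__main__":
    for scaling in ("xavier", "mean_field"):
        losses, strength = train_two_layer(m=20, scaling=scaling)
        top = np.sort(strength)[::-1][:3]
        print(scaling, "final loss:", losses[-1], "top strengths:", top)
```

Comparing the returned `strength` vectors between the two scalings is one informal way to look for a small set of dominant neurons. Whether this toy setup reproduces the behavior reported in the paper is not guaranteed.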

Related papers

Pendulum Model of Spiking Neurons [0.0]
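We propose a biologically inspired model of spiking neurons based on the dynamics of a damped, driven pendulum. We analyze the dynamics of a single neuron and extend the model to multi-neuron layers with a Spike-Timing Dependent Plasticity (STDP) learning rule.
Paper reference translation (metadata) (2025-07-29T18:21:51Z)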
Langevin Flows for Modeling Neural Latent Dynamics [81.81271685018284]
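We introduce LangevinFlow, a sequential variational autoencoder in which the time evolution of the latent variables is governed by the underdamped Langevin equation. Our approach incorporates physical priors such as inertia, damping, a learned potential function, and forces to represent both autonomous and non-autonomous processes in neural networks. The method outperforms state-of-the-art baselines on synthetic neural populations generated by a Lorenz attractor.
Paper reference translation (metadata) (2025-07-15T17:57:48Z)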
Allostatic Control of Persistent States in Spiking Neural Networks for perception and computation [79.16635054977068]
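We propose a novel model for updating perceptual beliefs about the environment by extending the concept of allostasis to the control of internal representations. We focus on an application to numerical cognition, in which bumps of activity in an attractor network serve as a spatial-numerical representation.
Paper reference translation (metadata) (2025-03-20T12:28:08Z)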
Neuronal and structural differentiation in the emergence of abstract rules in hierarchically modulated spiking neural networks [20.58066918526133]
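The internal mechanisms underlying rule abstraction remain elusive. This work introduces the hierarchically modulated recurrent spiking neural network (HM-RSNN), which can tune intrinsic neuronal properties. We model four cognitive tasks with the HM-RSNN and observe differentiation in rule abstraction at both the network level and the neuron level.
Paper reference translation (metadata) (2025-01-24T14:45:03Z)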
Reconstruction of neuromorphic dynamics from a single scalar time series using variational autoencoder and neural network map [0.0]
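We consider a model of a physiological neuron based on the Hodgkin-Huxley formalism. A time series of just one of its variables is shown to be sufficient to train a neural network that can operate as a discrete-time dynamical system.
Paper reference translation (metadata) (2024-11-11T15:15:55Z)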
Confidence Regulation Neurons in Language Models [91.90337752432075]
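This study investigates the mechanisms by which large language models represent and regulate uncertainty in next-token prediction. Entropy neurons are characterized by an unusually high weight norm and influence the final layer normalization (LayerNorm) scale to effectively scale down the logits. Token frequency neurons, described here for the first time, boost or suppress each token's logit in proportion to its log frequency, thereby shifting the output distribution toward or away from the unigram distribution.
Paper reference translation (metadata) (2024-06-24T01:31:03Z)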
Connecting NTK and NNGP: A Unified Theoretical Framework for Wide Neural Network Learning Dynamics [6.349503549199403]
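We provide a comprehensive framework for the learning process of deep wide neural networks. By characterizing the diffusive phase, our work sheds light on representational drift in the brain.
Paper reference translation (metadata) (2023-09-08T18:00:01Z)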
STNDT: Modeling Neural Population Activity with a Spatiotemporal Transformer [19.329190789275565]
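We introduce the SpatioTemporal Neural Data Transformer (STNDT), an NDT-based architecture that explicitly models the responses of individual neurons. We show that the model achieves state-of-the-art performance at the ensemble level in estimating neural activity across four neural datasets.
Paper reference translation (metadata) (2022-06-09T18:54:23Z)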
Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
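Small neural networks with a limited number of trainable parameters can be resource-efficient candidates for many simple tasks. We explore the diversity of neurons within the hidden layer during the learning process and analyze how that diversity affects the model's predictions.
Paper reference translation (metadata) (2021-09-20T15:12:16Z)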
Continuous Learning and Adaptation with Membrane Potential and Activation Threshold Homeostasis [91.3755431537592]
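This paper proposes the Membrane Potential and Activation Threshold Homeostasis (MPATH) neuron model. The model allows neurons to maintain a form of dynamic equilibrium by automatically regulating their activity when presented with input. Experiments demonstrate the model's ability to adapt to, and continually learn from, its input.
Paper reference translation (metadata) (2021-04-22T04:01:32Z)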
Going beyond p-convolutions to learn grayscale morphological operators [64.38361575778237]
We present two new morphological layers based on the same principle as the p-convolution layer.
Paper reference translation (metadata) (2021-02-19T17:22:16Z)
And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
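The presence of a sufficient number of OR-like neurons in a network increases classification brittleness and vulnerability to adversarial attacks. We therefore define AND-like neurons and propose measures to increase their proportion in the network. Experimental results on the MNIST dataset suggest that the approach is a promising direction for further exploration.
Paper reference translation (metadata) (2021-02-15T08:19:05Z)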
Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
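Gradient starvation arises when the cross-entropy loss is minimized by capturing only a subset of the features relevant to the task. This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
Paper reference translation (metadata) (2020-11-18T18:52:08Z)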
Phase diagram for two-layer ReLU neural networks at infinite-width limit [6.380166265263755]
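We draw the phase diagram for two-layer ReLU neural networks at the infinite-width limit. We identify three regimes in the phase diagram: the linear regime, the critical regime, and the condensed regime. In the linear regime, the NN training dynamics are approximately linear, similar to a random feature model, with exponential loss decay. In the condensed regime, we show through experiments that the active neurons condense at several discrete orientations.
Paper reference translation (metadata) (2020-07-15T06:04:35Z)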
Unifying and generalizing models of neural dynamics during decision-making [27.46508483610472]
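We propose a unified framework for modeling neural activity during decision-making tasks. The framework includes the canonical drift-diffusion model and enables extensions such as multi-dimensional accumulators, variable and collapsing boundaries, and discrete jumps.
Paper reference translation (metadata) (2020-01-13T23:57:28Z)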

The related papers list is generated automatically from the titles and abstracts of papers on this site.

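This page shows information on the specified paper.
The operator of this site does not guarantee the quality of the site (including all information and translations) and accepts no responsibility for any consequences arising from its use.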