Fugu-MT Paper Translation (Abstract): Approximate Gaussianity Beyond Initialisation in Neural Networks

Paper summary: Approximate Gaussianity Beyond Initialisation in Neural Networks

arxiv url: http://arxiv.org/abs/2510.05218v1
Date: Mon, 06 Oct 2025 18:00:46 GMT
Status: Translation complete
Last updated in system: 2025-10-08 17:57:07.924157
Title: Approximate Gaussianity Beyond Initialisation in Neural Networks
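Title (reference translation): Approximate Gaussianity Beyond Initialisation in Neural Networks
Authors: Edward Hirst, Sanjaye Ramgoolam
Abstract summary: Ensembles of neural network weight matrices are studied throughout the training process for the MNIST classification problem. The experiments cover the effects of varied initialisation regimes, regularisation, layer depth, and layer width.
Reference score (independently computed attention measure): 0.0954904463032233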
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Ensembles of neural network weight matrices are studied through the training process for the MNIST classification problem, testing the efficacy of matrix models for representing their distributions, under assumptions of Gaussianity and permutation-symmetry. The general 13-parameter permutation invariant Gaussian matrix models are found to be effective models for the correlated Gaussianity in the weight matrices, beyond the range of applicability of the simple Gaussian with independent identically distributed matrix variables, and notably well beyond the initialisation step. The representation theoretic model parameters, and the graph-theoretic characterisation of the permutation invariant matrix observables give an interpretable framework for the best-fit model and for small departures from Gaussianity. Additionally, the Wasserstein distance is calculated for this class of models and used to quantify the movement of the distributions over training. Throughout the work, the effects of varied initialisation regimes, regularisation, layer depth, and layer width are tested for this formalism, identifying limits where particular departures from Gaussianity are enhanced and how more general, yet still highly-interpretable, models can be developed.
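Abstract (reference translation): Ensembles of neural network weight matrices are studied throughout the training process for the MNIST classification problem, testing how effectively matrix models represent their distributions under the assumptions of Gaussianity and permutation symmetry. The general 13-parameter permutation-invariant Gaussian matrix models prove to be effective models of the correlated Gaussianity in the weight matrices, beyond the range of applicability of the simple Gaussian with independent, identically distributed matrix variables, and notably well beyond the initialisation step. The representation-theoretic model parameters and the graph-theoretic characterisation of the permutation-invariant matrix observables give an interpretable framework for the best-fit model and for small departures from Gaussianity. In addition, the Wasserstein distance is calculated for this class of models and used to quantify how the distributions move during training. Throughout the work, the effects of varied initialisation regimes, regularisation, layer depth, and layer width are tested within this formalism, identifying limits in which particular departures from Gaussianity are enhanced and showing how more general, yet still highly interpretable, models can be developed.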
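The abstract mentions using the Wasserstein distance to track how the weight distributions move over training. As a rough illustration only, the sketch below uses the standard closed-form 2-Wasserstein distance between two generic multivariate Gaussians, W2^2 = |m_a - m_b|^2 + Tr(C_a + C_b - 2 (C_b^{1/2} C_a C_b^{1/2})^{1/2}). It is not the paper's derivation for the 13-parameter permutation-invariant models, and the function names (`psd_sqrt`, `gaussian_w2`, `fit_gaussian`) and the synthetic "ensembles" are assumptions made for this example.

```python
import numpy as np

def psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T

def gaussian_w2(mean_a, cov_a, mean_b, cov_b):
    """2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    root_b = psd_sqrt(cov_b)
    cross = psd_sqrt(root_b @ cov_a @ root_b)
    squared = np.sum((mean_a - mean_b) ** 2) + np.trace(cov_a + cov_b - 2.0 * cross)
    return float(np.sqrt(max(squared, 0.0)))

def fit_gaussian(weight_ensemble):
    """Fit a generic Gaussian (mean, covariance) to flattened weight matrices.

    weight_ensemble has shape (n_networks, rows, cols); this ignores the
    permutation symmetry that the paper's models exploit.
    """
    flat = weight_ensemble.reshape(weight_ensemble.shape[0], -1)
    return flat.mean(axis=0), np.cov(flat, rowvar=False)

# Hypothetical stand-ins for two training snapshots of an ensemble of 3x3 layers.
rng = np.random.default_rng(0)
snapshot_early = rng.normal(0.0, 1.0, size=(500, 3, 3))
snapshot_late = rng.normal(0.2, 1.5, size=(500, 3, 3))

distance = gaussian_w2(*fit_gaussian(snapshot_early), *fit_gaussian(snapshot_late))
print(f"W2 between fitted Gaussians: {distance:.3f}")
```

Fitting a full covariance over all matrix entries scales poorly with layer width, which is one reason a symmetry-constrained model with few parameters is attractive; the sketch is only meant to make the distance itself concrete.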

Related papers