Paper overview: Normalized gradient flow optimization in the training of ReLU artificial neural networks
- arxiv url: http://arxiv.org/abs/2207.06246v1
- Date: Wed, 13 Jul 2022 14:44:46 GMT
- Status: Processing complete
- Last updated in system: 2022-07-14 16:10:34.227350
- Title: Normalized gradient flow optimization in the training of ReLU artificial
neural networks
- Title (reference translation): Normalized gradient flow optimization in the training of ReLU artificial neural networks
- Authors: Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg Weiss
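- Abstract summary: This paper proposes and analyzes a modified version of the standard training procedure for ReLU ANNs.
- Reference score (independently computed attention score): 1.873444918172383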
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The training of artificial neural networks (ANNs) is nowadays a highly
relevant algorithmic procedure with many applications in science and industry.
Roughly speaking, ANNs can be regarded as iterated compositions between affine
linear functions and certain fixed nonlinear functions, which are usually
multidimensional versions of a one-dimensional so-called activation function.
The most popular choice of such a one-dimensional activation function is the
rectified linear unit (ReLU) activation function which maps a real number to
its positive part $ \mathbb{R} \ni x \mapsto \max\{ x, 0 \} \in \mathbb{R} $.
In this article we propose and analyze a modified variant of the standard
training procedure of such ReLU ANNs in the sense that we propose to restrict
the negative gradient flow dynamics to a large submanifold of the ANN parameter
space, which is a strict $ C^{ \infty } $-submanifold of the entire ANN
parameter space that seems to enjoy better regularity properties than the
entire ANN parameter space but which is also sufficiently large and
sufficiently high dimensional so that it can represent all ANN realization
functions that can be represented through the entire ANN parameter space. In
the special situation of shallow ANNs with just one-dimensional ANN layers we
also prove for every Lipschitz continuous target function that every gradient
flow trajectory on this large submanifold of the ANN parameter space is
globally bounded. For the standard gradient flow on the entire ANN parameter
space with Lipschitz continuous target functions it remains an open problem of
research to prove or disprove the global boundedness of gradient flow
trajectories even in the situation of shallow ANNs with just one-dimensional
ANN layers.
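- Abstract (reference translation): The training of artificial neural networks (ANNs) is nowadays a highly relevant algorithmic procedure with many applications in science and industry.
The most popular choice of such a one-dimensional activation function is the rectified linear unit (ReLU) activation function, which maps a real number to its positive part $ \mathbb{R} \ni x \mapsto \max\{ x, 0 \} \in \mathbb{R} $.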
In this article we propose and analyze a modified variant of the standard training procedure of such ReLU ANNs in the sense that we propose to restrict the negative gradient flow dynamics to a large submanifold of the ANN parameter space, which is a strict $ C^{ \infty } $-submanifold of the entire ANN parameter space that seems to enjoy better regularity properties than the entire ANN parameter space but which is also sufficiently large and sufficiently high dimensional so that it can represent all ANN realization functions that can be represented through the entire ANN parameter space.
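The abstract's remark that a strict submanifold can still represent every realization function can be illustrated in the simplest setting it mentions, a shallow ReLU ANN whose layers are all one-dimensional. The sketch below relies on the positive homogeneity of ReLU: scaling the inner weight and bias by a factor $\lambda > 0$ and the outer weight by $1/\lambda$ leaves the realization unchanged. The constraint $w^2 + b^2 = 1$ used here is an illustrative assumption and not necessarily the submanifold constructed in the paper; the names `relu`, `realization`, `normalize`, and the parameter layout `(w, b, v, c)` are likewise hypothetical.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: maps a real number to its positive part."""
    return max(x, 0.0)


def realization(theta, x: float) -> float:
    """Realization of a shallow ANN with one-dimensional layers.

    theta = (w, b, v, c): inner weight, inner bias, outer weight, outer bias
    (an assumed layout chosen for this sketch).
    """
    w, b, v, c = theta
    return v * relu(w * x + b) + c


def normalize(theta):
    """Rescale theta so that (w, b) has Euclidean norm 1, keeping the realization."""
    w, b, v, c = theta
    norm = math.hypot(w, b)
    if norm == 0.0:
        # Dead neuron: the hidden unit outputs 0 everywhere, so any unit
        # vector paired with outer weight 0 yields the same function.
        return (1.0, 0.0, 0.0, c)
    # ReLU is positively homogeneous: relu(s * z) == s * relu(z) for s > 0.
    return (w / norm, b / norm, v * norm, c)


theta = (3.0, -4.0, 0.5, 2.0)
theta_n = normalize(theta)

# The normalized parameters realize the same function at every test point.
for x in (-2.0, 0.0, 1.0, 2.5):
    assert math.isclose(realization(theta, x), realization(theta_n, x), abs_tol=1e-12)
```

The check only confirms that normalization preserves the realization function at a few sample points; it says nothing about the gradient flow dynamics or the boundedness result, which are the subject of the paper's analysis.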
- Mathematical analysis of the gradients in deep learning [3.3123773366516645]
Paper, reference translation (metadata) (2025-01-26T19:11:57Z)
- A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
It addresses (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Paper, reference translation (metadata) (2024-04-18T16:46:08Z)
- Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs [93.82811501035569]
Paper, reference translation (metadata) (2023-09-29T20:18:52Z)
- The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality [4.425982186154401]
In particular, a key contribution of this work is to reveal that, for all $a, b \in \mathbb{R}$ with $b - a \geq 7$, the functions $[a,b]^d \ni x = (x_1, \dots, x_d) \mapsto \prod_{i=1}^d x_i \in \mathbb{R}$ for $d$ ...
Paper, reference translation (metadata) (2023-01-19T19:52:41Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
Paper, reference translation (metadata) (2022-04-22T15:56:43Z)
- Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
Paper, reference translation (metadata) (2022-02-13T10:54:59Z)
- A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions [3.198144010381572]
Paper, reference translation (metadata) (2021-08-10T12:01:37Z)
- dNNsolve: an efficient NN-based PDE solver [62.997667081978825]
Paper, reference translation (metadata) (2021-03-15T19:14:41Z)
- A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions [3.4792548480344254]
We show that the risk function of the gradient descent method does indeed converge to zero.
Paper, reference translation (metadata) (2021-02-19T13:33:03Z)
- On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics [8.160343645537106]
Paper, reference translation (metadata) (2020-07-30T17:47:05Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper illustrates the framework through the standard DNN and Residual Network (Res-Net) architectures.
Paper, reference translation (metadata) (2020-07-03T01:37:16Z)