Fugu-MT paper translation (abstract): Normalized gradient flow optimization in the training of ReLU artificial neural networks

Paper abstract: Normalized gradient flow optimization in the training of ReLU artificial neural networks

arxiv url: http://arxiv.org/abs/2207.06246v1
Date: Wed, 13 Jul 2022 14:44:46 GMT
Status: translation complete
Last updated in system: 2022-07-14 16:10:34.227350
Title: Normalized gradient flow optimization in the training of ReLU artificial neural networks
Authors: Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg Weiss
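Abstract summary: This article proposes and analyzes a modified variant of the standard training procedure for ReLU ANNs, in which the negative gradient flow is restricted to a large submanifold of the ANN parameter space. For shallow ANNs with just one-dimensional layers, it proves for every Lipschitz continuous target function that every gradient flow trajectory on this submanifold is globally bounded.
Reference score (independently computed attention measure): 1.873444918172383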
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The training of artificial neural networks (ANNs) is nowadays a highly relevant algorithmic procedure with many applications in science and industry. Roughly speaking, ANNs can be regarded as iterated compositions between affine linear functions and certain fixed nonlinear functions, which are usually multidimensional versions of a one-dimensional so-called activation function. The most popular choice of such a one-dimensional activation function is the rectified linear unit (ReLU) activation function which maps a real number to its positive part $ \mathbb{R} \ni x \mapsto \max\{ x, 0 \} \in \mathbb{R} $. In this article we propose and analyze a modified variant of the standard training procedure of such ReLU ANNs in the sense that we propose to restrict the negative gradient flow dynamics to a large submanifold of the ANN parameter space, which is a strict $ C^{ \infty } $-submanifold of the entire ANN parameter space that seems to enjoy better regularity properties than the entire ANN parameter space but which is also sufficiently large and sufficiently high dimensional so that it can represent all ANN realization functions that can be represented through the entire ANN parameter space. In the special situation of shallow ANNs with just one-dimensional ANN layers we also prove for every Lipschitz continuous target function that every gradient flow trajectory on this large submanifold of the ANN parameter space is globally bounded. For the standard gradient flow on the entire ANN parameter space with Lipschitz continuous target functions it remains an open problem of research to prove or disprove the global boundedness of gradient flow trajectories even in the situation of shallow ANNs with just one-dimensional ANN layers.
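The objects named in the abstract can be made concrete with a minimal sketch: a shallow ReLU ANN with one-dimensional layers, a squared-error risk against a Lipschitz continuous target, and an explicit Euler discretization of the negative gradient flow. Everything specific here is an assumption chosen for illustration and does not come from the paper: the function names, the target $|x - 0.5|$, the grid on $[0, 1]$ that approximates the risk, the convention that the ReLU derivative at 0 is 0, and the step size. Note that this is the standard gradient flow on the entire parameter space, not the submanifold-restricted flow that the article proposes.

```python
def relu(x):
    """Rectified linear unit: maps a real number to its positive part."""
    return max(x, 0.0)


def realization(theta, x):
    """Shallow ANN with 1-d input, one hidden ReLU neuron, and 1-d output."""
    w, b, v, c = theta
    return v * relu(w * x + b) + c


def risk_and_grad(theta, target, grid):
    """Mean squared error over the grid and its generalized gradient."""
    w, b, v, c = theta
    risk = 0.0
    grad = [0.0, 0.0, 0.0, 0.0]
    for x in grid:
        pre = w * x + b
        act = relu(pre)
        err = v * act + c - target(x)
        active = 1.0 if pre > 0.0 else 0.0  # assumed convention: ReLU'(0) = 0
        risk += err * err
        grad[0] += 2.0 * err * v * active * x  # d/dw
        grad[1] += 2.0 * err * v * active      # d/db
        grad[2] += 2.0 * err * act             # d/dv
        grad[3] += 2.0 * err                   # d/dc
    n = len(grid)
    return risk / n, [g / n for g in grad]


def euler_gradient_flow(theta, target, grid, step=0.01, n_steps=2000):
    """Explicit Euler steps for d(theta)/dt = -grad risk(theta)."""
    risks = []
    for _ in range(n_steps):
        risk, grad = risk_and_grad(theta, target, grid)
        risks.append(risk)
        theta = [p - step * g for p, g in zip(theta, grad)]
    return theta, risks


def target(x):
    """Illustrative Lipschitz continuous target (Lipschitz constant 1)."""
    return abs(x - 0.5)


grid = [i / 100 for i in range(101)]  # grid approximation of [0, 1]
theta0 = [1.0, -0.2, 1.0, 0.1]        # hypothetical initial value (w, b, v, c)
theta, risks = euler_gradient_flow(theta0, target, grid)
print("initial risk:", risks[0])
print("final risk:", risks[-1])
print("largest parameter magnitude:", max(abs(p) for p in theta))
```

Printing the largest parameter magnitude along a discretized trajectory only illustrates what global boundedness is about. A single numerical run says nothing about the open problem for the standard gradient flow that the abstract describes.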

Related papers

Mathematical analysis of the gradients in deep learning [3.3123773366516645]
We show that the generalized gradient function must coincide with the standard gradient of the cost functional on every open set on which the cost functional is continuously differentiable.
Paper reference translation (metadata) (2025-01-26T19:11:57Z)
A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
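This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparameterized two-layer neural networks. It addresses (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks. As a result, the feature representation induced by the neural networks is allowed to deviate from the initial one by up to $O(\alpha^{-1})$, measured in Wasserstein distance.
Paper reference translation (metadata) (2024-04-18T16:46:08Z)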
Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs [93.82811501035569]
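This paper proposes a data-efficient, parallelizable operator learning approach with reduced memory requirements and better generalization. MG-TFNO scales to large resolutions by leveraging the local and global structures of real-world phenomena. It shows superior performance on the turbulent Navier-Stokes equations, achieving less than half the error with over 150x compression.
Paper reference translation (metadata) (2023-09-29T20:18:52Z)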
The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality [4.425982186154401]
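We study the high-dimensional approximation capacities of shallow and deep artificial neural networks (ANNs) with the rectified linear unit (ReLU) activation. In particular, a key contribution of this work concerns, for all $a, b \in \mathbb{R}$ with $b - a \geq 7$, the functions $[a,b]^d \ni x = (x_1, \dots, x_d) \mapsto \prod_{i=1}^d x_i \in \mathbb{R}$ for $d$ ...
Paper reference translation (metadata) (2023-01-19T19:52:41Z)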
On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
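We study the optimization of wide neural networks (NNs) via gradient flow (GF). We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF. We also show empirically that, unlike in the neural tangent kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
Paper reference translation (metadata) (2022-04-22T15:56:43Z)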
Graph-adaptive Rectified Linear Unit for Graph Neural Networks [64.92221119723048]
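Graph neural networks (GNNs) have achieved remarkable success by extending traditional convolution to learning on non-Euclidean data. This paper proposes the Graph-adaptive Rectified Linear Unit (GReLU), a new parametric activation function that makes use of neighborhood information. We conduct comprehensive experiments showing that the plug-and-play GReLU method is efficient and effective across GNN backbones and various downstream tasks.
Paper reference translation (metadata) (2022-02-13T10:54:59Z)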
A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions [3.198144010381572]
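Gradient descent (GD) type optimization methods are the standard approach for training artificial neural networks (ANNs) with rectified linear unit (ReLU) activation.
Paper reference translation (metadata) (2021-08-10T12:01:37Z)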
OGGN: A Novel Generalized Oracle Guided Generative Architecture for Modelling Inverse Function of Artificial Neural Networks [0.6091702876917279]
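This paper presents a novel generative neural network architecture for modelling the inverse function of an Artificial Neural Network (ANN), either completely or partially. The proposed Oracle Guided Generative Neural Network, called OGGN, flexibly handles a variety of feature generation problems. The constraint functions enable the neural network to investigate a given local space for a long time.
Paper reference translation (metadata) (2021-04-08T17:28:52Z)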
dNNsolve: an efficient NN-based PDE solver [62.997667081978825]
We introduce dNNsolve, which uses dual neural networks to solve ODEs/PDEs. We show that dNNsolve can solve a broad range of ODEs/PDEs in 1, 2, and 3 dimensions.
Paper reference translation (metadata) (2021-03-15T19:14:41Z)
A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions [3.4792548480344254]
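We show that the risk function of the gradient descent method does indeed converge to zero. A key contribution of this work is to explicitly specify a Lyapunov function for the gradient flow system of the ANN parameters.
Paper reference translation (metadata) (2021-02-19T13:33:03Z)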
On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics [8.160343645537106]
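We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width. The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under the natural path norm. Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable properties.
Paper reference translation (metadata) (2020-07-30T17:47:05Z)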
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
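This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs). In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
Paper reference translation (metadata) (2020-07-03T01:37:16Z)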

The related papers list is generated automatically from the titles and abstracts of papers on this site.

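This is information on the specified paper.
The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any consequences arising from the use of this site (including all information and translations).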