Paper summary: Piecewise-Linear Activations or Analytic Activation Functions: Which
Produce More Expressive Neural Networks?
- arxiv url: http://arxiv.org/abs/2204.11231v1
- Date: Sun, 24 Apr 2022 09:53:39 GMT
- Status: Processing complete
- Last updated in system: 2022-04-26 13:41:06.032547
- Title: Piecewise-Linear Activations or Analytic Activation Functions: Which
Produce More Expressive Neural Networks?
- Title (reference translation): Piecewise-Linear Activations or Analytic Activation Functions: Which Produce More Expressive Neural Networks?
- Authors: Anastasis Kratsios and Behnoosh Zamanlooy
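- Abstract summary: We show that deep networks with piecewise-linear activations (e.g. ReLU) are more expressive than their classical (e.g. sigmoidal) counterparts.
The main result is further explained quantitatively through a "separation phenomenon" between the networks in $\operatorname{NN}^{\operatorname{ReLU}+\operatorname{Pool}}$, which are dense (universal), and those in $\operatorname{NN}^{\omega+\operatorname{Pool}}$, which are not dense (not universal).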
- Reference score (independently computed attention score): 4.18804572788063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many currently available universal approximation theorems affirm that deep
feedforward networks defined using any suitable activation function can
approximate any integrable function locally in $L^1$-norm. Though different
approximation rates are available for deep neural networks defined using other
classes of activation functions, there is little explanation for the
empirically confirmed advantage that ReLU networks exhibit over their classical
(e.g. sigmoidal) counterparts. Our main result demonstrates that deep networks
with piecewise linear activations (e.g. ReLU or PReLU) are fundamentally more
expressive than deep feedforward networks with analytic activations (e.g. sigmoid,
Swish, GeLU, or Softplus). More specifically, we construct a strict refinement of the
topology on the space $L^1_{\operatorname{loc}}(\mathbb{R}^d,\mathbb{R}^D)$ of
locally Lebesgue-integrable functions, in which the set of deep ReLU networks
with (bilinear) pooling $\operatorname{NN}^{\operatorname{ReLU} +
\operatorname{Pool}}$ is dense (i.e. universal) but the set of deep feedforward
networks defined using any combination of analytic activation functions with
(or without) pooling layers $\operatorname{NN}^{\omega+\operatorname{Pool}}$ is
not dense (i.e. not universal). Our main result is further explained by
\textit{quantitatively} demonstrating this "separation phenomenon" between
the networks in $\operatorname{NN}^{\operatorname{ReLU}+\operatorname{Pool}}$
and those in $\operatorname{NN}^{\omega+\operatorname{Pool}}$: the networks in
$\operatorname{NN}^{\operatorname{ReLU}}$ are capable of approximating any
compactly supported Lipschitz function while \textit{simultaneously}
approximating its essential support, whereas the networks in
$\operatorname{NN}^{\omega+\operatorname{Pool}}$ cannot.
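- Abstract (reference translation): Many currently available universal approximation theorems affirm that deep feedforward networks defined using any suitable activation function can approximate any integrable function locally in the $L^1$-norm.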
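The support-approximation claim in the abstract can be made concrete with a one-dimensional toy example. The sketch below is only an illustration and is not the paper's construction: the function names `relu_hat` and `sigmoid_bump`, the three-unit architecture, and the sample points are all assumptions chosen for this example. A small ReLU network can be exactly zero outside the interval $[-1, 1]$, so it reproduces a compact support exactly. The same weights with a sigmoid activation give a non-zero analytic function, which cannot vanish on any open interval.

```python
import math


def relu(x: float) -> float:
    """Piecewise-linear activation."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Analytic activation."""
    return 1.0 / (1.0 + math.exp(-x))


def relu_hat(x: float) -> float:
    """Three-unit ReLU network: a 'hat' that is exactly zero outside [-1, 1]."""
    return relu(x + 1.0) - 2.0 * relu(x) + relu(x - 1.0)


def sigmoid_bump(x: float) -> float:
    """Same weights and biases as relu_hat, with sigmoid in place of ReLU."""
    return sigmoid(x + 1.0) - 2.0 * sigmoid(x) + sigmoid(x - 1.0)


# Hypothetical sample points, all outside the interval [-1, 1].
outside = [-3.0, -2.0, -1.5, 1.5, 2.0, 3.0]

relu_outside = [relu_hat(x) for x in outside]
sigmoid_outside = [sigmoid_bump(x) for x in outside]

# The ReLU network vanishes identically at these points ...
print(all(v == 0.0 for v in relu_outside))
# ... while the sigmoid network is non-zero at every one of them.
print(all(abs(v) > 1e-3 for v in sigmoid_outside))
```

The sample points are kept moderate on purpose: for very large $|x|$ the sigmoid saturates in floating point and the sum can round to zero, which is a numerical artifact and not a property of the analytic function.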
- New advances in universal approximation with neural networks of minimal width [4.424170214926035]
  Paper / reference translation (metadata) (2024-11-13T16:17:16Z)
- Expressivity and Approximation Properties of Deep Neural Networks with ReLU$^k$ Activation [2.3020018305241337]
  This paper investigates the expressivity and approximation properties of deep networks using the ReLU$^k$ activation function for $k \geq 2$.
  Deep ReLU$^k$ networks can approximate efficiently, and they can also represent higher-order polynomials exactly.
  Paper / reference translation (metadata) (2023-12-27T09:11:14Z)
- Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
  Paper / reference translation (metadata) (2023-11-23T02:19:32Z)
- Minimal Width for Universal Property of Deep RNN [6.744583770038476]
  Recurrent neural networks (RNNs) are deep learning networks widely used to handle sequential data.
  We prove the universality of deep narrow RNNs and show that the upper bound on the network width does not depend on the data length.
  Paper / reference translation (metadata) (2022-11-25T02:43:54Z)
- Achieve the Minimum Width of Neural Networks for Universal Approximation [1.52292571922932]
  Paper / reference translation (metadata) (2022-09-23T04:03:50Z)
- Robust Training and Verification of Implicit Neural Networks: A Non-Euclidean Contractive Approach [64.23331120621118]
  Paper / reference translation (metadata) (2022-08-08T03:13:24Z)
- Expressive power of binary and ternary neural networks [91.3755431537592]
  Paper / reference translation (metadata) (2022-06-27T13:16:08Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
  We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
  Paper / reference translation (metadata) (2021-04-05T18:02:04Z)
- Size and Depth Separation in Approximating Natural Functions with Neural Networks [52.73592689730044]
  Paper / reference translation (metadata) (2021-01-30T21:30:11Z)
- Nonclosedness of Sets of Neural Networks in Sobolev Spaces [0.0]
  We show that realized neural networks are not closed in order-$(m-1)$ Sobolev spaces $W^{m-1,p}$ for $p \in [1,\infty]$.
  Paper / reference translation (metadata) (2020-07-23T00:57:25Z)
- Interval Universal Approximation for Neural Networks [47.767793120249095]
  Paper / reference translation (metadata) (2020-07-12T20:43:56Z)