Embeddings between Barron spaces with higher order activation functions
- URL: http://arxiv.org/abs/2305.15839v2
- Date: Tue, 18 Jun 2024 08:33:24 GMT
- Title: Embeddings between Barron spaces with higher order activation functions
- Authors: Tjeerd Jan Heeringa, Len Spek, Felix Schwenninger, Christoph Brune
- Abstract summary: We study embeddings between Barron spaces with different activation functions.
An activation function of particular interest is the rectified power unit ($\operatorname{RePU}$) given by $\operatorname{RePU}_s(x)=\max(0,x)^s$.
- Score: 1.0999592665107414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The approximation properties of infinitely wide shallow neural networks heavily depend on the choice of the activation function. To understand this influence, we study embeddings between Barron spaces with different activation functions. These embeddings are proven by providing push-forward maps on the measures $\mu$ used to represent functions $f$. An activation function of particular interest is the rectified power unit ($\operatorname{RePU}$) given by $\operatorname{RePU}_s(x)=\max(0,x)^s$. For many commonly used activation functions, the well-known Taylor remainder theorem can be used to construct a push-forward map, which allows us to prove the embedding of the associated Barron space into a Barron space with a $\operatorname{RePU}$ as activation function. Moreover, the Barron spaces associated with the $\operatorname{RePU}_s$ have a hierarchical structure similar to the Sobolev spaces $H^m$.
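As a brief worked sketch of the Taylor-remainder construction mentioned above (assuming the common convention $f(x)=\int a\,\sigma(\langle w,x\rangle+b)\,d\mu(a,w,b)$ for Barron functions and an activation $\sigma \in C^{s+1}$; the exact definitions, degrees, and integrability conditions used in the paper may differ): for $t \ge 0$, Taylor's theorem with integral remainder gives
$$ \sigma(t) \;=\; \sum_{k=0}^{s} \frac{\sigma^{(k)}(0)}{k!}\,t^{k} \;+\; \frac{1}{s!}\int_{0}^{t} \sigma^{(s+1)}(u)\,(t-u)^{s}\,du, $$
and since $(t-u)^{s}=\max(0,t-u)^{s}=\operatorname{RePU}_s(t-u)$ whenever $0\le u\le t$ (the half-line $t<0$ is handled symmetrically), each unit $\sigma(\langle w,x\rangle+b)$ splits into a polynomial part plus an integral of $\operatorname{RePU}_s$ units with shifted biases. Redistributing the mass of $\mu$ along these shifted units is the kind of push-forward map that yields an embedding into the $\operatorname{RePU}_s$ Barron space.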
Related papers
- Shift-invariant functions and almost liftings [0.0]
We investigate shift-invariant vectorial Boolean functions on $n$ bits that are lifted from Boolean functions on $k$ bits, for $k \leq n$.
We show that if a Boolean function with diameter $k$ is an almost lifting, the maximum number of collisions of its lifted functions is $2^{k-1}$ for any $n$.
We search for functions in the class of almost liftings that have good cryptographic properties and for which the non-bijectivity does not cause major security weaknesses.
arXiv Detail & Related papers (2024-07-16T17:23:27Z) - Representing Piecewise-Linear Functions by Functions with Minimal Arity [0.5266869303483376]
We show that the tessellation of the input space $\mathbb{R}^n$ induced by the function $F$ has a direct connection to the number of arguments in the $\max$ functions.
arXiv Detail & Related papers (2024-06-04T15:39:08Z) - On dimensionality of feature vectors in MPNNs [49.32130498861987]
We revisit the classical result of Morris et al. (AAAI'19) that message-passing graph neural networks (MPNNs) are equal in their distinguishing power to the Weisfeiler--Leman (WL) isomorphism test.
arXiv Detail & Related papers (2024-02-06T12:56:55Z) - 1-Lipschitz Neural Networks are more expressive with N-Activations [19.858602457988194]
Small changes to a system's inputs should not result in large changes to its outputs.
We show that commonly used activation functions, such as MaxMin, unnecessarily restrict the class of representable functions.
We introduce the new N-activation function that is provably more expressive than currently popular activation functions.
arXiv Detail & Related papers (2023-11-10T15:12:04Z) - Polynomial Width is Sufficient for Set Representation with High-dimensional Features [69.65698500919869]
DeepSets is the most widely used neural network architecture for set representation.
We present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activation (LE).
arXiv Detail & Related papers (2023-07-08T16:00:59Z) - The Sample Complexity of Online Contract Design [120.9833763323407]
We study the hidden-action principal-agent problem in an online setting.
In each round, the principal posts a contract that specifies the payment to the agent based on each outcome.
The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal.
arXiv Detail & Related papers (2022-11-10T17:59:42Z) - Exponential Separation between Quantum and Classical Ordered Binary Decision Diagrams, Reordering Method and Hierarchies [68.93512627479197]
We study the quantum Ordered Binary Decision Diagram ($OBDD$) model.
We prove lower and upper bounds for $OBDD$s with an arbitrary order of input variables.
We extend the width hierarchy for read-$k$-times Ordered Binary Decision Diagrams ($k$-OBDD).
arXiv Detail & Related papers (2022-04-22T12:37:56Z) - Logical Activation Functions: Logit-space equivalents of Boolean Operators [4.577830474623795]
We introduce an efficient approximation named $\text{AND}_{\text{AIL}}$, which can be deployed as an activation function in neural networks.
We demonstrate their effectiveness on a variety of tasks including image classification, transfer learning, abstract reasoning, and compositional zero-shot learning.
arXiv Detail & Related papers (2021-10-22T17:49:42Z) - Neural networks with superexpressive activations and integer weights [91.3755431537592]
An example of an activation function $\sigma$ is given such that networks with activations $\sigma$ and $\lfloor\cdot\rfloor$, integer weights, and a fixed architecture can approximate continuous functions.
The range of integer weights required for $\varepsilon$-approximation of Hölder continuous functions is derived.
arXiv Detail & Related papers (2021-05-20T17:29:08Z) - Representation formulas and pointwise properties for Barron functions [8.160343645537106]
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space).
We show that functions whose singular set is fractal or curved cannot be represented by infinitely wide two-layer networks with finite path-norm.
This result suggests that two-layer neural networks may be able to approximate a greater variety of functions than commonly believed.
arXiv Detail & Related papers (2020-06-10T17:55:31Z) - On the Modularity of Hypernetworks [103.1147622394852]
We show that for a structured target function, the overall number of trainable parameters in a hypernetwork is smaller by orders of magnitude than the number of trainable parameters of a standard neural network and an embedding method.
arXiv Detail & Related papers (2020-02-23T22:51:52Z)
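To complement the abstract above and the path-norm mentioned in the Barron-function entry of this list, here is a minimal numerical sketch (hypothetical code, not taken from any of the papers) of the $\operatorname{RePU}_s$ activation, a finite-width two-layer representation $f(x)=\sum_i a_i\,\sigma(\langle w_i,x\rangle+b_i)$, and one common path-norm convention; the weighting used for $\operatorname{RePU}_s$ Barron norms in the paper may differ from the convention assumed here.

```python
import numpy as np

# Hypothetical illustration: the RePU_s activation RePU_s(x) = max(0, x)^s,
# with s = 1 recovering the ordinary ReLU.
def repu(x, s=2):
    return np.maximum(0.0, x) ** s

# A finite-width two-layer network f(x) = sum_i a_i * repu(<w_i, x> + b_i),
# the finite-width counterpart of the integral representation
# f(x) = ∫ a * sigma(<w, x> + b) dmu(a, w, b) used for Barron functions.
rng = np.random.default_rng(0)
d, width = 3, 50                      # input dimension and number of units
a = rng.normal(size=width) / width    # outer weights
W = rng.normal(size=(width, d))       # inner weights
b = rng.normal(size=width)            # biases

def f(x, s=2):
    # x: array of shape (d,)
    return float(a @ repu(W @ x + b, s=s))

# One common proxy for the Barron norm of a finite ReLU network is the
# path-norm sum_i |a_i| * (||w_i||_1 + |b_i|); for RePU_s an s-th power
# weighting is often used instead (the exact convention may differ).
def path_norm(a, W, b, s=1):
    return float(np.sum(np.abs(a) * (np.abs(W).sum(axis=1) + np.abs(b)) ** s))

x = rng.normal(size=d)
print("f(x)      =", f(x, s=2))
print("path-norm =", path_norm(a, W, b, s=2))
```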
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.