Logical Activation Functions: Logit-space equivalents of Boolean
Operators
- URL: http://arxiv.org/abs/2110.11940v1
- Date: Fri, 22 Oct 2021 17:49:42 GMT
- Title: Logical Activation Functions: Logit-space equivalents of Boolean
Operators
- Authors: Scott C. Lowe, Robert Earle, Jason d'Eon, Thomas Trappenberg, Sageev
Oore
- Abstract summary: We introduce an efficient approximation named $\text{AND}_\text{AIL}$, which can be deployed as an activation function in neural networks.
We demonstrate their effectiveness on a variety of tasks including image classification, transfer learning, abstract reasoning, and compositional zero-shot learning.
- Score: 4.577830474623795
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Neuronal representations within artificial neural networks are commonly
understood as logits, representing the log-odds score of presence (versus
absence) of features within the stimulus. Under this interpretation, we can
derive the probability $P(x_0 \land x_1)$ that a pair of independent features
are both present in the stimulus from their logits. By converting the resulting
probability back into a logit, we obtain a logit-space equivalent of the AND
operation. However, since this function involves taking multiple exponents and
logarithms, it is not well suited to be directly used within neural networks.
We thus constructed an efficient approximation named $\text{AND}_\text{AIL}$
(the AND operator Approximate for Independent Logits) utilizing only comparison
and addition operations, which can be deployed as an activation function in
neural networks. Like MaxOut, $\text{AND}_\text{AIL}$ is a generalization of
ReLU to two dimensions. Additionally, we constructed efficient approximations
of the logit-space equivalents to the OR and XNOR operators. We deployed these
new activation functions, both in isolation and in conjunction, and
demonstrated their effectiveness on a variety of tasks including image
classification, transfer learning, abstract reasoning, and compositional
zero-shot learning.
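To make the construction concrete, the sketch below (Python/NumPy; the function names are ours) follows the abstract's recipe: treat each pre-activation $x_i$ as the logit of $P(x_i)$, compute $P(x_0 \land x_1) = \sigma(x_0)\,\sigma(x_1)$ under the independence assumption, and convert the product back into a logit. It then compares this exact logit-space AND against a simple piecewise surrogate built only from comparisons and additions; the surrogate reproduces the exact function's asymptotic behaviour and is meant only as an illustration of the $\text{AND}_\text{AIL}$ idea, not necessarily the paper's exact definition.

```python
import numpy as np

def sigmoid(x):
    """Logistic function; adequate for the moderate logit range used below."""
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    """Log-odds of probability p (inverse of the logistic function)."""
    return np.log(p) - np.log1p(-p)

def and_exact(x0, x1):
    """Exact logit-space AND for independent features:
    logit(P(x0 AND x1)) = logit(sigmoid(x0) * sigmoid(x1))."""
    return logit(sigmoid(x0) * sigmoid(x1))

def and_ail_sketch(x0, x1):
    """Piecewise surrogate using only comparisons and additions.
    NOTE: an illustrative assumption, not necessarily the paper's exact
    AND_AIL. It behaves like min(x0, x1) when at least one logit is
    positive and like x0 + x1 when both are negative, matching the
    asymptotics of and_exact."""
    return (np.minimum(x0, 0.0) + np.minimum(x1, 0.0)
            + np.maximum(np.minimum(x0, x1), 0.0))

if __name__ == "__main__":
    xs = np.linspace(-8.0, 8.0, 33)
    g0, g1 = np.meshgrid(xs, xs)
    gap = np.abs(and_exact(g0, g1) - and_ail_sketch(g0, g1))
    print("largest |exact - surrogate| on the grid:", gap.max())
```

By the same logic, the exact logit-space OR is $-\text{AND}(-x_0, -x_1)$ via De Morgan's law, so an OR surrogate can reuse the same building blocks with negated inputs and a negated output.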
Related papers
- Neural Operators with Localized Integral and Differential Kernels [77.76991758980003]
We present a principled approach to operator learning that can capture local features under two frameworks.
We prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs.
To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions.
arXiv Detail & Related papers (2024-02-26T18:59:31Z)
- MgNO: Efficient Parameterization of Linear Operators via Multigrid [4.096453902709292]
We introduce MgNO, utilizing multigrid structures to parameterize linear operators between neurons.
MgNO exhibits superior ease of training compared to other CNN-based models.
arXiv Detail & Related papers (2023-10-16T13:01:35Z)
- STL: A Signed and Truncated Logarithm Activation Function for Neural Networks [5.9622541907827875]
Activation functions play an essential role in neural networks.
We present a novel signed and truncated logarithm function as an activation function.
The suggested activation function can be applied in a large range of neural networks.
arXiv Detail & Related papers (2023-07-31T03:41:14Z)
- Data-aware customization of activation functions reduces neural network error [0.35172332086962865]
We show that data-aware customization of activation functions can result in striking reductions in neural network error.
A simple substitution with the "seagull" activation function in an already-refined neural network can lead to an order-of-magnitude reduction in error.
arXiv Detail & Related papers (2023-01-16T23:38:37Z)
- Provable General Function Class Representation Learning in Multitask Bandits and MDPs [58.624124220900306]
Multitask representation learning is a popular approach in reinforcement learning to boost sample efficiency.
In this work, we extend the analysis to general function class representations.
We theoretically validate the benefit of multitask representation learning within general function class for bandits and linear MDP.
arXiv Detail & Related papers (2022-05-31T11:36:42Z)
- MIONet: Learning multiple-input operators via tensor product [2.5426761219054312]
We study the operator regression via neural networks for multiple-input operators defined on the product of Banach spaces.
Based on our theory and a low-rank approximation, we propose a novel neural operator, MIONet, to learn multiple-input operators.
arXiv Detail & Related papers (2022-02-12T20:37:04Z)
- Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces.
We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator.
An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- The Connection Between Approximation, Depth Separation and Learnability in Neural Networks [70.55686685872008]
We study the connection between learnability and approximation capacity.
We show that learnability with deep networks of a target function depends on the ability of simpler classes to approximate the target.
arXiv Detail & Related papers (2021-01-31T11:32:30Z)
- Estimating Multiplicative Relations in Neural Networks [0.0]
We will use properties of logarithmic functions to propose a pair of activation functions which can translate products into linear expressions and learn using backpropagation (see the short sketch after this list).
We will try to generalize this approach to some complex arithmetic functions and test the accuracy on a distribution disjoint from the training set.
arXiv Detail & Related papers (2020-10-28T14:28:24Z)
- Interval Universal Approximation for Neural Networks [47.767793120249095]
We introduce the interval universal approximation (IUA) theorem.
IUA shows that neural networks can approximate any continuous function $f$, as has been known for decades.
We study the computational complexity of constructing neural networks that are amenable to precise interval analysis.
arXiv Detail & Related papers (2020-07-12T20:43:56Z)
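For the "Estimating Multiplicative Relations in Neural Networks" entry above, the core log-domain idea can be sketched in a few lines: a logarithmic activation turns a product of positive inputs into a sum that a linear layer can represent, and an exponential maps the result back. The names below are our own, and the snippet is a generic illustration of this trick under the stated positivity assumption, not that paper's exact pair of activation functions.

```python
import numpy as np

def log_act(x, eps=1e-12):
    """Logarithmic activation (hypothetical name): maps positive inputs
    into log space, where multiplicative relations become additive."""
    return np.log(np.maximum(x, eps))

def exp_act(z):
    """Exponential activation: maps a log-space linear combination back,
    turning learned sums into products/powers of the original inputs."""
    return np.exp(z)

# With weights (1, 1) -- values a network could learn by backpropagation --
# the log -> linear -> exp pipeline reproduces the product x0 * x1 exactly.
x = np.array([3.0, 7.0])
w = np.array([1.0, 1.0])
print(exp_act(w @ log_act(x)))   # -> 21.0
```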