Swim: A General-Purpose, High-Performing, and Efficient Activation
Function for Locomotion Control Tasks
- URL: http://arxiv.org/abs/2303.02640v1
- Date: Sun, 5 Mar 2023 11:04:33 GMT
- Title: Swim: A General-Purpose, High-Performing, and Efficient Activation
Function for Locomotion Control Tasks
- Authors: Maryam Abdool and Tony Dear
- Abstract summary: Activation functions play a significant role in the performance of deep learning algorithms.
In particular, the Swish activation function tends to outperform ReLU on deeper models.
We propose Swim, a general-purpose, efficient, and high-performing alternative to Swish.
- Score: 0.2538209532048866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions play a significant role in the performance of deep
learning algorithms. In particular, the Swish activation function tends to
outperform ReLU on deeper models, including deep reinforcement learning models,
across challenging tasks. Despite this progress, ReLU is the preferred function
partly because it is more efficient than Swish. Furthermore, in contrast to the
fields of computer vision and natural language processing, the deep
reinforcement learning and robotics domains have seen less inclination to adopt
new activation functions, such as Swish, and instead continue to use more
traditional functions, like ReLU. To tackle these issues, we propose Swim, a
general-purpose, efficient, and high-performing alternative to Swish, and then
provide an analysis of its properties as well as an explanation for its high
performance relative to Swish, in terms of both reward achievement and
efficiency. We focus on testing Swim on MuJoCo's locomotion continuous control
tasks since they exhibit more complex dynamics and would therefore benefit most
from a high-performing and efficient activation function. We also use the TD3
algorithm in conjunction with Swim and explain this choice in the context of
the robot locomotion domain. We then conclude that Swim is a state-of-the-art
activation function for continuous control locomotion tasks and recommend using
it with TD3 as a working framework.
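The abstract does not reproduce Swim's formula, so the sketch below only illustrates the trade-off it describes: Swish gates its input with a logistic sigmoid, which costs an exponential per element, while an algebraic sigmoid built from x/sqrt(1 + x^2) avoids exp entirely. The swim_like function is a hypothetical stand-in for that idea, not the paper's definition, and the timing harness is likewise illustrative.

```python
import timeit
import numpy as np

def relu(x):
    # Standard rectifier: max(0, x).
    return np.maximum(x, 0.0)

def swish(x):
    # Swish as published: x * sigmoid(x), which needs an exp() per element.
    return x / (1.0 + np.exp(-x))

def swim_like(x):
    # HYPOTHETICAL exp-free gate in the spirit the abstract describes:
    # the algebraic sigmoid 0.5 * (1 + x / sqrt(1 + x^2)) replaces the
    # logistic sigmoid. This is NOT the paper's definition of Swim.
    return x * 0.5 * (1.0 + x / np.sqrt(1.0 + x * x))

x = np.random.randn(1_000_000).astype(np.float32)
for f in (relu, swish, swim_like):
    t = timeit.timeit(lambda: f(x), number=50)
    print(f"{f.__name__:>9s}: {t:.3f}s")
```

As a usage note: once such a function is wrapped in a torch.nn.Module, stable-baselines3's TD3 accepts it through policy_kwargs=dict(activation_fn=MyActivation), which is one concrete way to reproduce the Swim-plus-TD3 pairing the abstract recommends.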
Related papers
- Trainable Highly-expressive Activation Functions [8.662179223772089]
We introduce DiTAC, a trainable highly-expressive activation function.
DiTAC enhances model expressiveness and performance, often yielding substantial improvements.
It also outperforms existing activation functions, whether fixed or trainable, on tasks such as semantic segmentation, image generation, regression, and image classification.
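DiTAC's diffeomorphism-based (CPAB) construction is too involved for a snippet, but the mechanics of any trainable activation are the same: its parameters live on the module and are updated by the same backward pass as the weights. PyTorch's built-in PReLU is the simplest concrete instance of this pattern.

```python
import torch
import torch.nn as nn

# PReLU is a built-in trainable activation: the negative slope is learned.
model = nn.Sequential(nn.Linear(8, 16), nn.PReLU(), nn.Linear(16, 1))

x = torch.randn(4, 8)
loss = model(x).pow(2).mean()
loss.backward()

# The activation's parameter received a gradient like any other weight.
prelu = model[1]
print(prelu.weight.grad)
```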
arXiv Detail & Related papers (2024-07-10T11:49:29Z)
- REBEL: A Regularization-Based Solution for Reward Overoptimization in Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and user intentions, values, or social norms can be catastrophic in the real world.
Current methods to mitigate this misalignment work by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
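The summary names reward regularization but not its exact form; the sketch below shows the generic shape such an objective takes: a Bradley-Terry preference loss fit to human choices, plus a penalty that keeps the learned reward near a prior, which is one standard way to curb overoptimization. All names here are illustrative, not REBEL's actual terms.

```python
import torch
import torch.nn.functional as F

def regularized_reward_loss(r_chosen, r_rejected, r_prior_chosen, lam=0.1):
    # Bradley-Terry preference loss: push the reward model to score the
    # human-preferred trajectory above the rejected one.
    pref = -F.logsigmoid(r_chosen - r_rejected).mean()
    # HYPOTHETICAL regularizer: penalize drift from a fixed prior reward.
    # A generic anti-overoptimization term, not REBEL's exact one.
    reg = (r_chosen - r_prior_chosen).pow(2).mean()
    return pref + lam * reg

# Usage with made-up reward-model outputs for 8 preference pairs:
r_c, r_r, r_p = torch.randn(8), torch.randn(8), torch.zeros(8)
loss = regularized_reward_loss(r_c, r_r, r_p)
```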
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- ErfReLU: Adaptive Activation Function for Deep Neural Network [1.9336815376402716]
Recent research has found that the choice of activation function used to introduce non-linearity can have a large impact on how effectively deep networks perform.
Researchers have recently begun developing activation functions that are trained along with the rest of the network.
The paper also briefly reviews standard activation functions such as Sigmoid, ReLU, and Tanh, and their properties.
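The summary does not spell out ErfReLU's form; one natural reading of the name, a trainable mix of erf and ReLU, is sketched below purely as a guess. Only the pattern matters: a single learnable scalar adapts the nonlinearity during training.

```python
import torch
import torch.nn as nn

class ErfReLULike(nn.Module):
    """HYPOTHETICAL reading of 'ErfReLU': a learnable mix of erf and ReLU.
    The actual definition is in the cited paper; this shows only the
    adaptive pattern (one trainable scalar shaping the nonlinearity)."""
    def __init__(self):
        super().__init__()
        self.mix = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        m = torch.sigmoid(self.mix)   # keep the mixing weight in (0, 1)
        return m * torch.erf(x) + (1 - m) * torch.relu(x)
```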
arXiv Detail & Related papers (2023-06-02T13:41:47Z)
- Efficient Activation Function Optimization through Surrogate Modeling [15.219959721479835]
This paper aims to improve the state of the art through three steps.
First, the benchmarks Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures.
Second, a characterization of the benchmark space was developed, leading to a new surrogate-based method for optimization.
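The summary names the surrogate-based method but not its details; the sketch below shows the generic loop it refers to: fit a cheap regressor on (activation features, benchmark score) pairs, then rank unseen candidates by predicted score instead of training each one. The featurization and data here are stand-ins, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in data: each row featurizes one activation function (e.g.,
# smoothness, saturation, slope statistics) with its measured benchmark
# score. Real features and scores would come from the Act-Bench-* suites.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 6))
scores = features[:, 0] - 0.5 * features[:, 3] + rng.normal(scale=0.1, size=200)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
surrogate.fit(features[:150], scores[:150])

# Rank a pool of unseen candidates by predicted score; only the top few
# would then be trained for real, which is the point of the surrogate.
candidates = rng.normal(size=(1000, 6))
best = np.argsort(surrogate.predict(candidates))[::-1][:5]
print("indices of most promising candidates:", best)
```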
arXiv Detail & Related papers (2023-01-13T23:11:14Z)
- Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task [50.72283841720014]
We propose a novel learning strategy that can improve reasoning about the effects of actions.
We demonstrate the effectiveness of our proposed approach and discuss its advantages over previous baselines in terms of performance, data efficiency, and generalization capability.
arXiv Detail & Related papers (2022-12-07T05:41:58Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use the Rational Activation Function (RAF) to learn an optimal activation function from the input data during training.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
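A rational activation is a ratio of two polynomials with learnable coefficients. The summary gives no degrees or initialization, so the sketch below uses a small degree-(3, 2) form; the positive-denominator trick is one common way to avoid poles.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Rational activation P(x) / Q(x) with learnable polynomial
    coefficients. Degrees and initialization are illustrative choices."""
    def __init__(self, p_degree=3, q_degree=2):
        super().__init__()
        self.p = nn.Parameter(torch.randn(p_degree + 1) * 0.1)
        self.q = nn.Parameter(torch.randn(q_degree) * 0.1)

    def forward(self, x):
        num = sum(c * x**i for i, c in enumerate(self.p))
        # 1 + |Q(x)| keeps the denominator positive, avoiding poles.
        den = 1.0 + torch.abs(sum(c * x**(i + 1) for i, c in enumerate(self.q)))
        return num / den

act = RationalActivation()
y = act(torch.linspace(-3, 3, 7))   # coefficients train like any weight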
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL): inferring the reward function of an agent from observations of its behavior.
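As a minimal formalization of the IRL problem just stated: given expert behavior, find reward parameters under which the expert looks near-optimal. The classic linear feature-matching view is sketched below with made-up features; it is a textbook baseline, not this paper's method.

```python
import numpy as np

# Linear-reward IRL in its simplest feature-matching form:
# reward(s) = w . phi(s); choose w so the expert's average features
# score higher than those of sampled alternative behaviors.
rng = np.random.default_rng(1)
expert_features = rng.normal(size=(50, 4)).mean(axis=0)
other_features = rng.normal(size=(10, 4))  # feature means of non-expert policies

w = np.zeros(4)
for _ in range(100):
    # Gradient ascent on the margin between expert and best competitor.
    worst = other_features[np.argmax(other_features @ w)]
    w += 0.01 * (expert_features - worst)

print("recovered reward weights:", w)
```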
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- Activation Functions: Dive into an optimal activation function [1.52292571922932]
We find an optimal activation function by defining it as a weighted sum of existing activation functions.
The study uses three activation functions, ReLU, tanh, and sin, over three popular image datasets.
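The construction the summary describes is direct to write down: one learnable weight per basis activation, applied as a sum. Softmax normalization of the weights is an assumption here; the summary does not say how, or whether, they are constrained.

```python
import torch
import torch.nn as nn

class WeightedSumActivation(nn.Module):
    """f(x) = w1*relu(x) + w2*tanh(x) + w3*sin(x), with w learned.
    Softmax normalization of w is an illustrative choice."""
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(3))   # start as an even blend

    def forward(self, x):
        w = torch.softmax(self.w, dim=0)
        return w[0] * torch.relu(x) + w[1] * torch.tanh(x) + w[2] * torch.sin(x)
```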
arXiv Detail & Related papers (2022-02-24T12:44:11Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
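The key mechanism in the summary, activating only a sparse subset of neurons per task, can be illustrated with a simple top-k mask. Plain magnitude stands in here for the paper's Bayesian selection criterion, so this is a sketch of the pattern only.

```python
import torch

def sparse_forward(linear, x, k):
    """Keep only the k most active units; zero the rest.
    Magnitude top-k stands in for the paper's Bayesian neuron selection."""
    h = linear(x)
    topk = torch.topk(h.abs(), k=k, dim=-1)
    mask = torch.zeros_like(h).scatter_(-1, topk.indices, 1.0)
    return h * mask

layer = torch.nn.Linear(16, 32)
out = sparse_forward(layer, torch.randn(4, 16), k=8)
print((out != 0).sum(dim=-1))   # exactly 8 active units per row
```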
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
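PWLU's exact parameterization (breakpoints, intervals, statistics-based re-alignment) is in the paper; the sketch below is a generic learnable piecewise-linear activation on a fixed grid, initialized to ReLU. It is enough to show why such a unit is cheap at inference: one index lookup plus one linear interpolation per element.

```python
import torch
import torch.nn as nn

class PiecewiseLinearUnit(nn.Module):
    """Generic learnable piecewise-linear activation on a fixed grid.
    Grid range and size are illustrative, not PWLU's actual scheme."""
    def __init__(self, n_points=17, lo=-4.0, hi=4.0):
        super().__init__()
        self.lo, self.hi, self.n = lo, hi, n_points
        grid = torch.linspace(lo, hi, n_points)
        # Initialize to ReLU; training then reshapes the curve freely.
        self.values = nn.Parameter(torch.relu(grid))

    def forward(self, x):
        step = (self.hi - self.lo) / (self.n - 1)
        t = ((x - self.lo) / step).clamp(0, self.n - 1 - 1e-6)
        i = t.floor().long()
        frac = t - i.float()
        # Linear interpolation between the two nearest learned values.
        return self.values[i] * (1 - frac) + self.values[i + 1] * frac
```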
arXiv Detail & Related papers (2021-04-08T11:29:11Z)
- Learn to cycle: Time-consistent feature discovery for action recognition [83.43682368129072]
Generalizing over temporal variations is a prerequisite for effective action recognition in videos.
We introduce Squeeze and Recursion Temporal Gates (SRTG), an approach that favors temporal activations with potential variations.
We show consistent improvements when using SRTG blocks, with only a minimal increase in the number of GFLOPs.
arXiv Detail & Related papers (2020-06-15T09:36:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.