Swim: A General-Purpose, High-Performing, and Efficient Activation
Function for Locomotion Control Tasks
- URL: http://arxiv.org/abs/2303.02640v1
- Date: Sun, 5 Mar 2023 11:04:33 GMT
- Title: Swim: A General-Purpose, High-Performing, and Efficient Activation
Function for Locomotion Control Tasks
- Authors: Maryam Abdool and Tony Dear
- Abstract summary: Activation functions play a significant role in the performance of deep learning algorithms.
In particular, the Swish activation function tends to outperform ReLU on deeper models.
We propose Swim, a general-purpose, efficient, and high-performing alternative to Swish.
- Score: 0.2538209532048866
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Activation functions play a significant role in the performance of deep
learning algorithms. In particular, the Swish activation function tends to
outperform ReLU on deeper models, including deep reinforcement learning models,
across challenging tasks. Despite this progress, ReLU is the preferred function
partly because it is more efficient than Swish. Furthermore, in contrast to the
fields of computer vision and natural language processing, the deep
reinforcement learning and robotics domains have seen less inclination to adopt
new activation functions, such as Swish, and instead continue to use more
traditional functions, like ReLU. To tackle these issues, we propose Swim, a
general-purpose, efficient, and high-performing alternative to Swish, and then
provide an analysis of its properties as well as an explanation for its high
performance relative to Swish, in terms of both reward achievement and
efficiency. We focus on testing Swim on MuJoCo's locomotion continuous control
tasks since they exhibit more complex dynamics and would therefore benefit most
from a high-performing and efficient activation function. We also use the TD3
algorithm in conjunction with Swim and explain this choice in the context of
the robot locomotion domain. We then conclude that Swim is a state-of-the-art
activation function for continuous control locomotion tasks and recommend using
it with TD3 as a working framework.
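The abstract does not reproduce Swim's formula, so the sketch below only illustrates the trade-off it describes: Swish gates its input with a logistic sigmoid, which costs an exponential per element, while an algebraic sigmoid built from x/sqrt(1 + x^2) avoids exp entirely. The swim_like function is a hypothetical stand-in for that idea, not the paper's definition, and the timing harness is likewise illustrative.

```python
import timeit
import numpy as np

def relu(x):
    # Standard rectifier: max(0, x).
    return np.maximum(x, 0.0)

def swish(x):
    # Swish as published: x * sigmoid(x), which needs an exp() per element.
    return x / (1.0 + np.exp(-x))

def swim_like(x):
    # HYPOTHETICAL exp-free gate in the spirit the abstract describes:
    # the algebraic sigmoid 0.5 * (1 + x / sqrt(1 + x^2)) replaces the
    # logistic sigmoid. This is NOT the paper's definition of Swim.
    return x * 0.5 * (1.0 + x / np.sqrt(1.0 + x * x))

x = np.random.randn(1_000_000).astype(np.float32)
for f in (relu, swish, swim_like):
    t = timeit.timeit(lambda: f(x), number=50)
    print(f"{f.__name__:>9s}: {t:.3f}s")
```

As a usage note: once such a function is wrapped in a torch.nn.Module, stable-baselines3's TD3 accepts it through policy_kwargs=dict(activation_fn=MyActivation), which is one concrete way to reproduce the Swim-plus-TD3 pairing the abstract recommends.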
Related papers
- Trainable Highly-expressive Activation Functions [8.662179223772089]
We introduce DiTAC, a trainable highly-expressive activation function.
DiTAC enhances model expressiveness and performance, often yielding substantial improvements.
It also outperforms existing activation functions, whether fixed or trainable, on tasks such as semantic segmentation, image generation, regression, and image classification.
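DiTAC's diffeomorphism-based (CPAB) construction is too involved for a snippet, but the mechanics of any trainable activation are the same: its parameters live on the module and are updated by the same backward pass as the weights. PyTorch's built-in PReLU is the simplest concrete instance of this pattern.

```python
import torch
import torch.nn as nn

# PReLU is a built-in trainable activation: the negative slope is learned.
model = nn.Sequential(nn.Linear(8, 16), nn.PReLU(), nn.Linear(16, 1))

x = torch.randn(4, 8)
loss = model(x).pow(2).mean()
loss.backward()

# The activation's parameter received a gradient like any other weight.
prelu = model[1]
print(prelu.weight.grad)
```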
arXiv Detail & Related papers (2024-07-10T11:49:29Z)
- REBEL: A Regularization-Based Solution for Reward Overoptimization in Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and user intentions, values, or social norms can be catastrophic in the real world.
Current methods to mitigate this misalignment work by learning reward functions from human preferences.
We propose a novel concept of reward regularization within the robotic RLHF framework.
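The summary names reward regularization but not its exact form; the sketch below shows the generic shape such an objective takes: a Bradley-Terry preference loss fit to human choices, plus a penalty that keeps the learned reward near a prior, which is one standard way to curb overoptimization. All names here are illustrative, not REBEL's actual terms.

```python
import torch
import torch.nn.functional as F

def regularized_reward_loss(r_chosen, r_rejected, r_prior_chosen, lam=0.1):
    # Bradley-Terry preference loss: push the reward model to score the
    # human-preferred trajectory above the rejected one.
    pref = -F.logsigmoid(r_chosen - r_rejected).mean()
    # HYPOTHETICAL regularizer: penalize drift from a fixed prior reward.
    # A generic anti-overoptimization term, not REBEL's exact one.
    reg = (r_chosen - r_prior_chosen).pow(2).mean()
    return pref + lam * reg

# Usage with made-up reward-model outputs for 8 preference pairs:
r_c, r_r, r_p = torch.randn(8), torch.randn(8), torch.zeros(8)
loss = regularized_reward_loss(r_c, r_r, r_p)
```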
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
- ErfReLU: Adaptive Activation Function for Deep Neural Network [1.9336815376402716]
Recent research has found that the choice of activation function used to introduce non-linearity can have a large impact on how effectively deep networks perform.
Researchers have recently begun developing activation functions that are trained along with the rest of the network.
The paper also briefly reviews standard activation functions such as Sigmoid, ReLU, and Tanh, and their properties.
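The summary does not spell out ErfReLU's form; one natural reading of the name, a trainable mix of erf and ReLU, is sketched below purely as a guess. Only the pattern matters: a single learnable scalar adapts the nonlinearity during training.

```python
import torch
import torch.nn as nn

class ErfReLULike(nn.Module):
    """HYPOTHETICAL reading of 'ErfReLU': a learnable mix of erf and ReLU.
    The actual definition is in the cited paper; this shows only the
    adaptive pattern (one trainable scalar shaping the nonlinearity)."""
    def __init__(self):
        super().__init__()
        self.mix = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        m = torch.sigmoid(self.mix)   # keep the mixing weight in (0, 1)
        return m * torch.erf(x) + (1 - m) * torch.relu(x)
```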
arXiv Detail & Related papers (2023-06-02T13:41:47Z)
- Efficient Activation Function Optimization through Surrogate Modeling [15.219959721479835]
This paper aims to improve the state of the art through three steps.
First, the benchmarks Act-Bench-CNN, Act-Bench-ResNet, and Act-Bench-ViT were created by training convolutional, residual, and vision transformer architectures.
Second, a characterization of the benchmark space was developed, leading to a new surrogate-based method for optimization.
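The summary names the surrogate-based method but not its details; the sketch below shows the generic loop it refers to: fit a cheap regressor on (activation features, benchmark score) pairs, then rank unseen candidates by predicted score instead of training each one. The featurization and data here are stand-ins, not the paper's.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Stand-in data: each row featurizes one activation function (e.g.,
# smoothness, saturation, slope statistics) with its measured benchmark
# score. Real features and scores would come from the Act-Bench-* suites.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 6))
scores = features[:, 0] - 0.5 * features[:, 3] + rng.normal(scale=0.1, size=200)

surrogate = RandomForestRegressor(n_estimators=100, random_state=0)
surrogate.fit(features[:150], scores[:150])

# Rank a pool of unseen candidates by predicted score; only the top few
# would then be trained for real, which is the point of the surrogate.
candidates = rng.normal(size=(1000, 6))
best = np.argsort(surrogate.predict(candidates))[::-1][:5]
print("indices of most promising candidates:", best)
```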
arXiv Detail & Related papers (2023-01-13T23:11:14Z)
- Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task [50.72283841720014]
We propose a novel learning strategy that can improve reasoning about the effects of actions.
We demonstrate the effectiveness of our proposed approach and discuss its advantages over previous baselines in terms of performance, data efficiency, and generalization capability.
arXiv Detail & Related papers (2022-12-07T05:41:58Z)
- Transformers with Learnable Activation Functions [63.98696070245065]
We use the Rational Activation Function (RAF) to learn an optimal activation function from the input data during training.
RAF opens a new research direction for analyzing and interpreting pre-trained models according to the learned activation functions.
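A rational activation is a ratio of two polynomials with learnable coefficients. The summary gives no degrees or initialization, so the sketch below uses a small degree-(3, 2) form; the positive-denominator trick is one common way to avoid poles.

```python
import torch
import torch.nn as nn

class RationalActivation(nn.Module):
    """Rational activation P(x) / Q(x) with learnable polynomial
    coefficients. Degrees and initialization are illustrative choices."""
    def __init__(self, p_degree=3, q_degree=2):
        super().__init__()
        self.p = nn.Parameter(torch.randn(p_degree + 1) * 0.1)
        self.q = nn.Parameter(torch.randn(q_degree) * 0.1)

    def forward(self, x):
        num = sum(c * x**i for i, c in enumerate(self.p))
        # 1 + |Q(x)| keeps the denominator positive, avoiding poles.
        den = 1.0 + torch.abs(sum(c * x**(i + 1) for i, c in enumerate(self.q)))
        return num / den

act = RationalActivation()
y = act(torch.linspace(-3, 3, 7))   # coefficients train like any weight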
arXiv Detail & Related papers (2022-08-30T09:47:31Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of inverse reinforcement learning (IRL): inferring the reward function of an agent from observations of its behavior.
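As a minimal formalization of the IRL problem just stated: given expert behavior, find reward parameters under which the expert looks near-optimal. The classic linear feature-matching view is sketched below with made-up features; it is a textbook baseline, not this paper's method.

```python
import numpy as np

# Linear-reward IRL in its simplest feature-matching form:
# reward(s) = w . phi(s); choose w so the expert's average features
# score higher than those of sampled alternative behaviors.
rng = np.random.default_rng(1)
expert_features = rng.normal(size=(50, 4)).mean(axis=0)
other_features = rng.normal(size=(10, 4))  # feature means of non-expert policies

w = np.zeros(4)
for _ in range(100):
    # Gradient ascent on the margin between expert and best competitor.
    worst = other_features[np.argmax(other_features @ w)]
    w += 0.01 * (expert_features - worst)

print("recovered reward weights:", w)
```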
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- Activation Functions: Dive into an optimal activation function [1.52292571922932]
We find an optimal activation function by defining it as a weighted sum of existing activation functions.
The study uses three activation functions, ReLU, tanh, and sin, over three popular image datasets.
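The construction the summary describes is direct to write down: one learnable weight per basis activation, applied as a sum. Softmax normalization of the weights is an assumption here; the summary does not say how, or whether, they are constrained.

```python
import torch
import torch.nn as nn

class WeightedSumActivation(nn.Module):
    """f(x) = w1*relu(x) + w2*tanh(x) + w3*sin(x), with w learned.
    Softmax normalization of w is an illustrative choice."""
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(3))   # start as an even blend

    def forward(self, x):
        w = torch.softmax(self.w, dim=0)
        return w[0] * torch.relu(x) + w[1] * torch.tanh(x) + w[2] * torch.sin(x)
```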
arXiv Detail & Related papers (2022-02-24T12:44:11Z)
- Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning [54.7584721943286]
Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered.
Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal.
We propose to only activate and select sparse neurons for learning current and past tasks at any stage.
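The key mechanism in the summary, activating only a sparse subset of neurons per task, can be illustrated with a simple top-k mask. Plain magnitude stands in here for the paper's Bayesian selection criterion, so this is a sketch of the pattern only.

```python
import torch

def sparse_forward(linear, x, k):
    """Keep only the k most active units; zero the rest.
    Magnitude top-k stands in for the paper's Bayesian neuron selection."""
    h = linear(x)
    topk = torch.topk(h.abs(), k=k, dim=-1)
    mask = torch.zeros_like(h).scatter_(-1, topk.indices, 1.0)
    return h * mask

layer = torch.nn.Linear(16, 32)
out = sparse_forward(layer, torch.randn(4, 16), k=8)
print((out != 0).sum(dim=-1))   # exactly 8 active units per row
```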
arXiv Detail & Related papers (2022-02-21T13:25:03Z)
- Learning specialized activation functions with the Piecewise Linear Unit [7.820667552233989]
We propose a new activation function called the Piecewise Linear Unit (PWLU), which incorporates a carefully designed formulation and learning method.
It can learn specialized activation functions and achieves SOTA performance on large-scale datasets like ImageNet and COCO.
PWLU is also easy to implement and efficient at inference, which can be widely applied in real-world applications.
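PWLU's exact parameterization (breakpoints, intervals, statistics-based re-alignment) is in the paper; the sketch below is a generic learnable piecewise-linear activation on a fixed grid, initialized to ReLU. It is enough to show why such a unit is cheap at inference: one index lookup plus one linear interpolation per element.

```python
import torch
import torch.nn as nn

class PiecewiseLinearUnit(nn.Module):
    """Generic learnable piecewise-linear activation on a fixed grid.
    Grid range and size are illustrative, not PWLU's actual scheme."""
    def __init__(self, n_points=17, lo=-4.0, hi=4.0):
        super().__init__()
        self.lo, self.hi, self.n = lo, hi, n_points
        grid = torch.linspace(lo, hi, n_points)
        # Initialize to ReLU; training then reshapes the curve freely.
        self.values = nn.Parameter(torch.relu(grid))

    def forward(self, x):
        step = (self.hi - self.lo) / (self.n - 1)
        t = ((x - self.lo) / step).clamp(0, self.n - 1 - 1e-6)
        i = t.floor().long()
        frac = t - i.float()
        # Linear interpolation between the two nearest learned values.
        return self.values[i] * (1 - frac) + self.values[i + 1] * frac
```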
arXiv Detail & Related papers (2021-04-08T11:29:11Z)
- Learn to cycle: Time-consistent feature discovery for action recognition [83.43682368129072]
Generalizing over temporal variations is a prerequisite for effective action recognition in videos.
We introduce Squeeze and Recursion Temporal Gates (SRTG), an approach that favors temporal activations with potential variations.
We show consistent improvements when using SRTG blocks, with only a minimal increase in the number of GFLOPs.
arXiv Detail & Related papers (2020-06-15T09:36:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.