Model-based actor-critic: GAN (model generator) + DRL (actor-critic) => AGI
- URL: http://arxiv.org/abs/2004.04574v9
- Date: Tue, 20 Apr 2021 22:09:26 GMT
- Title: Model-based actor-critic: GAN (model generator) + DRL (actor-critic) => AGI
- Authors: Aras Dargazany
- Abstract summary: We propose adding a generative/predictive environment model to the model-free actor-critic architecture.
The proposed AI model is similar to model-free DDPG and is therefore called model-based DDPG.
Our initial, limited experiments show that combining DRL and GAN in a model-based actor-critic yields the incremental, goal-driven intelligence required to solve each task, with performance similar to that of model-free DDPG.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our effort is toward unifying GAN and DRL algorithms into a single AI model
(AGI, general-purpose AI, or artificial general intelligence) with
general-purpose applications to: (A) offline learning (of stored data), like GAN
in un-, semi-, or fully-supervised settings, such as big-data analytics (mining) and
visualization; and (B) online learning (of real or simulated devices), like DRL in
RL settings (with or without an environment reward), such as real or simulated robotics
and control. Our core proposal is adding a generative/predictive environment
model to the model-free actor-critic architecture, which results in a
model-based actor-critic architecture with a temporal-difference (TD) error and
an episodic memory. The proposed AI model is similar to model-free DDPG and is
therefore called model-based DDPG. To evaluate it, we compare it with
model-free DDPG by applying both to a wide range of
independent simulated robotic and control task environments in OpenAI Gym and
Unity Agents. Our initial, limited experiments show that combining DRL and GAN in a
model-based actor-critic yields the incremental, goal-driven intelligence
required to solve each task, with performance similar to that of model-free DDPG. Our
future focus is to investigate the proposed AI model's potential to: (A) unify
the DRL field within AI by producing performance competitive with the best
model-based (PlaNet) and model-free (D4PG) approaches; and (B) bridge the gap
between the AI and robotics communities by solving the important problem of reward
engineering through learning the reward function from demonstration.
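To make the proposal concrete, here is a minimal, hypothetical PyTorch sketch of the model-based DDPG idea described above: a standard DDPG actor-critic plus a generative environment model, all trained on (s, a, r, s') transitions drawn from the episodic memory. Module names, layer sizes, and the training wiring are illustrative assumptions, not the authors' released code; in particular, a plain regression loss stands in for the GAN's adversarial loss, and DDPG's target networks are omitted for brevity.

```python
import torch
import torch.nn as nn

def mlp(inp, out):
    """Small two-layer network used by every module below."""
    return nn.Sequential(nn.Linear(inp, 64), nn.ReLU(), nn.Linear(64, out))

class Actor(nn.Module):
    """Deterministic policy mu(s) -> a, as in DDPG."""
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = mlp(s_dim, a_dim)

    def forward(self, s):
        return torch.tanh(self.net(s))   # actions squashed to [-1, 1]

class Critic(nn.Module):
    """Action-value function Q(s, a)."""
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = mlp(s_dim + a_dim, 1)

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

class EnvModel(nn.Module):
    """Generative/predictive environment model G(s, a) -> predicted s'."""
    def __init__(self, s_dim, a_dim):
        super().__init__()
        self.net = mlp(s_dim + a_dim, s_dim)

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def train_step(actor, critic, env_model, batch, opts, gamma=0.99):
    """One update on a batch (s, a, r, s_next) from the episodic memory."""
    s, a, r, s_next = batch
    opt_model, opt_critic, opt_actor = opts
    # 1) Environment model: predict the next state (the GAN-generator role;
    #    plain regression stands in for the adversarial loss here).
    model_loss = ((env_model(s, a) - s_next) ** 2).mean()
    opt_model.zero_grad(); model_loss.backward(); opt_model.step()
    # 2) Critic: minimize the TD error against a one-step TD target.
    with torch.no_grad():
        td_target = r + gamma * critic(s_next, actor(s_next))
    critic_loss = ((critic(s, a) - td_target) ** 2).mean()
    opt_critic.zero_grad(); critic_loss.backward(); opt_critic.step()
    # 3) Actor: ascend the critic's value of the actor's own actions.
    actor_loss = -critic(s, actor(s)).mean()
    opt_actor.zero_grad(); actor_loss.backward(); opt_actor.step()
    return model_loss.item(), critic_loss.item(), actor_loss.item()
```

In the full method, the environment model would be trained adversarially against a discriminator and could generate imagined transitions to augment the episodic memory.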
Related papers
- Generative Diffusion-based Contract Design for Efficient AI Twins Migration in Vehicular Embodied AI Networks [55.15079732226397]
Embodied AI is a rapidly advancing field that bridges the gap between cyberspace and physical space.
In vehicular embodied AI networks (VEANET), embodied AI twins act as in-vehicle AI assistants performing diverse tasks that support autonomous driving.
arXiv Detail & Related papers (2024-10-02T02:20:42Z)
- Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning [50.332027356848094]
AI-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control.
The mapping between context and AI model parameters is ideally done in a zero-shot fashion.
This paper introduces a general methodology for the online optimization of AI model selection (AMS) mappings.
arXiv Detail & Related papers (2024-06-22T11:17:50Z)
- Model Callers for Transforming Predictive and Generative AI Applications [2.7195102129095003]
We introduce a novel software abstraction termed a "model caller".
Model callers act as an intermediary for AI and ML model calling.
We have released a prototype Python library for model callers, accessible for installation via pip or for download from GitHub.
arXiv Detail & Related papers (2024-04-17T12:21:06Z)
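Judging only from the summary above, a "model caller" is an intermediary that standardizes how applications invoke heterogeneous predictive and generative models. The following is a minimal, hypothetical Python sketch of such an abstraction; the class and method names are illustrative assumptions, not the released library's API.

```python
from typing import Any, Callable, List

class ModelCaller:
    """Hypothetical intermediary that standardizes calls to AI/ML models."""

    def __init__(self, model_fn: Callable[[Any], Any]):
        self.model_fn = model_fn
        self.pre_hooks: List[Callable[[Any], Any]] = []
        self.post_hooks: List[Callable[[Any], Any]] = []

    def add_pre_hook(self, hook: Callable[[Any], Any]) -> None:
        """Register input-side processing, e.g. validation or batching."""
        self.pre_hooks.append(hook)

    def add_post_hook(self, hook: Callable[[Any], Any]) -> None:
        """Register output-side processing, e.g. logging or filtering."""
        self.post_hooks.append(hook)

    def __call__(self, x: Any) -> Any:
        for hook in self.pre_hooks:
            x = hook(x)
        y = self.model_fn(x)
        for hook in self.post_hooks:
            y = hook(y)
        return y

# Usage: any model, predictive or generative, sits behind one interface.
caller = ModelCaller(lambda x: 2 * x)        # stand-in for a real model
caller.add_pre_hook(lambda x: max(x, 0))     # e.g. input sanitization
print(caller(21))                            # -> 42
```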
- STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning [82.03481509373037]
Recently, model-based reinforcement learning algorithms have demonstrated remarkable efficacy in visual input environments.
We introduce the Stochastic Transformer-based wORld Model (STORM), an efficient world-model architecture that combines strong sequence modeling and generation capabilities.
STORM achieves a mean human performance of 126.7% on the Atari 100k benchmark, setting a new record among state-of-the-art methods.
arXiv Detail & Related papers (2023-10-14T16:42:02Z)
- Physics-Informed Model-Based Reinforcement Learning [19.01626581411011]
One of the drawbacks of traditional reinforcement learning algorithms is their poor sample efficiency.
We learn a model of the environment, essentially its transition dynamics and reward function, use it to generate imaginary trajectories and backpropagate through them to update the policy.
We show that, in model-based RL, model accuracy mainly matters in environments that are sensitive to initial conditions.
We also show that, in challenging environments, physics-informed model-based RL achieves better average-return than state-of-the-art model-free RL algorithms.
arXiv Detail & Related papers (2022-12-05T11:26:10Z)
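The entry above describes the generic imagination loop of model-based RL: roll a learned, differentiable model of the transition dynamics and reward forward, then backpropagate the imagined return through the trajectory to update the policy. Below is a minimal, hypothetical PyTorch sketch of that loop, not the paper's implementation; `policy`, `dynamics`, and `reward_fn` are assumed to be differentiable callables such as small neural networks.

```python
import torch

def imagined_policy_update(policy, dynamics, reward_fn, s0, opt,
                           horizon=10, gamma=0.99):
    """Update the policy by backpropagating an imagined return
    through a differentiable learned model of the environment."""
    s, ret = s0, 0.0
    for t in range(horizon):
        a = policy(s)                       # differentiable action choice
        ret = ret + (gamma ** t) * reward_fn(s, a)
        s = dynamics(s, a)                  # imagined next state
    loss = -ret.mean()                      # maximize the imagined return
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Example wiring with toy stand-ins (purely illustrative):
# policy   = torch.nn.Sequential(torch.nn.Linear(4, 2), torch.nn.Tanh())
# dynamics = lambda s, a: s + 0.1 * torch.cat([a, a], dim=-1)
# reward   = lambda s, a: -(s ** 2).sum(dim=-1)
# opt      = torch.optim.Adam(policy.parameters(), lr=1e-3)
# imagined_policy_update(policy, dynamics, reward, torch.randn(8, 4), opt)
```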
- SAM-RL: Sensing-Aware Model-Based Reinforcement Learning via Differentiable Physics-Based Simulation and Rendering [49.78647219715034]
We propose a sensing-aware model-based reinforcement learning system called SAM-RL.
With the sensing-aware learning pipeline, SAM-RL allows a robot to select an informative viewpoint to monitor the task process.
We apply our framework to real world experiments for accomplishing three manipulation tasks: robotic assembly, tool manipulation, and deformable object manipulation.
arXiv Detail & Related papers (2022-10-27T05:30:43Z)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective [142.36200080384145]
We propose a single objective that jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
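The residual-error idea in the Real-to-Sim entry above reduces to a simple decomposition: the real next state is approximated as the simulator's prediction plus a learned correction, s' ≈ f_sim(s, a) + g_θ(s, a). Here is a minimal, hypothetical PyTorch sketch of that decomposition; the paper itself additionally embeds the learned residual in an unscented Kalman filter, which is omitted here.

```python
import torch
import torch.nn as nn

class ResidualDynamics(nn.Module):
    """Real dynamics approximated as the simulator prediction plus a
    learned residual: s' ~= f_sim(s, a) + g_theta(s, a)."""

    def __init__(self, sim_step, state_dim, action_dim):
        super().__init__()
        self.sim_step = sim_step            # analytic/simulator model f_sim
        self.residual = nn.Sequential(      # learned correction g_theta
            nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
            nn.Linear(64, state_dim))

    def forward(self, s, a):
        return self.sim_step(s, a) + self.residual(torch.cat([s, a], dim=-1))

# Fit the residual on sparse real-robot transitions (s, a, s_real):
# model = ResidualDynamics(lambda s, a: s, state_dim=4, action_dim=2)
# loss = ((model(s, a) - s_real) ** 2).mean()   # then backprop as usual
```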
- Application of Federated Learning in Building a Robust COVID-19 Chest X-ray Classification Model [0.0]
Federated Learning (FL) helps AI models to generalize better without moving all the data to a central server.
We trained a deep learning model to solve a binary classification problem of predicting the presence or absence of COVID-19.
arXiv Detail & Related papers (2022-04-22T05:21:50Z)
- Bellman: A Toolbox for Model-Based Reinforcement Learning in TensorFlow [14.422129911404472]
Bellman aims to fill the gap in model-based RL tooling and introduces the first thoroughly designed and tested model-based RL toolbox.
Our modular approach makes it possible to combine a wide range of environment models with generic model-based agent classes that recover state-of-the-art algorithms.
arXiv Detail & Related papers (2021-03-26T11:32:27Z)
- Sim-Env: Decoupling OpenAI Gym Environments from Simulation Models [0.0]
Reinforcement learning (RL) is one of the most active fields of AI research.
Development methodology still lags behind, with a severe lack of standard APIs to foster the development of RL applications.
We present a workflow and tools for the decoupled development and maintenance of multi-purpose agent-based models and derived single-purpose reinforcement learning environments.
arXiv Detail & Related papers (2021-02-19T09:25:21Z)
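The decoupling that Sim-Env advocates is essentially an adapter pattern: a multi-purpose simulation model is developed and maintained on its own, while a thin, single-purpose environment adapts it to the classic OpenAI Gym reset()/step() interface for one RL task. A minimal, hypothetical Python sketch (not the Sim-Env tooling itself):

```python
import numpy as np

class PointMassSim:
    """Multi-purpose simulation model, developed independently of any RL task."""

    def __init__(self):
        self.pos, self.vel = 0.0, 0.0

    def apply_force(self, force, dt=0.1):
        self.vel += force * dt
        self.pos += self.vel * dt

    def state(self):
        return np.array([self.pos, self.vel], dtype=np.float32)

class ReachGoalEnv:
    """Thin, single-purpose RL environment derived from the simulation model,
    following the classic OpenAI Gym reset()/step() convention."""

    def __init__(self, sim: PointMassSim, goal: float = 1.0):
        self.sim, self.goal = sim, goal

    def reset(self):
        self.sim.pos, self.sim.vel = 0.0, 0.0
        return self.sim.state()

    def step(self, action):
        self.sim.apply_force(float(action))
        obs = self.sim.state()
        reward = -abs(self.sim.pos - self.goal)       # task-specific reward
        done = abs(self.sim.pos - self.goal) < 0.05   # task-specific termination
        return obs, reward, done, {}

# The same PointMassSim could back other environments with different goals
# and rewards, without changes to the simulation model itself.
```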
This list is automatically generated from the titles and abstracts of the papers in this site.