Assessing Policy, Loss and Planning Combinations in Reinforcement
Learning using a New Modular Architecture
- URL: http://arxiv.org/abs/2201.02874v1
- Date: Sat, 8 Jan 2022 18:30:25 GMT
- Authors: Tiago Gaspar Oliveira and Arlindo L. Oliveira
- Abstract summary: We propose a new modular software architecture suited for model-based reinforcement learning agents.
We show that the best combination of planning algorithm, policy, and loss function is heavily problem-dependent.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The model-based reinforcement learning paradigm, which uses planning
algorithms and neural network models, has recently achieved unprecedented
results in diverse applications, leading to what is now known as deep
reinforcement learning. These agents are quite complex and involve multiple
components, which can create challenges for research. In this work, we
propose a new modular software architecture suited for these types of agents,
and a set of building blocks that can be easily reused and assembled to
construct new model-based reinforcement learning agents. These building blocks
include planning algorithms, policies, and loss functions.
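As a rough sketch of the kind of composition described here (all class and
function names below are hypothetical, not the paper's actual API), an agent
might be assembled from interchangeable building blocks like this:

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class PlanningAlgorithm(Protocol):
    """Hypothetical interface: search a learned model for the best action."""
    def plan(self, state, model) -> int: ...


class Policy(Protocol):
    """Hypothetical interface: map a state to an action."""
    def act(self, state) -> int: ...


@dataclass
class Agent:
    """A model-based RL agent assembled from swappable building blocks."""
    planner: PlanningAlgorithm
    policy: Policy
    loss_fn: Callable  # training loss for the learned model/policy


# Swapping one block yields a new agent without touching the others, e.g.:
# Agent(planner=AveragedMinimax(), policy=GreedyPolicy(), loss_fn=mse_loss)
```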
We illustrate the use of this architecture by combining several of these
building blocks to implement and test agents that are optimized for three
different test environments: Cartpole, Minigrid, and Tictactoe. One planning
algorithm made available in our implementation, which we call averaged
minimax and which had not previously been used in reinforcement learning,
achieved good results in all three environments.
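The abstract does not define averaged minimax, so the following is a
speculative sketch of one plausible reading: a minimax search in which the
opponent's nodes back up the average of their children's values (modeling a
non-perfect opponent) rather than the strict minimum. The `evaluate` and
`children` helpers are assumptions, not taken from the paper.

```python
def averaged_minimax(state, depth, maximizing, evaluate, children):
    # Assumption: own nodes back up the max, as in ordinary minimax,
    # while opponent nodes back up the mean of child values, treating
    # the opponent as acting uniformly at random rather than optimally.
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state)
    values = [averaged_minimax(s, depth - 1, not maximizing, evaluate, children)
              for s in successors]
    return max(values) if maximizing else sum(values) / len(values)
```

At the root, such an agent would pick the action leading to the successor
with the highest backed-up value.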
Experiments performed with this architecture have shown that the best
combination of planning algorithm, policy, and loss function is heavily
problem-dependent. This result provides evidence that the proposed architecture, which
is modular and reusable, is useful for reinforcement learning researchers who
want to study new environments and techniques.
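The problem-dependence result suggests an experiment grid over component
combinations. The sketch below (placeholder component names throughout, not
the paper's actual components or API) shows the shape of such a sweep:

```python
import random
from itertools import product

# Placeholder building blocks; the paper's actual components differ.
planners = ["minimax", "averaged_minimax", "mcts"]
policies = ["greedy", "softmax"]
losses = ["mse", "cross_entropy"]
envs = ["Cartpole", "Minigrid", "Tictactoe"]


def run_trial(env, planner, policy, loss):
    """Stand-in for training and evaluating one agent configuration."""
    return random.random()  # replace with a real evaluation score


# Keep the best (planner, policy, loss) combination per environment.
best = {
    env: max(product(planners, policies, losses),
             key=lambda combo: run_trial(env, *combo))
    for env in envs
}
print(best)  # with real trials, the winning combination varies by environment
```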
Related papers
- Task Agnostic Architecture for Algorithm Induction via Implicit Composition [10.627575117586417]
Recent generative AI, especially Transformer-based models, demonstrates potential as an architecture capable of constructing algorithms for a wide range of domains.
This position paper explores developing such a unified architecture and proposes a theoretical framework for how it could be constructed.
Our exploration delves into the current capabilities and limitations of Transformer-based and other methods for efficient and correct algorithm composition.
arXiv Detail & Related papers (2024-04-03T04:31:09Z)
- RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research [0.0]
We introduce RLOR, a flexible framework for Deep Reinforcement Learning for Operation Research.
We analyze the end-to-end autoregressive models for vehicle routing problems and show that these models can benefit from the recent advances in reinforcement learning.
arXiv Detail & Related papers (2023-03-23T09:07:30Z)
- Modular Deep Learning [120.36599591042908]
Transfer learning has recently become the dominant paradigm of machine learning.
It remains unclear how to develop models that specialise towards multiple tasks without incurring negative interference.
Modular deep learning has emerged as a promising solution to these challenges.
arXiv Detail & Related papers (2023-02-22T18:11:25Z)
- GLUECons: A Generic Benchmark for Learning Under Constraints [102.78051169725455]
In this work, we create a benchmark that is a collection of nine tasks in the domains of natural language processing and computer vision.
We model external knowledge as constraints, specify the sources of the constraints for each task, and implement various models that use these constraints.
arXiv Detail & Related papers (2023-02-16T16:45:36Z)
- POPNASv3: a Pareto-Optimal Neural Architecture Search Solution for Image and Time Series Classification [8.190723030003804]
This article presents the third version of a sequential model-based NAS algorithm targeting different hardware environments and multiple classification tasks.
Our method is able to find competitive architectures within large search spaces, while keeping a flexible structure and data processing pipeline to adapt to different tasks.
The experiments performed on images and time series classification datasets provide evidence that POPNASv3 can explore a large set of assorted operators and converge to optimal architectures suited for the type of data provided under different scenarios.
arXiv Detail & Related papers (2022-12-13T17:14:14Z)
- Pareto-aware Neural Architecture Generation for Diverse Computational Budgets [94.27982238384847]
Existing methods often perform an independent architecture search process for each target budget.
We propose a Pareto-aware Neural Architecture Generator (PNAG) that only needs to be trained once and dynamically produces the optimal architecture for any given budget via inference.
Such a joint search algorithm not only greatly reduces the overall search cost but also improves the results.
arXiv Detail & Related papers (2022-10-14T08:30:59Z)
- Policy Architectures for Compositional Generalization in Control [71.61675703776628]
We introduce a framework for modeling entity-based compositional structure in tasks.
Our policies are flexible and can be trained end-to-end without requiring any action primitives.
arXiv Detail & Related papers (2022-03-10T06:44:24Z)
- Differentiable Architecture Pruning for Transfer Learning [6.935731409563879]
We propose a gradient-based approach for extracting sub-architectures from a given large model.
Our architecture-pruning scheme produces transferable new structures that can be successfully retrained to solve different tasks.
We provide theoretical convergence guarantees and validate the proposed transfer-learning strategy on real data.
arXiv Detail & Related papers (2021-07-07T17:44:59Z)
- Redefining Neural Architecture Search of Heterogeneous Multi-Network Models by Characterizing Variation Operators and Model Components [71.03032589756434]
We investigate the effect of different variation operators in a complex domain, that of multi-network heterogeneous neural models.
We characterize both the variation operators, according to their effect on the complexity and performance of the model, and the models themselves, relying on diverse metrics that estimate the quality of their different parts.
arXiv Detail & Related papers (2021-06-16T17:12:26Z)
- Revealing the Invisible with Model and Data Shrinking for Composite-database Micro-expression Recognition [49.463864096615254]
We analyze the influence of learning complexity, including the input complexity and model complexity.
We propose a recurrent convolutional network (RCN) that explores a shallower architecture and lower-resolution input data.
We develop three parameter-free modules to integrate with RCN without increasing any learnable parameters.
arXiv Detail & Related papers (2020-06-17T06:19:24Z)