Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and
Benchmarking
- URL: http://arxiv.org/abs/2011.07537v2
- Date: Wed, 19 May 2021 12:28:33 GMT
- Title: Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and
Benchmarking
- Authors: Fabio Pardo
- Abstract summary: Deep reinforcement learning has been one of the fastest growing fields of machine learning in recent years, and numerous libraries have been open-sourced to support research.
This paper introduces Tonic, a Python library that allows researchers to quickly implement new ideas and measure their importance.
- Score: 4.721069729610892
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning has been one of the fastest growing fields of
machine learning in recent years, and numerous libraries have been open-sourced
to support research. However, most codebases have a steep learning curve or
limited flexibility, which does not satisfy the need for fast prototyping in
fundamental research. This paper introduces Tonic, a Python library that allows
researchers to quickly implement new ideas and measure their importance by
providing: 1) general-purpose configurable modules; 2) several baseline agents
(A2C, TRPO, PPO, MPO, DDPG, D4PG, TD3, and SAC) built with these modules;
3) support for TensorFlow 2 and PyTorch; 4) support for continuous-control
environments from OpenAI Gym, DeepMind Control Suite, and PyBullet; 5) scripts to
run experiments in a reproducible way, plot results, and play with trained agents;
and 6) a benchmark of the provided agents on 70 continuous-control tasks. Evaluation
is performed under fair conditions, with identical seeds and identical training and
testing loops, while sharing general improvements such as non-terminal timeouts and
observation normalization. Finally, to demonstrate how Tonic simplifies
experimentation, a novel agent called TD4 is implemented and evaluated.
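To make the workflow above concrete, here is a minimal sketch of how an experiment might be launched and then inspected with Tonic's bundled scripts. It follows the string-based configuration pattern shown in the Tonic repository, but the exact names and signatures used here (a callable `tonic.train` with `header`, `agent`, `environment`, and `seed` arguments, the `tonic.plot` and `tonic.play` invocations, and the run directory layout) are assumptions reproduced from memory, not verified API.

```python
# A minimal sketch of a Tonic experiment, assuming the string-based
# configuration style described in the paper. Names and signatures follow
# the patterns in the Tonic repository but may differ from the installed
# version; treat them as assumptions.
import tonic

# Agents, environments, and other modules are specified as strings and
# built inside the training process, which makes an experiment easy to
# record and rerun with identical settings.
tonic.train(
    header='import tonic.torch',                               # select the PyTorch backend (TensorFlow 2 is also supported)
    agent='tonic.torch.agents.PPO()',                          # any baseline agent: A2C, TRPO, PPO, MPO, DDPG, D4PG, TD3, SAC
    environment="tonic.environments.Gym('BipedalWalker-v3')",  # OpenAI Gym, DeepMind Control Suite, or PyBullet task
    seed=0,                                                    # fixed seeds keep comparisons fair, as in the benchmark
)

# The bundled scripts can then plot logged results and replay the trained
# agent; the command-line forms and the environment/agent/seed directory
# layout below are likewise assumptions:
#   python3 -m tonic.plot --path BipedalWalker-v3
#   python3 -m tonic.play --path BipedalWalker-v3/PPO/0
```

The same training entry point can also be driven from the command line (`python3 -m tonic.train ...`), which matches the reproducible experiment scripts mentioned in the abstract.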
Related papers
- ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning [78.42927884000673]
ExACT is an approach that combines test-time search and self-learning to build o1-like models for agentic applications.
We first introduce Reflective Monte Carlo Tree Search (R-MCTS), a novel test-time algorithm designed to enhance AI agents' ability to explore the decision space on the fly.
Next, we introduce Exploratory Learning, a novel learning strategy to teach agents to search at inference time without relying on any external search algorithms.
arXiv Detail & Related papers (2024-10-02T21:42:35Z)
- Semantic Residual Prompts for Continual Learning [21.986800282078498]
We show that our method significantly outperforms both state-of-the-art CL approaches and the zero-shot CLIP test.
Our findings hold true even for datasets with a substantial domain gap w.r.t. the pre-training knowledge of the backbone model.
arXiv Detail & Related papers (2024-03-11T16:23:38Z)
- PyPOTS: A Python Toolbox for Data Mining on Partially-Observed Time Series [0.0]
PyPOTS is an open-source Python library dedicated to data mining and analysis on partially-observed time series.
It provides easy access to diverse algorithms categorized into four tasks: imputation, classification, clustering, and forecasting.
arXiv Detail & Related papers (2023-05-30T07:57:05Z)
- SequeL: A Continual Learning Library in PyTorch and JAX [50.33956216274694]
SequeL is a library for Continual Learning that supports both PyTorch and JAX frameworks.
It provides a unified interface for a wide range of Continual Learning algorithms, including regularization-based approaches, replay-based approaches, and hybrid approaches.
We release SequeL as an open-source library, enabling researchers and developers to easily experiment and extend the library for their own purposes.
arXiv Detail & Related papers (2023-04-21T10:00:22Z)
- Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond [52.656743602538825]
Fine-tuning pre-trained code models incurs a large computational cost.
We conduct an experimental study to explore what happens to layer-wise pre-trained representations and their encoded code knowledge during fine-tuning.
We propose Telly to efficiently fine-tune pre-trained code models via layer freezing.
arXiv Detail & Related papers (2023-04-11T13:34:13Z)
- PyRelationAL: a python library for active learning research and development [1.0061110876649197]
Active learning (AL) is a sub-field of ML focused on the development of methods to iteratively and economically acquire data.
Here, we introduce PyRelationAL, an open source library for AL research.
We describe a modular toolkit based around a two-step design methodology for composing pool-based active learning strategies.
arXiv Detail & Related papers (2022-05-23T08:21:21Z)
- MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation [104.48766162008815]
We propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation.
To design a framework that can take full advantage of multi-modality, each modality provides regularized self-supervisory signals to other modalities.
Our regularized pseudo labels produce stable self-learning signals in numerous multi-modal test-time adaptation scenarios.
arXiv Detail & Related papers (2022-04-27T02:28:12Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
- A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning [60.720251418816815]
We present a large-scale study on unsupervised representation learning from videos.
Our objective encourages temporally-persistent features in the same video.
We find that encouraging long-spanned persistency can be effective even if the timespan is 60 seconds.
arXiv Detail & Related papers (2021-04-29T17:59:53Z)
- Podracer architectures for scalable Reinforcement Learning [23.369001500657028]
How to best train reinforcement learning (RL) agents at scale is still an active research area.
In this report we argue that TPUs are particularly well suited for training RL agents in a scalable, efficient and reproducible way.
arXiv Detail & Related papers (2021-04-13T15:05:35Z)
- Reinforcement Learning for Control of Valves [0.0]
This paper is a study of reinforcement learning (RL) as an optimal-control strategy for control of nonlinear valves.
It is evaluated against the PID (proportional-integral-derivative) strategy, using a unified framework.
arXiv Detail & Related papers (2020-12-29T09:01:47Z)