Related papers: EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data

URL: http://arxiv.org/abs/2403.00564v2
Date: Thu, 12 Sep 2024 08:37:27 GMT
Title: EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Authors: Shengjie Wang, Shaohuai Liu, Weirui Ye, Jiacheng You, Yang Gao
Abstract summary: We introduce EfficientZero V2, a framework designed for sample-efficient Reinforcement Learning (RL) algorithms. With a series of improvements, EfficientZero V2 outperforms the current state-of-the-art (SOTA) by a significant margin in diverse tasks. EfficientZero V2 exhibits a notable advancement over the prevailing general algorithm, DreamerV3, achieving superior outcomes in 50 of 66 evaluated tasks.
Score: 22.621203162457018
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Sample efficiency remains a crucial challenge in applying Reinforcement Learning (RL) to real-world tasks. While recent algorithms have made significant strides in improving sample efficiency, none have achieved consistently superior performance across diverse domains. In this paper, we introduce EfficientZero V2, a general framework designed for sample-efficient RL algorithms. We have expanded the performance of EfficientZero to multiple domains, encompassing both continuous and discrete actions, as well as visual and low-dimensional inputs. With a series of improvements we propose, EfficientZero V2 outperforms the current state-of-the-art (SOTA) by a significant margin in diverse tasks under the limited data setting. EfficientZero V2 exhibits a notable advancement over the prevailing general algorithm, DreamerV3, achieving superior outcomes in 50 of 66 evaluated tasks across diverse benchmarks, such as Atari 100k, Proprio Control, and Vision Control.

Related papers

OminiControl2: Efficient Conditioning for Diffusion Transformers [68.3243031301164]
We present OminiControl2, an efficient framework that achieves efficient image-conditional image generation. OminiControl2 introduces two key innovations: (1) a dynamic compression strategy that streamlines conditional inputs by preserving only the most semantically relevant tokens during generation, and (2) a conditional feature reuse mechanism that computes condition token features only once and reuses them across denoising steps.
arXiv Detail & Related papers (2025-03-11T10:50:14Z)
FLARES: Fast and Accurate LiDAR Multi-Range Semantic Segmentation [52.89847760590189]
3D scene understanding is a critical yet challenging task in autonomous driving. Recent methods leverage the range-view representation to improve processing efficiency. We re-design the workflow for range-view-based LiDAR semantic segmentation.
arXiv Detail & Related papers (2025-02-13T12:39:26Z)
Haste Makes Waste: A Simple Approach for Scaling Graph Neural Networks [37.41604955004456]
Graph neural networks (GNNs) have demonstrated remarkable success in graph representation learning. Various sampling approaches have been proposed to scale GNNs to applications with large-scale graphs.
arXiv Detail & Related papers (2024-10-07T18:29:02Z)
UniZero: Generalized and Efficient Planning with Scalable Latent World Models [29.648382211926364]
UniZero is a novel approach that employs a modular transformer-based world model to effectively learn a shared latent space. We show that UniZero significantly outperforms existing baselines in benchmarks that require long-term memory. In standard single-task RL settings, such as Atari and DMControl, UniZero matches or even surpasses the performance of current state-of-the-art methods.
arXiv Detail & Related papers (2024-06-15T15:24:15Z)
YOLOv10: Real-Time End-to-End Object Detection [68.28699631793967]
YOLOs have emerged as the predominant paradigm in the field of real-time object detection. The reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs. We introduce the holistic efficiency-accuracy driven model design strategy for YOLOs.
arXiv Detail & Related papers (2024-05-23T11:44:29Z)
Efficient Modulation for Vision Networks [122.1051910402034]
We propose efficient modulation, a novel design for efficient vision networks. We demonstrate that the modulation mechanism is particularly well suited for efficient networks. Our network can accomplish better trade-offs between accuracy and efficiency.
arXiv Detail & Related papers (2024-03-29T03:48:35Z)
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications [108.44482683870888]
We introduce Deformable Convolution v4 (DCNv4), a highly efficient and effective operator designed for a broad spectrum of vision applications. DCNv4 addresses the limitations of its predecessor, DCNv3, with two key enhancements. It demonstrates exceptional performance across various tasks, including image classification, instance and semantic segmentation, and notably, image generation.
arXiv Detail & Related papers (2024-01-11T14:53:24Z)
MIND: Multi-Task Incremental Network Distillation [45.74830585715129]
In this study, we present MIND, a parameter isolation method that aims to significantly enhance the performance of replay-free solutions. Our results showcase the superior performance of MIND indicating its potential for addressing the challenges posed by Class-incremental and Domain-Incremental learning.
arXiv Detail & Related papers (2023-12-05T17:46:52Z)
Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms. It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL. This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z)
Planning for Sample Efficient Imitation Learning [52.44953015011569]
Current imitation algorithms struggle to achieve high performance and high in-environment sample efficiency simultaneously. We propose EfficientImitate, a planning-based imitation learning method that can achieve high in-environment sample efficiency and performance simultaneously. Experimental results show that EI achieves state-of-the-art results in performance and sample efficiency.
arXiv Detail & Related papers (2022-10-18T05:19:26Z)
An Efficiency Study for SPLADE Models [5.725475501578801]
In this paper, we focus on improving the efficiency of the SPLADE model. We propose several techniques including L1 regularization for queries, a separation of document/query encoders, a FLOPS-regularized middle-training, and the use of faster query encoders.
arXiv Detail & Related papers (2022-07-08T11:42:05Z)
Mastering Atari Games with Limited Data [73.6189496825209]
We propose a sample efficient model-based visual RL algorithm built on MuZero, which we name EfficientZero. Our method achieves 190.4% mean human performance on the Atari 100k benchmark with only two hours of real-time game experience. This is the first time an algorithm achieves super-human performance on Atari games with such little data.
arXiv Detail & Related papers (2021-10-30T09:13:39Z)
Measuring Progress in Deep Reinforcement Learning Sample Efficiency [0.0]
Current benchmarks allow for the cheap and easy generation of large amounts of samples. As simulating real world processes is often prohibitively hard and collecting real world experience is costly, sample efficiency is an important indicator for economically relevant applications of DRL. We investigate progress in sample efficiency on Atari games and continuous control tasks by comparing the number of samples that a variety of algorithms need to reach a given performance level.
arXiv Detail & Related papers (2021-02-09T15:27:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.