Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity
- URL: http://arxiv.org/abs/2506.17155v2
- Date: Thu, 26 Jun 2025 21:55:13 GMT
- Title: Sparse-Reg: Improving Sample Complexity in Offline Reinforcement Learning using Sparsity
- Authors: Samin Yeasar Arnob, Scott Fujimoto, Doina Precup
- Abstract summary: "Sparse-Reg" is a regularization technique based on sparsity to mitigate overfitting in offline reinforcement learning. We show that offline RL algorithms can overfit on small datasets, resulting in poor performance.
- Score: 40.998188469865184
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate the use of small datasets in the context of offline reinforcement learning (RL). While many common offline RL benchmarks employ datasets with over a million data points, many offline RL applications rely on considerably smaller datasets. We show that offline RL algorithms can overfit on small datasets, resulting in poor performance. To address this challenge, we introduce "Sparse-Reg": a regularization technique based on sparsity to mitigate overfitting in offline reinforcement learning, enabling effective learning in limited data settings and outperforming state-of-the-art baselines in continuous control.
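The abstract describes Sparse-Reg only as a sparsity-based regularizer and does not spell out the exact mechanism, so the following is a minimal illustrative sketch rather than the authors' implementation: it adds an L1 (sparsity-inducing) penalty to the actor and critic losses of a generic offline actor-critic update. The `l1_penalty` helper, the loss functions, and the `sparse_coef` coefficient are assumed names for illustration.

```python
# Hedged sketch: sparsity regularization for an offline actor-critic learner.
# This is NOT the paper's code; the penalty form and coefficient are assumptions.
import torch
import torch.nn as nn

def l1_penalty(module: nn.Module) -> torch.Tensor:
    """Sum of absolute parameter values; pushes many weights toward zero."""
    return sum(p.abs().sum() for p in module.parameters())

def critic_loss_with_sparsity(critic, states, actions, target_q, sparse_coef=1e-4):
    # Standard TD regression term plus the sparsity penalty on critic weights.
    td_loss = nn.functional.mse_loss(critic(states, actions), target_q)
    return td_loss + sparse_coef * l1_penalty(critic)

def actor_loss_with_sparsity(actor, critic, states, sparse_coef=1e-4):
    # Deterministic policy-improvement term plus the sparsity penalty on actor weights.
    q_value = critic(states, actor(states)).mean()
    return -q_value + sparse_coef * l1_penalty(actor)
```

The intuition matching the abstract is that shrinking the networks' effective capacity limits how much the learner can overfit a small offline dataset.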
Related papers
- Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset [29.573555134322543]
Offline reinforcement learning (RL) allows agents to learn from pre-collected datasets without further interaction with the environment. A key, yet underexplored, challenge in offline RL is selecting an optimal subset of the offline dataset. We introduce ReDOR, a method that frames dataset selection as a gradient approximation optimization problem.
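The ReDOR summary states only that dataset selection is framed as a gradient-approximation optimization problem; a generic greedy instantiation of that idea (not necessarily ReDOR's actual algorithm) is sketched below, where each candidate is represented by a per-sample gradient vector and the subset is grown to track the full-dataset gradient.

```python
# Generic sketch of "dataset selection as gradient approximation"
# (illustrative only; not ReDOR's actual algorithm).
import numpy as np

def greedy_gradient_subset(per_sample_grads: np.ndarray, k: int) -> list:
    """Greedily pick k rows whose running mean best tracks the full-data mean gradient."""
    target = per_sample_grads.mean(axis=0)       # full-dataset gradient
    chosen, running_sum = [], np.zeros_like(target)
    for _ in range(k):
        residual = target - running_sum / max(len(chosen), 1)
        scores = per_sample_grads @ residual     # alignment with what is still missing
        if chosen:
            scores[np.array(chosen)] = -np.inf   # do not pick the same sample twice
        idx = int(np.argmax(scores))
        chosen.append(idx)
        running_sum += per_sample_grads[idx]
    return chosen
```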
arXiv Detail & Related papers (2025-02-26T09:08:47Z)
- Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets [53.8218145723718]
Offline policy learning aims to learn decision-making policies from existing datasets of trajectories without collecting additional data.
We argue that when a dataset is dominated by suboptimal trajectories, state-of-the-art offline RL algorithms do not substantially improve over the average return of trajectories in the dataset.
We present a realization of the sampling strategy and an algorithm that can be used as a plug-and-play module in standard offline RL algorithms.
arXiv Detail & Related papers (2023-10-06T17:58:14Z)
- Look Beneath the Surface: Exploiting Fundamental Symmetry for Sample-Efficient Offline RL [29.885978495034703]
Offline reinforcement learning (RL) offers an appealing approach to real-world tasks by learning policies from pre-collected datasets.
However, the performance of existing offline RL algorithms heavily depends on the scale and state-action space coverage of datasets.
We provide a new insight that leveraging the fundamental symmetry of system dynamics can substantially enhance offline RL performance under small datasets.
arXiv Detail & Related papers (2023-06-07T07:51:05Z)
- Efficient Online Reinforcement Learning with Offline Data [78.92501185886569]
We show that we can simply apply existing off-policy methods to leverage offline data when learning online.
We extensively ablate these design choices, demonstrating the key factors that most affect performance.
We see that correct application of these simple recommendations can provide a $2.5\times$ improvement over existing approaches.
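As a concrete reading of "leverage offline data when learning online", one common design choice is to draw each update batch partly from the offline dataset and partly from the online replay buffer; the sketch below is an assumed illustration, not the paper's specific recipe.

```python
# Hedged sketch: mix offline data into online off-policy updates by sampling
# half of every training batch from each source (the 50/50 split is an assumption).
import random

def mixed_batch(offline_data: list, online_buffer: list, batch_size: int = 256) -> list:
    half = batch_size // 2
    batch = random.sample(offline_data, min(half, len(offline_data)))
    batch += random.sample(online_buffer, min(batch_size - len(batch), len(online_buffer)))
    return batch
```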
arXiv Detail & Related papers (2023-02-06T17:30:22Z)
- Boosting Offline Reinforcement Learning via Data Rebalancing [104.3767045977716]
Offline reinforcement learning (RL) is challenged by the distributional shift between learning policies and datasets.
We propose a simple yet effective method to boost offline RL algorithms based on the observation that resampling a dataset keeps the distribution support unchanged.
We dub our method ReD (Return-based Data Rebalance), which can be implemented with less than 10 lines of code change and adds negligible running time.
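The entry describes ReD as return-based resampling that leaves the dataset's support unchanged; the exact weighting is not given in the summary, so the sketch below uses a simple normalized-return weight as an assumption.

```python
# Hedged sketch of return-based data rebalancing: resample episode indices with
# probability increasing in episode return (the precise weighting is assumed).
import numpy as np

def rebalance_by_return(episode_returns: np.ndarray, n_samples: int) -> np.ndarray:
    rng = np.random.default_rng()
    shifted = episode_returns - episode_returns.min() + 1e-8  # non-negative weights
    probs = shifted / shifted.sum()                           # higher return -> more weight
    # Sampling with replacement keeps the support of the original dataset.
    return rng.choice(len(episode_returns), size=n_samples, p=probs)
```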
arXiv Detail & Related papers (2022-10-17T16:34:01Z)
- Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning [147.61075994259807]
We propose Exploratory data for Offline RL (ExORL), a data-centric approach to offline RL.
ExORL first generates data with unsupervised reward-free exploration, then relabels this data with a downstream reward before training a policy with offline RL.
We find that exploratory data allows vanilla off-policy RL algorithms, without any offline-specific modifications, to outperform or match state-of-the-art offline RL algorithms on downstream tasks.
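The relabeling step in this pipeline is straightforward to sketch: reward-free exploration transitions are annotated with a downstream task reward and then handed to any offline RL algorithm. The `reward_fn` below is an assumed task-specific function, not something defined by ExORL itself.

```python
# Sketch of ExORL-style reward relabeling (reward_fn is an assumed task reward).
def relabel(transitions, reward_fn):
    """transitions: iterable of (state, action, next_state); returns SARS' tuples."""
    return [(s, a, reward_fn(s, a, s_next), s_next) for (s, a, s_next) in transitions]
```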
arXiv Detail & Related papers (2022-01-31T18:39:27Z)
- Critic Regularized Regression [70.8487887738354]
We propose a novel offline RL algorithm to learn policies from data using a form of critic-regularized regression (CRR).
We find that CRR performs surprisingly well and scales to tasks with high-dimensional state and action spaces.
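Critic-regularized regression is usually written as behavior cloning on dataset actions weighted by a transform of the critic's advantage; the snippet below shows the commonly cited exponential-advantage form as a sketch, with shapes and the advantage baseline simplified.

```python
# Sketch of a CRR-style policy loss: advantage-weighted log-likelihood of
# dataset actions (exponential variant; the binary variant uses (A > 0) weights).
import torch

def crr_policy_loss(log_prob: torch.Tensor, advantage: torch.Tensor,
                    beta: float = 1.0, max_weight: float = 20.0) -> torch.Tensor:
    """log_prob: log pi(a|s) for dataset actions; advantage: Q(s,a) minus a state baseline."""
    weight = torch.clamp(torch.exp(advantage / beta), max=max_weight)
    return -(weight.detach() * log_prob).mean()
```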
arXiv Detail & Related papers (2020-06-26T17:50:26Z)
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.