ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2309.03081v1
- Date: Wed, 6 Sep 2023 15:28:43 GMT
- Title: ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning
- Authors: Linkang Du, Min Chen, Mingyang Sun, Shouling Ji, Peng Cheng, Jiming
Chen, Zhikun Zhang
- Abstract summary: Offline deep reinforcement learning (offline DRL) is frequently used to train models on pre-collected datasets.
We propose ORL-AUDITOR, which is the first trajectory-level dataset auditing mechanism for offline DRL scenarios.
Our experiments on multiple offline DRL models and tasks reveal the efficacy of ORL-AUDITOR, with auditing accuracy over 95% and false positive rates less than 2.88%.
- Score: 42.87245000172943
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Data is a critical asset in AI, as high-quality datasets can significantly
improve the performance of machine learning models. In safety-critical domains
such as autonomous vehicles, offline deep reinforcement learning (offline DRL)
is frequently used to train models on pre-collected datasets, rather than by
interacting with the real-world environment as in online DRL. To support the
development of these models, many institutions make datasets publicly available
under open-source licenses, but these datasets are at
risk of potential misuse or infringement. Injecting watermarks into a dataset
can protect its intellectual property, but this approach cannot handle datasets
that have already been published and are infeasible to alter afterward. Other
existing solutions, such as dataset inference and membership inference, do not
work well in the offline DRL scenario because of the diverse behavior
characteristics of DRL models and the constraints of the offline setting. In
this paper,
we advocate a new paradigm by leveraging the fact that cumulative rewards can
act as a unique identifier that distinguishes DRL models trained on a specific
dataset. To this end, we propose ORL-AUDITOR, which is the first
trajectory-level dataset auditing mechanism for offline RL scenarios. Our
experiments on multiple offline DRL models and tasks reveal the efficacy of
ORL-AUDITOR, with auditing accuracy over 95% and false positive rates less than
2.88%. We also provide valuable insights into the practical implementation of
ORL-AUDITOR by studying various parameter settings. Furthermore, we demonstrate
the auditing capability of ORL-AUDITOR on open-source datasets from Google and
DeepMind, highlighting its effectiveness in auditing published datasets.
ORL-AUDITOR is open-sourced at https://github.com/link-zju/ORL-Auditor.
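The core idea can be illustrated with a minimal sketch: cumulative rewards obtained on the audited trajectories serve as a fingerprint, and a suspect model is flagged as trained on the dataset if its fingerprint falls inside the distribution produced by shadow models known to be trained on that dataset. The helper names and the crude per-trajectory z-score test below are hypothetical simplifications, not ORL-AUDITOR's actual procedure; see the repository above for the authors' implementation.

```python
# Minimal illustrative sketch of cumulative-reward-based dataset auditing.
# NOTE: all helpers, the z-score test, and the threshold are hypothetical
# simplifications of the idea, not ORL-AUDITOR's real auditing procedure.
import numpy as np

def discounted_return(rewards, gamma=0.99):
    """Discounted cumulative reward of a single trajectory's reward sequence."""
    return float(sum(gamma ** t * r for t, r in enumerate(rewards)))

def fingerprint(rollouts, gamma=0.99):
    """Per-trajectory cumulative rewards a model obtains on the audited trajectories.
    `rollouts` is a list of reward sequences, one per audited trajectory."""
    return np.array([discounted_return(r, gamma) for r in rollouts])

def audit(suspect_rollouts, shadow_rollouts_list, z_threshold=3.0):
    """Return True if the suspect model's cumulative-reward fingerprint is
    statistically consistent with shadow models trained on the audited dataset."""
    shadow = np.stack([fingerprint(r) for r in shadow_rollouts_list])
    mu, sigma = shadow.mean(axis=0), shadow.std(axis=0) + 1e-8
    z = np.abs((fingerprint(suspect_rollouts) - mu) / sigma)
    return bool(np.mean(z) < z_threshold)
```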
Related papers
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- Utilizing Explainability Techniques for Reinforcement Learning Model Assurance [42.302469854610315]
Explainable Reinforcement Learning (XRL) can provide transparency into the decision-making process of a Deep Reinforcement Learning (DRL) model.
This paper introduces the ARLIN (Assured RL Model Interrogation) Toolkit, an open-source Python library that identifies potential vulnerabilities and critical points within trained DRL models.
arXiv Detail & Related papers (2023-11-27T14:02:47Z)
- Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories [37.14064734165109]
Natural agents can learn from multiple data sources that differ in size, quality, and types of measurements.
We study this in the context of offline reinforcement learning (RL) by introducing a new, practically motivated semi-supervised setting.
arXiv Detail & Related papers (2022-10-12T18:22:23Z)
- Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning [147.61075994259807]
We propose Exploratory data for Offline RL (ExORL), a data-centric approach to offline RL.
ExORL first generates data with unsupervised reward-free exploration, then relabels this data with a downstream reward before training a policy with offline RL.
We find that exploratory data allows vanilla off-policy RL algorithms, without any offline-specific modifications, to outperform or match state-of-the-art offline RL algorithms on downstream tasks; a minimal sketch of the ExORL pipeline appears after this list.
arXiv Detail & Related papers (2022-01-31T18:39:27Z)
- Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions.
We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
- RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning [108.9599280270704]
We propose a benchmark called RL Unplugged to evaluate and compare offline RL methods.
RL Unplugged includes data from a diverse range of domains including games and simulated motor control problems.
We will release data for all our tasks and open-source all algorithms presented in this paper.
arXiv Detail & Related papers (2020-06-24T17:14:51Z)
- D4RL: Datasets for Deep Data-Driven Reinforcement Learning [119.49182500071288]
We introduce benchmarks specifically designed for the offline setting, guided by key properties of datasets relevant to real-world applications of offline RL.
By moving beyond simple benchmark tasks and data collected by partially-trained RL agents, we reveal important and unappreciated deficiencies of existing algorithms.
arXiv Detail & Related papers (2020-04-15T17:18:19Z)
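Referring back to the ExORL entry above: the pipeline it describes (reward-free exploration, reward relabeling, then offline RL) can be summarized with the minimal sketch below. The parameters `explore_without_reward`, `downstream_reward`, and `train_offline_rl` are hypothetical placeholders for whichever exploration method, task reward, and offline RL algorithm are plugged in; this is not ExORL's actual API.

```python
# Hypothetical sketch of the ExORL-style data-centric pipeline:
# 1) collect reward-free exploratory transitions,
# 2) relabel them with the downstream task reward,
# 3) train any offline (or vanilla off-policy) RL algorithm on the result.
def exorl_pipeline(env, explore_without_reward, downstream_reward, train_offline_rl):
    # Step 1: unsupervised exploration yields (state, action, next_state) tuples.
    transitions = explore_without_reward(env)
    # Step 2: attach the task's reward to every transition.
    dataset = [(s, a, downstream_reward(s, a, s_next), s_next)
               for (s, a, s_next) in transitions]
    # Step 3: train a policy on the relabeled dataset.
    return train_offline_rl(dataset)
```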