adaPARL: Adaptive Privacy-Aware Reinforcement Learning for
Sequential-Decision Making Human-in-the-Loop Systems
- URL: http://arxiv.org/abs/2303.04257v1
- Date: Tue, 7 Mar 2023 21:55:22 GMT
- Title: adaPARL: Adaptive Privacy-Aware Reinforcement Learning for
Sequential-Decision Making Human-in-the-Loop Systems
- Authors: Mojtaba Taherisadr, Stelios Andrew Stavroulakis, Salma Elmalaki
- Abstract summary: Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications.
We propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems.
adaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference.
- Score: 0.5414308305392761
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) presents numerous benefits compared to rule-based
approaches in various applications. Privacy concerns have grown with the
widespread use of RL trained with privacy-sensitive data in IoT devices,
especially for human-in-the-loop systems. On the one hand, RL methods enhance
the user experience by trying to adapt to the highly dynamic nature of humans.
On the other hand, trained policies can leak the user's private information.
Recent attention has been drawn to designing privacy-aware RL algorithms while
maintaining an acceptable system utility. A central challenge in designing
privacy-aware RL, especially for human-in-the-loop systems, is that humans have
intrinsic variability and their preferences and behavior evolve. The effect of
one privacy leak mitigation can be different for the same human or across
different humans over time. Hence, we cannot design one fixed model for
privacy-aware RL that fits all. To that end, we propose adaPARL, an adaptive
approach for privacy-aware RL, especially for human-in-the-loop IoT systems.
adaPARL provides a personalized privacy-utility trade-off depending on human
behavior and preference. We validate the proposed adaPARL on two IoT
applications, namely (i) Human-in-the-Loop Smart Home and (ii)
Human-in-the-Loop Virtual Reality (VR) Smart Classroom. Results obtained on
these two applications validate the generality of adaPARL and its ability to
provide a personalized privacy-utility trade-off. On average, for the first
application, adaPARL improves the utility by $57\%$ over the baseline and by
$43\%$ over randomization. adaPARL also reduces the privacy leak by $23\%$ on
average. For the second application, adaPARL decreases the privacy leak to
$44\%$ before the utility drops by $15\%$.
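The abstract leaves the adaptation mechanism at a high level. The sketch below is a minimal, hypothetical illustration of a personalized privacy-utility trade-off, not the authors' code: a per-user bandit over candidate randomization strengths, where lambda_user (an assumed knob, not from the paper) encodes how strongly that user weighs privacy against utility.

```python
import random

# Hypothetical sketch: treat the randomization strength as a tunable knob and
# adapt it per user with an epsilon-greedy bandit over the combined reward
#     reward = utility - lambda_user * privacy_leak,
# where lambda_user is an assumed per-user privacy weight.

NOISE_SCALES = [0.0, 0.5, 1.0, 2.0]  # candidate randomization strengths (illustrative)

class AdaptivePrivacyKnob:
    def __init__(self, lambda_user, explore=0.1):
        self.lambda_user = lambda_user                 # per-user privacy weight (assumption)
        self.explore = explore                         # exploration rate
        self.value = {s: 0.0 for s in NOISE_SCALES}    # running reward estimates
        self.count = {s: 0 for s in NOISE_SCALES}

    def choose(self):
        """Pick a noise scale: mostly the best estimate, sometimes a random probe."""
        if random.random() < self.explore:
            return random.choice(NOISE_SCALES)
        return max(NOISE_SCALES, key=lambda s: self.value[s])

    def update(self, scale, utility, leak):
        """Fold the observed utility and estimated leak into the running estimate."""
        reward = utility - self.lambda_user * leak
        self.count[scale] += 1
        self.value[scale] += (reward - self.value[scale]) / self.count[scale]
```

Calling choose() before each episode and update() with the observed utility and estimated leak lets the knob drift toward each user's preferred operating point as behavior evolves.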
Related papers
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on the data and allows for defining non-sensitive spatio-temporal regions without DP application, or for combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
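A minimal sketch of the selective-noising idea above, assuming sensitive regions arrive as a boolean mask (how the mask is produced is outside this sketch):

```python
import numpy as np

# Illustrative only: apply Laplace noise solely where a boolean mask marks a
# region as sensitive; non-sensitive regions pass through untouched.

def masked_laplace(frame, mask, sensitivity=1.0, epsilon=1.0):
    noise = np.random.laplace(0.0, sensitivity / epsilon, size=frame.shape)
    return np.where(mask, frame + noise, frame)  # noise only where mask is True
```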
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
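One common way to realize user-level DP, sketched here with illustrative names: bound each user's total contribution, then add noise calibrated to that per-user sensitivity.

```python
import numpy as np

# Toy sketch of user-level DP aggregation: clip each user's aggregate vector to
# a fixed L2 norm, sum, add Gaussian noise scaled to that per-user sensitivity,
# and average, so the release changes little with or without any single user.

def user_level_dp_mean(per_user_vectors, clip_norm=1.0, noise_multiplier=1.0):
    clipped = []
    for v in per_user_vectors:                         # one aggregate vector per user
        norm = np.linalg.norm(v)
        clipped.append(v * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_user_vectors)
```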
- PAPER-HILT: Personalized and Adaptive Privacy-Aware Early-Exit for Reinforcement Learning in Human-in-the-Loop Systems [0.6282068591820944]
Reinforcement Learning (RL) has increasingly become a preferred method over traditional rule-based systems in diverse human-in-the-loop (HITL) applications.
This paper develops an adaptive RL strategy that exploits an early-exit approach designed explicitly for privacy preservation in HITL environments.
arXiv Detail & Related papers (2024-03-09T10:24:12Z)
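A speculative reading of the early-exit idea (the paper's actual design may differ): guard the personalized policy with a per-user privacy-risk threshold and fall back to a generic action once it is exceeded.

```python
# Speculative sketch of an early-exit privacy guard: before acting on the
# personalized policy, check an estimated cumulative privacy risk; past a
# per-user threshold, "exit early" to a default action that reveals nothing.

def act_with_early_exit(state, personalized_policy, default_action,
                        risk_estimate, risk_threshold):
    if risk_estimate >= risk_threshold:   # early exit: stop leaking
        return default_action
    return personalized_policy(state)
```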
- Contrastive Preference Learning: Learning from Human Feedback without RL [71.77024922527642]
We introduce Contrastive Preference Learning (CPL), an algorithm for learning optimal policies from preferences without learning reward functions.
CPL is fully off-policy, uses only a simple contrastive objective, and can be applied to arbitrary MDPs.
arXiv Detail & Related papers (2023-10-20T16:37:56Z)
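A rough sketch of a CPL-style objective, simplified to equal-length segments: score each segment by the discounted sum of the policy's log-probabilities and apply a logistic loss so the human-preferred segment wins, with no reward model in the loop.

```python
import torch

# Simplified CPL-style contrastive loss: the preferred segment should score
# higher, where a segment's score is alpha times the discounted sum of the
# policy's per-step log-probabilities along it.

def cpl_loss(logp_preferred, logp_rejected, alpha=0.1, gamma=0.99):
    # logp_*: per-step log pi(a_t | s_t) along a segment, shape (T,)
    t = torch.arange(logp_preferred.shape[0], dtype=logp_preferred.dtype)
    discounts = gamma ** t
    score_pos = alpha * (discounts * logp_preferred).sum()
    score_neg = alpha * (discounts * logp_rejected).sum()
    # -log sigmoid(score_pos - score_neg), written stably via softplus
    return torch.nn.functional.softplus(score_neg - score_pos)
```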
- Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework [6.828884629694705]
This article proposes a conceptual model called PrivChatGPT, a privacy-generative model for LLMs.
PrivChatGPT consists of two main components: preserving user privacy during data curation/pre-processing together with preserving private context, and a private training process for large-scale data.
arXiv Detail & Related papers (2023-10-19T06:55:13Z)
- TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z)
- On Differential Privacy for Federated Learning in Wireless Systems with Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference.
We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap.
Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z)
- Production of Categorical Data Verifying Differential Privacy: Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z)
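Generalized randomized response is one standard LDP sanitizer for categorical data, letting users sanitize locally before anything leaves their device; a compact sketch:

```python
import math
import random

# Generalized randomized response (GRR): report the true category with
# probability p = e^eps / (e^eps + k - 1), otherwise a uniformly random other
# category (assumes the domain has at least two categories).

def grr_sanitize(value, domain, epsilon):
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])
```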
- Privacy-Preserving Reinforcement Learning Beyond Expectation [6.495883501989546]
Cyber and cyber-physical systems equipped with machine learning algorithms, such as autonomous cars, share environments with humans.
It is important to align system (or agent) behaviors with the preferences of one or more human users.
We consider the case when an agent has to learn behaviors in an unknown environment.
arXiv Detail & Related papers (2022-03-18T21:28:29Z)
- Adaptive Control of Differentially Private Linear Quadratic Systems [5.414308305392762]
We study the problem of regret in reinforcement learning (RL) under differential privacy constraints.
We develop the first private RL algorithm, PRL, which is able to attain a sub-linear regret while guaranteeing privacy protection.
arXiv Detail & Related papers (2021-08-26T03:06:22Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
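Not the paper's exact mechanism, but one typical ingredient of private optimism-based RL is releasing noisy visit counts, so downstream model estimates and exploration bonuses never touch the raw data; a hedged sketch:

```python
import numpy as np

# Illustrative: Laplace-perturb state-action visit counts and clamp at a floor
# of 1 so exploration bonuses of the form c / sqrt(n) stay well defined.

def privatize_counts(visit_counts, epsilon):
    noisy = visit_counts + np.random.laplace(0.0, 1.0 / epsilon,
                                             size=visit_counts.shape)
    return np.maximum(noisy, 1.0)
```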