adaPARL: Adaptive Privacy-Aware Reinforcement Learning for
Sequential-Decision Making Human-in-the-Loop Systems
- URL: http://arxiv.org/abs/2303.04257v1
- Date: Tue, 7 Mar 2023 21:55:22 GMT
- Title: adaPARL: Adaptive Privacy-Aware Reinforcement Learning for
Sequential-Decision Making Human-in-the-Loop Systems
- Authors: Mojtaba Taherisadr, Stelios Andrew Stavroulakis, Salma Elmalaki
- Abstract summary: Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications.
We propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems.
adaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference.
- Score: 0.5414308305392761
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) presents numerous benefits compared to rule-based
approaches in various applications. Privacy concerns have grown with the
widespread use of RL trained with privacy-sensitive data in IoT devices,
especially for human-in-the-loop systems. On the one hand, RL methods enhance
the user experience by trying to adapt to the highly dynamic nature of humans.
On the other hand, trained policies can leak the user's private information.
Recent attention has been drawn to designing privacy-aware RL algorithms while
maintaining an acceptable system utility. A central challenge in designing
privacy-aware RL, especially for human-in-the-loop systems, is that humans have
intrinsic variability and their preferences and behavior evolve. The effect of
one privacy leak mitigation can be different for the same human or across
different humans over time. Hence, we cannot design one fixed model for
privacy-aware RL that fits all. To that end, we propose adaPARL, an adaptive
approach for privacy-aware RL, especially for human-in-the-loop IoT systems.
adaPARL provides a personalized privacy-utility trade-off depending on human
behavior and preference. We validate the proposed adaPARL on two IoT
applications, namely (i) Human-in-the-Loop Smart Home and (ii)
Human-in-the-Loop Virtual Reality (VR) Smart Classroom. Results obtained on
these two applications validate the generality of adaPARL and its ability to
provide a personalized privacy-utility trade-off. On average, for the first
application, adaPARL improves the utility by $57\%$ over the baseline and by
$43\%$ over randomization. adaPARL also reduces the privacy leak by $23\%$ on
average. For the second application, adaPARL decreases the privacy leak to
$44\%$ before the utility drops by $15\%$.
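The abstract leaves the adaptation mechanism at a high level. The sketch below is a minimal, hypothetical illustration of a personalized privacy-utility trade-off, not the authors' code: a per-user bandit over candidate randomization strengths, where lambda_user (an assumed knob, not from the paper) encodes how strongly that user weighs privacy against utility.

```python
import random

# Hypothetical sketch: treat the randomization strength as a tunable knob and
# adapt it per user with an epsilon-greedy bandit over the combined reward
#     reward = utility - lambda_user * privacy_leak,
# where lambda_user is an assumed per-user privacy weight.

NOISE_SCALES = [0.0, 0.5, 1.0, 2.0]  # candidate randomization strengths (illustrative)

class AdaptivePrivacyKnob:
    def __init__(self, lambda_user, explore=0.1):
        self.lambda_user = lambda_user                 # per-user privacy weight (assumption)
        self.explore = explore                         # exploration rate
        self.value = {s: 0.0 for s in NOISE_SCALES}    # running reward estimates
        self.count = {s: 0 for s in NOISE_SCALES}

    def choose(self):
        """Pick a noise scale: mostly the best estimate, sometimes a random probe."""
        if random.random() < self.explore:
            return random.choice(NOISE_SCALES)
        return max(NOISE_SCALES, key=lambda s: self.value[s])

    def update(self, scale, utility, leak):
        """Fold the observed utility and estimated leak into the running estimate."""
        reward = utility - self.lambda_user * leak
        self.count[scale] += 1
        self.value[scale] += (reward - self.value[scale]) / self.count[scale]
```

Calling choose() before each episode and update() with the observed utility and estimated leak lets the knob drift toward each user's preferred operating point as behavior evolves.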
Related papers
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on the data and allows for defining non-sensitive spatio-temporal regions without DP application, or for combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
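A minimal sketch of the selective-noising idea above, assuming sensitive regions arrive as a boolean mask (how the mask is produced is outside this sketch):

```python
import numpy as np

# Illustrative only: apply Laplace noise solely where a boolean mask marks a
# region as sensitive; non-sensitive regions pass through untouched.

def masked_laplace(frame, mask, sensitivity=1.0, epsilon=1.0):
    noise = np.random.laplace(0.0, sensitivity / epsilon, size=frame.shape)
    return np.where(mask, frame + noise, frame)  # noise only where mask is True
```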
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
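One common way to realize user-level DP, sketched here with illustrative names: bound each user's total contribution, then add noise calibrated to that per-user sensitivity.

```python
import numpy as np

# Toy sketch of user-level DP aggregation: clip each user's aggregate vector to
# a fixed L2 norm, sum, add Gaussian noise scaled to that per-user sensitivity,
# and average, so the release changes little with or without any single user.

def user_level_dp_mean(per_user_vectors, clip_norm=1.0, noise_multiplier=1.0):
    clipped = []
    for v in per_user_vectors:                         # one aggregate vector per user
        norm = np.linalg.norm(v)
        clipped.append(v * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_user_vectors)
```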
- PAPER-HILT: Personalized and Adaptive Privacy-Aware Early-Exit for Reinforcement Learning in Human-in-the-Loop Systems [0.6282068591820944]
Reinforcement Learning (RL) has increasingly become a preferred method over traditional rule-based systems in diverse human-in-the-loop (HITL) applications.
This paper develops an adaptive RL strategy that exploits an early-exit approach designed explicitly for privacy preservation in HITL environments.
arXiv Detail & Related papers (2024-03-09T10:24:12Z)
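A speculative reading of the early-exit idea (the paper's actual design may differ): guard the personalized policy with a per-user privacy-risk threshold and fall back to a generic action once it is exceeded.

```python
# Speculative sketch of an early-exit privacy guard: before acting on the
# personalized policy, check an estimated cumulative privacy risk; past a
# per-user threshold, "exit early" to a default action that reveals nothing.

def act_with_early_exit(state, personalized_policy, default_action,
                        risk_estimate, risk_threshold):
    if risk_estimate >= risk_threshold:   # early exit: stop leaking
        return default_action
    return personalized_policy(state)
```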
- Contrastive Preference Learning: Learning from Human Feedback without RL [71.77024922527642]
We introduce Contrastive Preference Learning (CPL), an algorithm for learning optimal policies from preferences without learning reward functions.
CPL is fully off-policy, uses only a simple contrastive objective, and can be applied to arbitrary MDPs.
arXiv Detail & Related papers (2023-10-20T16:37:56Z)
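A rough sketch of a CPL-style objective, simplified to equal-length segments: score each segment by the discounted sum of the policy's log-probabilities and apply a logistic loss so the human-preferred segment wins, with no reward model in the loop.

```python
import torch

# Simplified CPL-style contrastive loss: the preferred segment should score
# higher, where a segment's score is alpha times the discounted sum of the
# policy's per-step log-probabilities along it.

def cpl_loss(logp_preferred, logp_rejected, alpha=0.1, gamma=0.99):
    # logp_*: per-step log pi(a_t | s_t) along a segment, shape (T,)
    t = torch.arange(logp_preferred.shape[0], dtype=logp_preferred.dtype)
    discounts = gamma ** t
    score_pos = alpha * (discounts * logp_preferred).sum()
    score_neg = alpha * (discounts * logp_rejected).sum()
    # -log sigmoid(score_pos - score_neg), written stably via softplus
    return torch.nn.functional.softplus(score_neg - score_pos)
```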
- Privacy Preserving Large Language Models: ChatGPT Case Study Based Vision and Framework [6.828884629694705]
This article proposes a conceptual model called PrivChatGPT, a privacy-generative model for LLMs.
PrivChatGPT consists of two main components: preserving user privacy during data curation/pre-processing together with preserving private context, and a private training process for large-scale data.
arXiv Detail & Related papers (2023-10-19T06:55:13Z)
- TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z)
- On Differential Privacy for Federated Learning in Wireless Systems with Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference.
We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap.
Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z)
- Production of Categorical Data Verifying Differential Privacy: Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z)
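Generalized randomized response is one standard LDP sanitizer for categorical data, letting users sanitize locally before anything leaves their device; a compact sketch:

```python
import math
import random

# Generalized randomized response (GRR): report the true category with
# probability p = e^eps / (e^eps + k - 1), otherwise a uniformly random other
# category (assumes the domain has at least two categories).

def grr_sanitize(value, domain, epsilon):
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p:
        return value
    return random.choice([v for v in domain if v != value])
```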
- Privacy-Preserving Reinforcement Learning Beyond Expectation [6.495883501989546]
Cyber and cyber-physical systems equipped with machine learning algorithms, such as autonomous cars, share environments with humans.
It is important to align system (or agent) behaviors with the preferences of one or more human users.
We consider the case when an agent has to learn behaviors in an unknown environment.
arXiv Detail & Related papers (2022-03-18T21:28:29Z)
- Adaptive Control of Differentially Private Linear Quadratic Systems [5.414308305392762]
We study the problem of regret in reinforcement learning (RL) under differential privacy constraints.
We develop the first private RL algorithm, PRL, which is able to attain a sub-linear regret while guaranteeing privacy protection.
arXiv Detail & Related papers (2021-08-26T03:06:22Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
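Not the paper's exact mechanism, but one typical ingredient of private optimism-based RL is releasing noisy visit counts, so downstream model estimates and exploration bonuses never touch the raw data; a hedged sketch:

```python
import numpy as np

# Illustrative: Laplace-perturb state-action visit counts and clamp at a floor
# of 1 so exploration bonuses of the form c / sqrt(n) stay well defined.

def privatize_counts(visit_counts, epsilon):
    noisy = visit_counts + np.random.laplace(0.0, 1.0 / epsilon,
                                             size=visit_counts.shape)
    return np.maximum(noisy, 1.0)
```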