Privacy-Preserving Teacher-Student Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2102.09599v1
- Date: Thu, 18 Feb 2021 20:15:09 GMT
- Title: Privacy-Preserving Teacher-Student Deep Reinforcement Learning
- Authors: Parham Gohari, Bo Chen, Bo Wu, Matthew Hale, and Ufuk Topcu
- Abstract summary: We develop a private mechanism that protects the privacy of the teacher's training dataset.
We empirically show that the algorithm improves the student's learning in terms of convergence rate and utility.
- Score: 23.934121758649052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning agents may learn complex tasks more efficiently
when they coordinate with one another. We consider a teacher-student
coordination scheme wherein an agent may ask another agent for demonstrations.
Despite the benefits of sharing demonstrations, however, potential adversaries
may obtain sensitive information belonging to the teacher by observing the
demonstrations. In particular, deep reinforcement learning algorithms are known
to be vulnerable to membership inference attacks, which accurately infer
whether particular records belong to a training dataset. Therefore, there is a need
to safeguard the teacher against such privacy threats. We fix the teacher's
policy as the context of the demonstrations, which allows the student and the
teacher to maintain different internal models, in contrast to existing
methods. We make the following two contributions. (i) We develop a
differentially private mechanism that protects the privacy of the teacher's
training dataset. (ii) We propose a proximal policy-optimization objective that
enables the student to benefit from the demonstrations despite the
perturbations of the privacy mechanism. We empirically show that the algorithm
improves the student's learning in terms of convergence rate and utility.
Specifically, compared with an agent who learns the same task on its own, we
observe that the student's policy converges faster, and the converged policy
accumulates higher rewards more robustly.
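The two contributions can be pictured with a minimal sketch. The paper does not publish its exact mechanism here, so the following assumes a standard Gaussian mechanism on the teacher's shared action preferences and a generic PPO-style clipped surrogate; the function names and parameters are illustrative, not the authors' API.

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize_demo(logits, epsilon, delta, sensitivity=1.0):
    """Gaussian mechanism (a standard DP building block): add calibrated
    noise to the teacher's per-state action preferences before sharing."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return logits + rng.normal(0.0, sigma, size=logits.shape)

def clipped_surrogate(ratio, advantage, clip=0.2):
    """PPO-style clipped objective; the student maximizes this even when
    the advantage estimate is derived from noisy demonstrations."""
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1.0 - clip, 1.0 + clip) * advantage)

# Toy usage: a 4-action demonstration perturbed at epsilon = 1.0.
demo = np.array([2.0, 0.5, -1.0, 0.1])
noisy = privatize_demo(demo, epsilon=1.0, delta=1e-5)
obj = clipped_surrogate(ratio=np.array([1.3]), advantage=np.array([0.8]))
```

The clipping keeps the student's update conservative, which is one plausible reason the scheme tolerates the privacy noise.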
Related papers
- Adaptive Teaching in Heterogeneous Agents: Balancing Surprise in Sparse Reward Scenarios [3.638198517970729]
Learning from Demonstration can be an efficient way to train systems with analogous agents.
However, naively replicating demonstrations that are out of bounds for the Student's capability can limit efficient learning.
We present a Teacher-Student learning framework specifically tailored to address the challenge of heterogeneity between the Teacher and Student agents.
arXiv Detail & Related papers (2024-05-23T05:52:42Z) - Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts [81.37287967870589]
We propose to harness a diverse set of specialized teachers, instead of a single generalist one, that collectively supervises the strong student.
Our approach resembles the classical hierarchical mixture of experts, with two components tailored for co-supervision.
We validate the proposed method through visual recognition tasks on the OpenAI weak-to-strong benchmark and additional multi-domain datasets.
arXiv Detail & Related papers (2024-02-23T18:56:11Z) - Students Parrot Their Teachers: Membership Inference on Model Distillation [54.392069096234074]
We study the privacy provided by knowledge distillation to both the teacher and student training sets.
Our attacks are strongest when student and teacher sets are similar, or when the attacker can poison the teacher set.
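The attack family above builds on a classic idea: models are more confident on records they were trained on. A minimal, hypothetical confidence-threshold membership test (not the paper's actual attack, which is more elaborate) looks like this:

```python
import numpy as np

def confidence_mia(model_conf, threshold=0.9):
    """Threshold attack: guess 'member' when the model's confidence on a
    record exceeds a calibrated threshold. A distilled student can leak
    membership of the teacher's data this way when the sets overlap."""
    return model_conf >= threshold

# Hypothetical confidences a student model assigns to candidate records.
confs = np.array([0.99, 0.42, 0.93, 0.60])
guesses = confidence_mia(confs)  # → [True, False, True, False]
```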
arXiv Detail & Related papers (2023-03-06T19:16:23Z) - Guarded Policy Optimization with Imperfect Online Demonstrations [32.22880650876471]
Teacher-Student Framework is a reinforcement learning setting where a teacher agent guards the training of a student agent.
It is expensive or even impossible to obtain a well-performing teacher policy.
We develop a new method that can incorporate arbitrary teacher policies with modest or inferior performance.
arXiv Detail & Related papers (2023-03-03T06:24:04Z) - Explainable Action Advising for Multi-Agent Reinforcement Learning [32.49380192781649]
Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm.
We introduce Explainable Action Advising, in which the teacher provides action advice as well as associated explanations indicating why the action was chosen.
This allows the student to self-reflect on what it has learned, enabling the advice to generalize and leading to improved sample efficiency and learning performance.
arXiv Detail & Related papers (2022-11-15T04:15:03Z) - Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning [114.9857000195174]
A major challenge to widespread industrial adoption of deep reinforcement learning is the potential vulnerability to privacy breaches.
We propose an adversarial attack framework tailored for testing the vulnerability of deep reinforcement learning algorithms to membership inference attacks.
arXiv Detail & Related papers (2021-09-08T23:44:57Z) - FaceLeaks: Inference Attacks against Transfer Learning Models via Black-box Queries [2.7564955518050693]
We investigate whether one can leak or infer private information without interacting with the teacher model directly.
We propose novel strategies to infer from aggregate-level information.
Our study indicates that information leakage is a real privacy threat to the transfer learning framework widely used in real-life situations.
arXiv Detail & Related papers (2020-10-27T03:02:40Z) - Feature Distillation With Guided Adversarial Contrastive Learning [41.28710294669751]
We propose Guided Adversarial Contrastive Distillation (GACD) to transfer adversarial robustness from teacher to student with features.
With a well-trained teacher model as an anchor, students are expected to extract features similar to the teacher.
With GACD, the student not only learns to extract robust features, but also captures structural knowledge from the teacher.
arXiv Detail & Related papers (2020-09-01T12:54:54Z) - Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
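A label-only attack of this flavor can be sketched as follows. This is an illustrative reading of the idea, not the paper's exact procedure: repeatedly perturb an input, query only the victim's hard label, and use label stability as the membership signal (training points tend to sit further from the decision boundary). The victim model here is a toy stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)

def sampling_attack(predict_label, x, n_queries=50, noise_scale=0.5):
    """Label-only membership signal: perturb x repeatedly and measure
    how often the victim's hard label stays unchanged."""
    base = predict_label(x)
    same = sum(predict_label(x + rng.normal(0.0, noise_scale, size=x.shape)) == base
               for _ in range(n_queries))
    return same / n_queries  # stability score in [0, 1]

# Toy victim: a 1-D threshold classifier standing in for the real model.
victim = lambda v: int(v[0] > 0.0)
score_far = sampling_attack(victim, np.array([3.0]))   # far from boundary
score_near = sampling_attack(victim, np.array([0.1]))  # near the boundary
```

Points far from the boundary keep their label under perturbation, so a higher stability score suggests membership.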
arXiv Detail & Related papers (2020-09-01T12:54:54Z) - Dual Policy Distillation [58.43610940026261]
Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning.
In this work, we introduce dual policy distillation(DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment.
The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-06-07T06:49:47Z) - Differentially Private Deep Learning with Smooth Sensitivity [144.31324628007403]
We study privacy concerns through the lens of differential privacy.
In this framework, privacy guarantees are generally obtained by perturbing models in such a way that specifics of data used to train the model are made ambiguous.
One of the most important techniques used in previous works involves an ensemble of teacher models, which return information to a student based on a noisy voting procedure.
In this work, we propose a novel voting mechanism with smooth sensitivity, which we call Immutable Noisy ArgMax, that, under certain conditions, can bear very large random noising from the teacher without affecting the useful information transferred to the student.
arXiv Detail & Related papers (2020-03-01T15:38:00Z)
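The teacher-ensemble voting procedure referenced above can be sketched in a few lines. This is a plain PATE-style noisy argmax for illustration; Immutable Noisy ArgMax itself adds conditions on the vote gap so that even large noise cannot change the winning class.

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_argmax(teacher_votes, epsilon=1.0):
    """PATE-style noisy voting: tally each teacher's predicted class,
    add Laplace noise to the histogram, and release only the argmax."""
    counts = np.bincount(teacher_votes, minlength=teacher_votes.max() + 1)
    noisy = counts + rng.laplace(0.0, 1.0 / epsilon, size=counts.shape)
    return int(np.argmax(noisy))

# 9 of 10 hypothetical teachers vote for class 2: with such a large
# vote gap, the noisy winner almost always matches the clean winner.
votes = np.array([2, 2, 2, 2, 2, 2, 2, 2, 2, 0])
label = noisy_argmax(votes)
```

Because only the argmax is released, the student never sees the raw counts, which is what limits the leakage about any single teacher's training data.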
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.