Controlling Federated Learning for Covertness
- URL: http://arxiv.org/abs/2308.08825v1
- Date: Thu, 17 Aug 2023 07:16:41 GMT
- Title: Controlling Federated Learning for Covertness
- Authors: Adit Jain and Vikram Krishnamurthy
- Abstract summary: A learner aims to minimize a function $f$ by repeatedly querying a distributed oracle that provides noisy gradient evaluations.
At the same time, the learner seeks to hide $argmin f$ from a malicious eavesdropper that observes the learner's queries.
This paper considers the problem of \textit{covert} or \textit{learner-private} optimization, where the learner has to dynamically choose between learning and obfuscation.
- Score: 15.878313629774269
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A learner aims to minimize a function $f$ by repeatedly querying a
distributed oracle that provides noisy gradient evaluations. At the same time,
the learner seeks to hide $\arg\min f$ from a malicious eavesdropper that
observes the learner's queries. This paper considers the problem of
\textit{covert} or \textit{learner-private} optimization, where the learner has
to dynamically choose between learning and obfuscation by exploiting the
stochasticity. The problem of controlling the stochastic gradient algorithm for
covert optimization is modeled as a Markov decision process, and we show that
the dynamic programming operator has a supermodular structure implying that the
optimal policy has a monotone threshold structure. A computationally efficient
policy gradient algorithm is proposed to search for the optimal querying policy
without knowledge of the transition probabilities. As a practical application,
our methods are demonstrated on a hate speech classification task in a
federated setting where an eavesdropper can use the optimal weights to generate
toxic content, which is more easily misclassified. Numerical results show that
when the learner uses the optimal policy, an eavesdropper can only achieve a
validation accuracy of $52\%$ with no information and $69\%$ when it has a
public dataset with $10\%$ positive samples compared to $83\%$ when the learner
employs a greedy policy.
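The abstract describes modeling the learner's choice between learning and obfuscation as a Markov decision process whose optimal querying policy has a monotone threshold structure, found via a policy-gradient search that needs no knowledge of the transition probabilities. Below is a minimal, self-contained Python sketch of that idea under assumed toy dynamics: the discretized state, the reward shaping, the dummy-query model, and the SPSA-style finite-difference threshold update are all illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# Toy sketch (assumed setup): the learner's state is a discrete "learning
# progress" level; at each step it sends either a genuine gradient query
# ("learn") or an obfuscating dummy query ("obfuscate"). A monotone threshold
# policy learns below the threshold and obfuscates at or above it. The
# threshold is tuned by comparing perturbed rollouts (SPSA-style), so no
# transition probabilities are required.

N_LEVELS = 10          # assumed number of discrete learning-progress levels
EPISODE_LEN = 50       # assumed horizon per rollout


def threshold_policy(state, threshold):
    """Return 'learn' below the threshold, 'obfuscate' at or above it."""
    return "learn" if state < threshold else "obfuscate"


def rollout(threshold, rng):
    """Simulate one episode; return learning progress minus a leakage penalty."""
    state, leaked = 0, 0.0
    for _ in range(EPISODE_LEN):
        if threshold_policy(state, threshold) == "learn":
            # Genuine query: progress may increase, but the eavesdropper observes it.
            state = min(N_LEVELS - 1, state + rng.binomial(1, 0.6))
            leaked += 1.0
        else:
            # Dummy query: no progress, but it confuses the eavesdropper.
            leaked -= 0.3
    return state - 0.2 * max(leaked, 0.0)


def spsa_search(n_iters=200, seed=0):
    """Finite-difference search over the threshold parameter."""
    rng = np.random.default_rng(seed)
    theta = N_LEVELS / 2.0
    for k in range(1, n_iters + 1):
        delta = rng.choice([-1.0, 1.0])
        step, perturb = 1.0 / k, 0.5
        grad = (rollout(theta + perturb * delta, rng)
                - rollout(theta - perturb * delta, rng)) / (2 * perturb * delta)
        theta = np.clip(theta + step * grad, 0.0, N_LEVELS - 1)
    return theta


if __name__ == "__main__":
    print("estimated threshold:", spsa_search())
```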
Related papers
- Traversing Pareto Optimal Policies: Provably Efficient Multi-Objective Reinforcement Learning [14.260168974085376]
This paper investigates multi-objective reinforcement learning (MORL), which focuses on learning optimal policies in the presence of multiple reward functions.
Despite MORL's success, there is still a lack of satisfactory understanding of various MORL optimization targets and efficient learning algorithms.
arXiv Detail & Related papers (2024-07-24T17:58:49Z)
- Structured Reinforcement Learning for Incentivized Stochastic Covert Optimization [13.440621354486906]
This paper studies how a stochastic gradient (SG) algorithm can be controlled to hide the estimate of the local stationary point from an eavesdropper.
arXiv Detail & Related papers (2024-05-13T01:29:48Z)
- Super Non-singular Decompositions of Polynomials and their Application to Robustly Learning Low-degree PTFs [39.468324211376505]
We study the efficient learnability of low-degree polynomial threshold functions (PTFs) in the presence of a constant fraction of adversarial corruptions.
Our algorithm employs an iterative approach inspired by localization techniques previously used in the context of learning linear threshold functions.
arXiv Detail & Related papers (2024-03-31T02:03:35Z)
- Target-based Surrogates for Stochastic Optimization [26.35752393302125]
We consider minimizing functions for which it is expensive to compute the (possibly stochastic) gradient.
Such functions are prevalent in reinforcement learning, imitation learning and adversarial training.
Our framework allows the use of standard optimization algorithms to construct surrogates which can be minimized efficiently.
arXiv Detail & Related papers (2023-02-06T08:08:34Z)
- Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees [56.848265937921354]
Inverse reinforcement learning (IRL) aims to recover the reward function and the associated optimal policy.
Many algorithms for IRL have an inherently nested structure.
We develop a novel single-loop algorithm for IRL that does not compromise reward estimation accuracy.
arXiv Detail & Related papers (2022-10-04T17:13:45Z)
- A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z)
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
- Randomized Exploration for Reinforcement Learning with General Value Function Approximation [122.70803181751135]
We propose a model-free reinforcement learning algorithm inspired by the popular randomized least squares value iteration (RLSVI) algorithm.
Our algorithm drives exploration by simply perturbing the training data with judiciously chosen i.i.d. scalar noises.
We complement the theory with an empirical evaluation across known difficult exploration tasks.
arXiv Detail & Related papers (2021-06-15T02:23:07Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration [143.43658264904863]
We show how value iteration under a more standard notion of low inherent Bellman error, typically employed in least-squares value iteration-style algorithms, can provide strong PAC guarantees on learning a near-optimal value function.
We present a computationally tractable algorithm for the reward-free setting and show how it can be used to learn a near optimal policy for any (linear) reward function.
arXiv Detail & Related papers (2020-08-18T04:34:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.