(Private) Kernelized Bandits with Distributed Biased Feedback
- URL: http://arxiv.org/abs/2301.12061v1
- Date: Sat, 28 Jan 2023 02:30:15 GMT
- Title: (Private) Kernelized Bandits with Distributed Biased Feedback
- Authors: Fengjiao Li, Xingyu Zhou, and Bo Ji
- Abstract summary: We study kernelized bandits with distributed biased feedback.
A new \emph{distributed phase-then-batch-based elimination} (\texttt{DPBE}) algorithm is proposed.
We show that \texttt{DPBE} achieves a sublinear regret of $\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in (0,1)$ is a tunable user-sampling parameter.
- Score: 13.312928989951505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study kernelized bandits with distributed biased feedback.
This problem is motivated by several real-world applications (such as dynamic
pricing, cellular network configuration, and policy making), where users from a
large population contribute to the reward of the action chosen by a central
entity, but it is difficult to collect feedback from all users. Instead, only
biased feedback (due to user heterogeneity) from a subset of users may be
available. In addition to such partial biased feedback, we are also faced with
two practical challenges due to communication cost and computation complexity.
To tackle these challenges, we carefully design a new \emph{distributed
phase-then-batch-based elimination (\texttt{DPBE})} algorithm, which samples
users in phases for collecting feedback to reduce the bias and employs
\emph{maximum variance reduction} to select actions in batches within each
phase. By properly choosing the phase length, the batch size, and the
confidence width used for eliminating suboptimal actions, we show that
\texttt{DPBE} achieves a sublinear regret of
$\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in (0,1)$ is the
user-sampling parameter one can tune. Moreover, \texttt{DPBE} can significantly
reduce both communication cost and computation complexity in distributed
kernelized bandits, compared to some variants of the state-of-the-art
algorithms (originally developed for standard kernelized bandits). Furthermore,
by incorporating various \emph{differential privacy} models (including the
central, local, and shuffle models), we generalize \texttt{DPBE} to provide
privacy guarantees for users participating in the distributed learning process.
Finally, we conduct extensive simulations to validate our theoretical results
and evaluate the empirical performance.
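For intuition, below is a minimal, self-contained sketch (Python/NumPy) of a phase-then-batch elimination loop in the spirit of the abstract: users are sampled in each phase and their feedback is averaged, actions within a batch are chosen by a maximum-posterior-variance rule, and suboptimal actions are eliminated using a confidence width. The RBF kernel, the doubling batch schedule, the constant `beta`, and all function names are illustrative assumptions, not the tuned choices or the exact procedure from the paper.

```python
# Illustrative sketch only (not the paper's exact DPBE): phased elimination for a
# kernelized bandit over a finite action set. Each phase (i) averages feedback from
# a sample of users, (ii) selects actions by maximum posterior variance under a GP
# model, and (iii) eliminates actions whose UCB falls below the best LCB.
import numpy as np

def rbf_kernel(A, B, ls=0.5):
    d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d / (2 * ls**2))

def posterior(X, y, Xq, lam=1.0):
    """GP / kernel-ridge posterior mean and std at query points Xq."""
    K = rbf_kernel(X, X) + lam * np.eye(len(X))
    Kq = rbf_kernel(Xq, X)
    Kinv = np.linalg.inv(K)
    mean = Kq @ Kinv @ y
    var = np.clip(np.diag(rbf_kernel(Xq, Xq))
                  - np.einsum("ij,jk,ik->i", Kq, Kinv, Kq), 0, None)
    return mean, np.sqrt(var)

def dpbe_sketch(actions, pull, num_phases=5, users_per_phase=50, beta=2.0):
    """actions: (N, d) array; pull(x, m) returns the average of m noisy user rewards."""
    active = np.arange(len(actions))
    X_hist, y_hist = [], []
    for phase in range(num_phases):
        batch_size = 2 ** phase                      # placeholder doubling schedule
        for _ in range(batch_size):
            if X_hist:
                _, std = posterior(np.array(X_hist), np.array(y_hist), actions[active])
                x = actions[active[np.argmax(std)]]  # maximum-variance selection rule
            else:
                x = actions[active[0]]
            X_hist.append(x)
            y_hist.append(pull(x, users_per_phase))  # averaged feedback from sampled users
        mean, std = posterior(np.array(X_hist), np.array(y_hist), actions[active])
        ucb, lcb = mean + beta * std, mean - beta * std
        active = active[ucb >= lcb.max()]            # eliminate suboptimal actions
    return actions[active]
```

In this sketch, the feedback averaged over `users_per_phase` sampled users stands in for the distributed biased feedback; in the paper, the bias is controlled by growing the number of sampled users across phases according to the tunable parameter $\alpha$, which drives the $\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$ regret bound.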
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Distributed Extra-gradient with Optimal Complexity and Communication
Guarantees [60.571030754252824]
We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local dual vectors.
Extra-gradient, which is a de facto algorithm for monotone VI problems, has not been designed to be communication-efficient.
We propose a quantized generalized extra-gradient (Q-GenX), which is an unbiased and adaptive compression method tailored to solve VIs.
arXiv Detail & Related papers (2023-08-17T21:15:04Z) - Disincentivizing Polarization in Social Networks [10.758115514959593]
We present a model for content curation and personalization that avoids filter bubbles.
We provide algorithmic guarantees for optimizing recommendations.
Using real-world preference data, we verify that under our model, users share the burden of diversification with only minor utility loss.
arXiv Detail & Related papers (2023-05-23T21:47:31Z) - Discrete Distribution Estimation under User-level Local Differential
Privacy [37.65849910114053]
We study discrete distribution estimation under user-level local differential privacy (LDP)
In user-level $\varepsilon$-LDP, each user has $m\ge 1$ samples, and the privacy of all $m$ samples must be preserved simultaneously.
arXiv Detail & Related papers (2022-11-07T18:29:32Z) - Learning Explicit User Interest Boundary for Recommendation [5.715918678913698]
We introduce an auxiliary score $b_u$ for each user to represent the User Interest Boundary.
We show that our approach can provide a personalized decision boundary and significantly improve the training efficiency without any special sampling strategy.
arXiv Detail & Related papers (2021-11-22T07:26:51Z) - Distributed Sparse Regression via Penalization [5.990069843501885]
We study linear regression over a network of agents, modeled as an undirected graph (with no centralized node)
The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint.
We show that the proximal-gradient algorithm applied to the penalized problem converges linearly up to a tolerance of the order of the centralized statistical error.
arXiv Detail & Related papers (2021-11-12T01:51:50Z) - Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias against its variance.
We formalize the personalized collaborative learning problem as optimization of a user's objective.
We explore conditions under which we can optimally trade off bias for a reduction in variance.
arXiv Detail & Related papers (2021-11-10T22:12:52Z) - Batched Neural Bandits [107.5072688105936]
BatchNeuralUCB combines neural networks with optimism to address the exploration-exploitation tradeoff.
We prove that BatchNeuralUCB achieves the same regret as the fully sequential version while reducing the number of policy updates considerably.
arXiv Detail & Related papers (2021-02-25T17:36:44Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.