(Private) Kernelized Bandits with Distributed Biased Feedback
- URL: http://arxiv.org/abs/2301.12061v1
- Date: Sat, 28 Jan 2023 02:30:15 GMT
- Title: (Private) Kernelized Bandits with Distributed Biased Feedback
- Authors: Fengjiao Li, Xingyu Zhou, and Bo Ji
- Abstract summary: We study kernelized bandits with distributed biased feedback.
A new \emph{distributed phase-then-batch-based elimination} (\texttt{DPBE}) algorithm is proposed.
We show that \texttt{DPBE} achieves a sublinear regret of $\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in(0,1)$ is a user-sampling parameter one can tune.
- Score: 13.312928989951505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study kernelized bandits with distributed biased feedback.
This problem is motivated by several real-world applications (such as dynamic
pricing, cellular network configuration, and policy making), where users from a
large population contribute to the reward of the action chosen by a central
entity, but it is difficult to collect feedback from all users. Instead, only
biased feedback (due to user heterogeneity) from a subset of users may be
available. In addition to such partial biased feedback, we are also faced with
two practical challenges due to communication cost and computation complexity.
To tackle these challenges, we carefully design a new \emph{distributed
phase-then-batch-based elimination (\texttt{DPBE})} algorithm, which samples
users in phases for collecting feedback to reduce the bias and employs
\emph{maximum variance reduction} to select actions in batches within each
phase. By properly choosing the phase length, the batch size, and the
confidence width used for eliminating suboptimal actions, we show that
\texttt{DPBE} achieves a sublinear regret of
$\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in (0,1)$ is the
user-sampling parameter one can tune. Moreover, \texttt{DPBE} can significantly
reduce both communication cost and computation complexity in distributed
kernelized bandits, compared to some variants of the state-of-the-art
algorithms (originally developed for standard kernelized bandits). Furthermore,
by incorporating various \emph{differential privacy} models (including the
central, local, and shuffle models), we generalize \texttt{DPBE} to provide
privacy guarantees for users participating in the distributed learning process.
Finally, we conduct extensive simulations to validate our theoretical results
and evaluate the empirical performance.
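The phase-then-batch structure described in the abstract can be illustrated with a minimal sketch: within each phase, query the most uncertain surviving action (maximum variance reduction) and average feedback from a growing sample of users to shrink the bias, then eliminate actions whose upper confidence bound falls below the best lower confidence bound. Everything below is an assumption for illustration (the RBF kernel, the doubling batch and user-sample schedules, the constant confidence multiplier, the function names), not the authors' implementation; no privacy mechanism is included.

```python
import numpy as np

def rbf_kernel(A, B, ls=0.5):
    """Squared-exponential kernel between the rows of A and B."""
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d / (2 * ls ** 2))

def posterior(X, y, A, lam=0.1):
    """GP posterior mean/std at candidate actions A given observations (X, y)."""
    K = rbf_kernel(X, X) + lam * np.eye(len(X))
    Kinv = np.linalg.inv(K)
    Ka = rbf_kernel(A, X)
    mean = Ka @ Kinv @ y
    var = np.diag(rbf_kernel(A, A)) - np.einsum('ij,jk,ik->i', Ka, Kinv, Ka)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def dpbe_sketch(f, actions, n_phases=5, rng=None):
    """Phased elimination: batch queries by maximum posterior variance,
    average feedback over a growing user sample, then eliminate actions
    whose UCB falls below the best LCB."""
    rng = rng if rng is not None else np.random.default_rng(0)
    active = np.arange(len(actions))
    X = np.empty((0, actions.shape[1]))
    y = np.empty(0)
    for p in range(n_phases):
        for _ in range(2 ** (p + 2)):      # batch size grows per phase (assumed schedule)
            if len(X):
                _, sd = posterior(X, y, actions[active])
            else:
                sd = np.ones(len(active))
            a = active[np.argmax(sd)]      # maximum variance reduction step
            # average noisy feedback from 2**p sampled users to reduce the bias
            fb = np.mean([f(actions[a]) + rng.normal(0.0, 0.1) for _ in range(2 ** p)])
            X = np.vstack([X, actions[[a]]])
            y = np.append(y, fb)
        mu, sd = posterior(X, y, actions[active])
        width = 2.0 * sd                   # confidence width (assumed constant multiplier)
        active = active[mu + width >= np.max(mu - width)]
    mu, _ = posterior(X, y, actions[active])
    return actions[active[np.argmax(mu)]]
```

Batching the queries per phase is what cuts communication: the learner needs to contact sampled users only once per batch rather than once per round.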
Related papers
- Smoothed Normalization for Efficient Distributed Private Optimization [54.197255548244705]
Federated learning enables training machine learning models while preserving the privacy of participants.
However, there is no differentially private distributed method for such training problems.
We introduce a new distributed algorithm $\alpha$-${\sf NormEC}$ with provable convergence guarantees.
arXiv Detail & Related papers (2025-02-19T07:10:32Z) - Continuous K-Max Bandits [54.21533414838677]
We study the $K$-Max multi-armed bandits problem with continuous outcome distributions and weak value-index feedback.
This setting captures critical applications in recommendation systems, distributed computing, server scheduling, etc.
Our key contribution is the computationally efficient algorithm DCK-UCB, which combines adaptive discretization with bias-corrected confidence bounds.
arXiv Detail & Related papers (2025-02-19T06:37:37Z) - The Cost of Shuffling in Private Gradient Based Optimization [40.31928071333575]
We show that data shuffling results in worse empirical excess risk for \textit{DP-ShuffleG} compared to DP-SGD.
We propose \textit{Interleaved-ShuffleG}, a hybrid approach that integrates public data samples in private optimization.
arXiv Detail & Related papers (2025-02-05T22:30:00Z) - Personalized Denoising Implicit Feedback for Robust Recommender System [60.719158008403376]
We show that for a given user, there is a clear distinction between normal and noisy interactions in the user's personal loss distribution.
We propose a resampling strategy to Denoise using the user's Personal Loss distribution, named PLD, which reduces the probability of noisy interactions being optimized.
arXiv Detail & Related papers (2025-02-01T07:13:06Z) - Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Disincentivizing Polarization in Social Networks [10.758115514959593]
We present a model for content curation and personalization that avoids filter bubbles.
We provide algorithmic guarantees for optimizing recommendations.
Using real-world preference data, we verify that under our model, users share the burden of diversification with only minor utility loss.
arXiv Detail & Related papers (2023-05-23T21:47:31Z) - Discrete Distribution Estimation under User-level Local Differential Privacy [37.65849910114053]
We study discrete distribution estimation under user-level local differential privacy (LDP).
In user-level $\varepsilon$-LDP, each user has $m\ge 1$ samples and the privacy of all $m$ samples must be preserved simultaneously.
arXiv Detail & Related papers (2022-11-07T18:29:32Z) - Distributed Sparse Regression via Penalization [5.990069843501885]
We study linear regression over a network of agents, modeled as an undirected graph (with no centralized node).
The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint.
We show that the proximal-gradient algorithm applied to the penalized problem converges linearly up to a tolerance of the order of the centralized statistical error.
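The penalized formulation in this blurb lends itself to a short sketch: each agent holds a local LASSO loss, a quadratic penalty couples neighbors on the graph, and proximal gradient alternates a gradient step on the smooth part with soft-thresholding. This is an illustrative sketch under assumptions (least-squares local losses, a crude Lipschitz bound for the stepsize, synchronous updates, the function names), not the paper's algorithm.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def decentralized_lasso(As, bs, edges, lam=0.1, rho=1.0, step=None, iters=500):
    """Proximal gradient on
        sum_i [ 0.5 ||A_i x_i - b_i||^2 + lam ||x_i||_1 ]
        + (rho/2) sum_{(i,j) in edges} ||x_i - x_j||^2   (consensus penalty)."""
    n, d = len(As), As[0].shape[1]
    X = np.zeros((n, d))
    if step is None:
        # conservative stepsize from a crude Lipschitz bound on the smooth part
        L = max(np.linalg.norm(A.T @ A, 2) for A in As) + 2.0 * rho * n
        step = 1.0 / L
    for _ in range(iters):
        # local least-squares gradients
        G = np.stack([As[i].T @ (As[i] @ X[i] - bs[i]) for i in range(n)])
        # consensus-penalty gradients over the edges of the graph
        for i, j in edges:
            G[i] += rho * (X[i] - X[j])
            G[j] += rho * (X[j] - X[i])
        # gradient step on the smooth part, then the l1 prox
        X = soft_threshold(X - step * G, step * lam)
    return X
```

Because consensus is enforced only through a penalty, the agents' iterates agree up to a tolerance governed by `rho`, matching the blurb's "converges linearly up to a tolerance" statement.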
arXiv Detail & Related papers (2021-11-12T01:51:50Z) - Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias against its variance.
We formalize the personalized collaborative learning problem as optimization of a user's objective.
We explore conditions under which we can optimally trade off bias for a reduction in variance.
arXiv Detail & Related papers (2021-11-10T22:12:52Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.