(Private) Kernelized Bandits with Distributed Biased Feedback
- URL: http://arxiv.org/abs/2301.12061v1
- Date: Sat, 28 Jan 2023 02:30:15 GMT
- Title: (Private) Kernelized Bandits with Distributed Biased Feedback
- Authors: Fengjiao Li, Xingyu Zhou, and Bo Ji
- Abstract summary: We study kernelized bandits with distributed biased feedback.
A new \emph{distributed phase-then-batch-based elimination} (\texttt{DPBE}) algorithm is proposed.
We show that \texttt{DPBE} achieves a sublinear regret of $\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in (0,1)$ is a tunable user-sampling parameter.
- Score: 13.312928989951505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study kernelized bandits with distributed biased feedback.
This problem is motivated by several real-world applications (such as dynamic
pricing, cellular network configuration, and policy making), where users from a
large population contribute to the reward of the action chosen by a central
entity, but it is difficult to collect feedback from all users. Instead, only
biased feedback (due to user heterogeneity) from a subset of users may be
available. In addition to such partial biased feedback, we are also faced with
two practical challenges due to communication cost and computation complexity.
To tackle these challenges, we carefully design a new \emph{distributed
phase-then-batch-based elimination (\texttt{DPBE})} algorithm, which samples
users in phases for collecting feedback to reduce the bias and employs
\emph{maximum variance reduction} to select actions in batches within each
phase. By properly choosing the phase length, the batch size, and the
confidence width used for eliminating suboptimal actions, we show that
\texttt{DPBE} achieves a sublinear regret of
$\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$, where $\alpha\in (0,1)$ is the
user-sampling parameter one can tune. Moreover, \texttt{DPBE} can significantly
reduce both communication cost and computation complexity in distributed
kernelized bandits, compared to some variants of the state-of-the-art
algorithms (originally developed for standard kernelized bandits). Furthermore,
by incorporating various \emph{differential privacy} models (including the
central, local, and shuffle models), we generalize \texttt{DPBE} to provide
privacy guarantees for users participating in the distributed learning process.
Finally, we conduct extensive simulations to validate our theoretical results
and evaluate the empirical performance.
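For intuition, below is a minimal, self-contained sketch (Python/NumPy) of a phase-then-batch elimination loop in the spirit of the abstract: users are sampled in each phase and their feedback is averaged, actions within a batch are chosen by a maximum-posterior-variance rule, and suboptimal actions are eliminated using a confidence width. The RBF kernel, the doubling batch schedule, the constant `beta`, and all function names are illustrative assumptions, not the tuned choices or the exact procedure from the paper.

```python
# Illustrative sketch only (not the paper's exact DPBE): phased elimination for a
# kernelized bandit over a finite action set. Each phase (i) averages feedback from
# a sample of users, (ii) selects actions by maximum posterior variance under a GP
# model, and (iii) eliminates actions whose UCB falls below the best LCB.
import numpy as np

def rbf_kernel(A, B, ls=0.5):
    d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d / (2 * ls**2))

def posterior(X, y, Xq, lam=1.0):
    """GP / kernel-ridge posterior mean and std at query points Xq."""
    K = rbf_kernel(X, X) + lam * np.eye(len(X))
    Kq = rbf_kernel(Xq, X)
    Kinv = np.linalg.inv(K)
    mean = Kq @ Kinv @ y
    var = np.clip(np.diag(rbf_kernel(Xq, Xq))
                  - np.einsum("ij,jk,ik->i", Kq, Kinv, Kq), 0, None)
    return mean, np.sqrt(var)

def dpbe_sketch(actions, pull, num_phases=5, users_per_phase=50, beta=2.0):
    """actions: (N, d) array; pull(x, m) returns the average of m noisy user rewards."""
    active = np.arange(len(actions))
    X_hist, y_hist = [], []
    for phase in range(num_phases):
        batch_size = 2 ** phase                      # placeholder doubling schedule
        for _ in range(batch_size):
            if X_hist:
                _, std = posterior(np.array(X_hist), np.array(y_hist), actions[active])
                x = actions[active[np.argmax(std)]]  # maximum-variance selection rule
            else:
                x = actions[active[0]]
            X_hist.append(x)
            y_hist.append(pull(x, users_per_phase))  # averaged feedback from sampled users
        mean, std = posterior(np.array(X_hist), np.array(y_hist), actions[active])
        ucb, lcb = mean + beta * std, mean - beta * std
        active = active[ucb >= lcb.max()]            # eliminate suboptimal actions
    return actions[active]
```

In this sketch, the feedback averaged over `users_per_phase` sampled users stands in for the distributed biased feedback; in the paper, the bias is controlled by growing the number of sampled users across phases according to the tunable parameter $\alpha$, which drives the $\tilde{O}(T^{1-\alpha/2}+\sqrt{\gamma_T T})$ regret bound.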
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - Distributed Extra-gradient with Optimal Complexity and Communication
Guarantees [60.571030754252824]
We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local dual vectors.
Extra-gradient, which is a de facto algorithm for monotone VI problems, has not been designed to be communication-efficient.
We propose a quantized generalized extra-gradient (Q-GenX), which is an unbiased and adaptive compression method tailored to solve VIs.
arXiv Detail & Related papers (2023-08-17T21:15:04Z) - Disincentivizing Polarization in Social Networks [10.758115514959593]
We present a model for content curation and personalization that avoids filter bubbles.
We provide algorithmic guarantees for optimizing recommendations.
Using real-world preference data, we verify that under our model, users share the burden of diversification with only minor utility loss.
arXiv Detail & Related papers (2023-05-23T21:47:31Z) - Discrete Distribution Estimation under User-level Local Differential
Privacy [37.65849910114053]
We study discrete distribution estimation under user-level local differential privacy (LDP)
In user-level $\varepsilon$-LDP, each user has $m\ge 1$ samples, and the privacy of all $m$ samples must be preserved simultaneously.
arXiv Detail & Related papers (2022-11-07T18:29:32Z) - Learning Explicit User Interest Boundary for Recommendation [5.715918678913698]
We introduce an auxiliary score $b_u$ for each user to represent the User Interest Boundary.
We show that our approach can provide a personalized decision boundary and significantly improve the training efficiency without any special sampling strategy.
arXiv Detail & Related papers (2021-11-22T07:26:51Z) - Distributed Sparse Regression via Penalization [5.990069843501885]
We study linear regression over a network of agents, modeled as an undirected graph (with no centralized node)
The estimation problem is formulated as the minimization of the sum of the local LASSO loss functions plus a quadratic penalty of the consensus constraint.
We show that the proximal-gradient algorithm applied to the penalized problem converges linearly up to a tolerance of the order of the centralized statistical error.
arXiv Detail & Related papers (2021-11-12T01:51:50Z) - Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias against its variance.
We formalize the personalized collaborative learning problem as optimization of a user's objective.
We explore conditions under which we can optimally trade off bias for a reduction in variance.
arXiv Detail & Related papers (2021-11-10T22:12:52Z) - Batched Neural Bandits [107.5072688105936]
BatchNeuralUCB combines neural networks with optimism to address the exploration-exploitation tradeoff.
We prove that BatchNeuralUCB achieves the same regret as the fully sequential version while reducing the number of policy updates considerably.
arXiv Detail & Related papers (2021-02-25T17:36:44Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.