Collaborative Learning and Personalization in Multi-Agent Stochastic
Linear Bandits
- URL: http://arxiv.org/abs/2106.08902v1
- Date: Tue, 15 Jun 2021 00:45:55 GMT
- Title: Collaborative Learning and Personalization in Multi-Agent Stochastic
Linear Bandits
- Authors: Avishek Ghosh, Abishek Sankararaman and Kannan Ramchandran
- Abstract summary: We consider the problem of minimizing regret in an $N$ agent heterogeneous linear bandits framework, where the agents (users) are similar but not all identical.
We show that, for any agent, the regret scales as $mathcalO(sqrtT/N)$, if the agent is in a well separated' cluster, or scales as $mathcalO(Tfrac12 + varepsilon/(N)frac12 -varepsilon)$ if its cluster
- Score: 24.293155063082438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of minimizing regret in an $N$ agent heterogeneous
stochastic linear bandits framework, where the agents (users) are similar but
not all identical. We model user heterogeneity using two popularly used ideas
in practice; (i) A clustering framework where users are partitioned into groups
with users in the same group being identical to each other, but different
across groups, and (ii) a personalization framework where no two users are
necessarily identical, but a user's parameters are close to that of the
population average. In the clustered users' setup, we propose a novel
algorithm, based on successive refinement of cluster identities and regret
minimization. We show that, for any agent, the regret scales as
$\mathcal{O}(\sqrt{T/N})$, if the agent is in a `well separated' cluster, or
scales as $\mathcal{O}(T^{\frac{1}{2} + \varepsilon}/(N)^{\frac{1}{2}
-\varepsilon})$ if its cluster is not well separated, where $\varepsilon$ is
positive and arbitrarily close to $0$. Our algorithm is adaptive to the cluster
separation, and is parameter free -- it does not need to know the number of
clusters, separation and cluster size, yet the regret guarantee adapts to the
inherent complexity. In the personalization framework, we introduce a natural
algorithm where, the personal bandit instances are initialized with the
estimates of the global average model. We show that, an agent $i$ whose
parameter deviates from the population average by $\epsilon_i$, attains a
regret scaling of $\widetilde{O}(\epsilon_i\sqrt{T})$. This demonstrates that
if the user representations are close (small $\epsilon_i)$, the resulting
regret is low, and vice-versa. The results are empirically validated and we
observe superior performance of our adaptive algorithms over non-adaptive
baselines.
Related papers
- Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [69.15976031704687]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.
IAC maintains an overall computational complexity of $ mathcalO(n, textpolylog(n) $, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Optimal Algorithms for Latent Bandits with Cluster Structure [50.44722775727619]
We consider the problem of latent bandits with cluster structure where there are multiple users, each with an associated multi-armed bandit problem.
We propose LATTICE which allows exploitation of the latent cluster structure to provide the minimax optimal regret of $widetildeO(sqrt(mathsfM+mathsfN)mathsfT.
arXiv Detail & Related papers (2023-01-17T17:49:04Z) - Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower boundally, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z) - Personalized Federated Learning via Convex Clustering [72.15857783681658]
We propose a family of algorithms for personalized federated learning with locally convex user costs.
The proposed framework is based on a generalization of convex clustering in which the differences between different users' models are penalized.
arXiv Detail & Related papers (2022-02-01T19:25:31Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - K-expectiles clustering [0.0]
We propose a novel partitioning clustering algorithm based on expectiles.
We suggest two schemes: fixed $tau$ clustering, and adaptive $tau$ clustering.
arXiv Detail & Related papers (2021-03-16T21:14:56Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $|theta infty$.
arXiv Detail & Related papers (2020-05-21T17:51:30Z) - Explainable $k$-Means and $k$-Medians Clustering [25.513261099927163]
We consider using a small decision tree to partition a data set into clusters, so that clusters can be characterized in a straightforward manner.
We show that popular top-down decision tree algorithms may lead to clusterings with arbitrarily large cost.
We design an efficient algorithm that produces explainable clusters using a tree with $k$ leaves.
arXiv Detail & Related papers (2020-02-28T04:21:53Z) - Query-Efficient Correlation Clustering [13.085439249887713]
Correlation clustering is arguably the most natural formulation of clustering.
A main drawback of correlation clustering is that it requires as input the $Theta(n2)$ pairwise similarities.
We devise a correlation clustering algorithm that attains a solution whose expected number of disagreements is at most $3cdot OPT + O(fracn3Q)$.
arXiv Detail & Related papers (2020-02-26T15:18:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.