Demystifying Online Clustering of Bandits: Enhanced Exploration Under Stochastic and Smoothed Adversarial Contexts
- URL: http://arxiv.org/abs/2501.00891v1
- Date: Wed, 01 Jan 2025 16:38:29 GMT
- Title: Demystifying Online Clustering of Bandits: Enhanced Exploration Under Stochastic and Smoothed Adversarial Contexts
- Authors: Zhuohua Li, Maoli Liu, Xiangxiang Dai, John C. S. Lui,
- Abstract summary: A line of research, known as online clustering of bandits, extends contextual MAB by grouping similar users into clusters.
Existing algorithms, which rely on the upper confidence bound (UCB) strategy, struggle to gather adequate statistical information to accurately identify unknown user clusters.
We propose two novel algorithms, UniCLUB and PhaseUniCLUB, which incorporate enhanced exploration mechanisms to accelerate cluster identification.
- Score: 27.62165569135504
- License:
- Abstract: The contextual multi-armed bandit (MAB) problem is crucial in sequential decision-making. A line of research, known as online clustering of bandits, extends contextual MAB by grouping similar users into clusters, utilizing shared features to improve learning efficiency. However, existing algorithms, which rely on the upper confidence bound (UCB) strategy, struggle to gather adequate statistical information to accurately identify unknown user clusters. As a result, their theoretical analyses require several strong assumptions about the "diversity" of contexts generated by the environment, leading to impractical settings, complicated analyses, and poor practical performance. Removing these assumptions has been a long-standing open problem in the clustering of bandits literature. In this paper, we provide two solutions to this open problem. First, following the i.i.d. context generation setting in existing studies, we propose two novel algorithms, UniCLUB and PhaseUniCLUB, which incorporate enhanced exploration mechanisms to accelerate cluster identification. Remarkably, our algorithms require substantially weaker assumptions while achieving regret bounds comparable to prior work. Second, inspired by the smoothed analysis framework, we propose a more practical setting that eliminates the requirement for i.i.d. context generation used in previous studies, thus enhancing the performance of existing algorithms for online clustering of bandits. Our technique can be applied to both graph-based and set-based clustering of bandits frameworks. Extensive evaluations on both synthetic and real-world datasets demonstrate that our proposed algorithms consistently outperform existing approaches.
Related papers
- Online Clustering of Dueling Bandits [59.09590979404303]
We introduce the first "clustering of dueling bandit algorithms" to enable collaborative decision-making based on preference feedback.
We propose two novel algorithms: (1) Clustering of Linear Dueling Bandits (COLDB) which models the user reward functions as linear functions of the context vectors, and (2) Clustering of Neural Dueling Bandits (CONDB) which uses a neural network to model complex, non-linear user reward functions.
arXiv Detail & Related papers (2025-02-04T07:55:41Z) - Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective [52.662463893268225]
Self-supervised heterogeneous graph learning (SHGL) has shown promising potential in diverse scenarios.
Existing SHGL methods encounter two significant limitations.
We introduce a novel framework enhanced by rank and dual consistency constraints.
arXiv Detail & Related papers (2024-12-01T09:33:20Z) - Meta Clustering of Neural Bandits [45.77505279698894]
We study a new problem, Clustering of Neural Bandits, by extending previous work to the arbitrary reward function.
We propose a novel algorithm called M-CNB, which utilizes a meta-learner to represent and rapidly adapt to dynamic clusters.
In extensive experiments conducted in both recommendation and online classification scenarios, M-CNB outperforms SOTA baselines.
arXiv Detail & Related papers (2024-08-10T16:09:51Z) - GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Clustering (GCC) method to incorporate feature learning and augmentation into clustering procedure.
First, we develop a discrimirative feature alignment mechanism to discover intrinsic relationship across real and generated samples.
Second, we design a self-supervised metric learning to generate more reliable cluster assignment.
arXiv Detail & Related papers (2024-04-14T01:51:11Z) - Query-Efficient Correlation Clustering with Noisy Oracle [17.11782578276788]
We introduce two novel formulations of online learning problems rooted in the paradigm of Pure Exploration in Combinatorial Multi-Armed Bandits (PE-CMAB)
We design algorithms that combine a sampling strategy with a classic approximation algorithm for correlation and study their theoretical guarantees.
Our results are the first examples of clustering-time algorithms that work for the case of PE-CMAB in which the underlying offline optimization problem is NP-hard.
arXiv Detail & Related papers (2024-02-02T13:31:24Z) - Rethinking Clustering-Based Pseudo-Labeling for Unsupervised
Meta-Learning [146.11600461034746]
Method for unsupervised meta-learning, CACTUs, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z) - Contextual Model Aggregation for Fast and Robust Federated Learning in
Edge Computing [88.76112371510999]
Federated learning is a prime candidate for distributed machine learning at the network edge.
Existing algorithms face issues with slow convergence and/or robustness of performance.
We propose a contextual aggregation scheme that achieves the optimal context-dependent bound on loss reduction.
arXiv Detail & Related papers (2022-03-23T21:42:31Z) - Fairness Degrading Adversarial Attacks Against Clustering Algorithms [35.40427659749882]
We propose a fairness degrading attack algorithm for k-median clustering.
We find that the addition of the generated adversarial samples can lead to significantly lower fairness values.
arXiv Detail & Related papers (2021-10-22T19:10:27Z) - Local Clustering in Contextual Multi-Armed Bandits [44.11480686973274]
We study identifying user clusters in contextual multi-armed bandits (MAB)
We propose a bandit algorithm, LOCB, embedded with local clustering procedure.
We evaluate the proposed algorithm from various aspects, which outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2021-02-26T21:59:29Z) - HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis [2.5329716878122404]
Comprehensive benchmarking of clustering algorithms is difficult.
There is no consensus regarding the best practice for rigorous benchmarking.
We demonstrate the important role evolutionary algorithms play to support flexible generation of such benchmarks.
arXiv Detail & Related papers (2021-02-13T15:01:34Z) - A black-box adversarial attack for poisoning clustering [78.19784577498031]
We propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms.
We show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.
arXiv Detail & Related papers (2020-09-09T18:19:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.