Related papers: Expert with Clustering: Hierarchical Online Preference Learning Framework

Expert with Clustering: Hierarchical Online Preference Learning Framework

URL: http://arxiv.org/abs/2401.15062v2
Date: Mon, 24 Jun 2024 13:03:11 GMT
Title: Expert with Clustering: Hierarchical Online Preference Learning Framework
Authors: Tianyue Zhou, Jung-Hoon Cho, Babak Rahimi Ardabili, Hamed Tabkhi, Cathy Wu,
Abstract summary: Expert with Clustering (EWC) is a hierarchical contextual bandit framework that integrates clustering techniques and prediction with expert advice. EWC can substantially reduce regret by 27.57% compared to the LinUCB baseline.
Score: 4.05836962263239
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Emerging mobility systems are increasingly capable of recommending options to mobility users, to guide them towards personalized yet sustainable system outcomes. Even more so than the typical recommendation system, it is crucial to minimize regret, because 1) the mobility options directly affect the lives of the users, and 2) the system sustainability relies on sufficient user participation. In this study, we consider accelerating user preference learning by exploiting a low-dimensional latent space that captures the mobility preferences of users. We introduce a hierarchical contextual bandit framework named Expert with Clustering (EWC), which integrates clustering techniques and prediction with expert advice. EWC efficiently utilizes hierarchical user information and incorporates a novel Loss-guided Distance metric. This metric is instrumental in generating more representative cluster centroids. In a recommendation scenario with $N$ users, $T$ rounds per user, and $K$ options, our algorithm achieves a regret bound of $O(N\sqrt{T\log K} + NT)$. This bound consists of two parts: the first term is the regret from the Hedge algorithm, and the second term depends on the average loss from clustering. To the best of the authors knowledge, this is the first work to analyze the regret of an integrated expert algorithm with k-Means clustering. This regret bound underscores the theoretical and experimental efficacy of EWC, particularly in scenarios that demand rapid learning and adaptation. Experimental results highlight that EWC can substantially reduce regret by 27.57% compared to the LinUCB baseline. Our work offers a data-efficient approach to capturing both individual and collective behaviors, making it highly applicable to contexts with hierarchical structures. We expect the algorithm to be applicable to other settings with layered nuances of user preferences and information.

Related papers

STARec: An Efficient Agent Framework for Recommender Systems via Autonomous Deliberate Reasoning [54.28691219536054]
We introduce STARec, a slow-thinking augmented agent framework that endows recommender systems with autonomous deliberative reasoning capabilities.<n>We develop anchored reinforcement training - a two-stage paradigm combining structured knowledge distillation from advanced reasoning models with preference-aligned reward shaping.<n>Experiments on MovieLens 1M and Amazon CDs benchmarks demonstrate that STARec achieves substantial performance gains compared with state-of-the-art baselines.
arXiv Detail & Related papers (2025-08-26T08:47:58Z)
Hierarchical Group-wise Ranking Framework for Recommendation Models [0.0]
CTR/CVR models are increasingly trained with ranking objectives to improve item ranking quality.<n>Current methods rely on in-batch negative sampling, which predominantly surfaces easy negatives.<n>We propose a Hierarchical Group-wise Ranking Framework with two key components.
arXiv Detail & Related papers (2025-06-15T07:47:26Z)
Multi-agents based User Values Mining for Recommendation [52.26100802380767]
We propose a zero-shot multi-LLM collaborative framework for effective and accurate user value extraction.<n>We apply text summarization techniques to condense item content while preserving essential meaning.<n>To mitigate hallucinations, we introduce two specialized agent roles: evaluators and supervisors.
arXiv Detail & Related papers (2025-05-02T04:01:31Z)
Online Clustering of Dueling Bandits [59.09590979404303]
We introduce the first "clustering of dueling bandit algorithms" to enable collaborative decision-making based on preference feedback. We propose two novel algorithms: (1) Clustering of Linear Dueling Bandits (COLDB) which models the user reward functions as linear functions of the context vectors, and (2) Clustering of Neural Dueling Bandits (CONDB) which uses a neural network to model complex, non-linear user reward functions.
arXiv Detail & Related papers (2025-02-04T07:55:41Z)
The Nah Bandit: Modeling User Non-compliance in Recommendation Systems [2.421459418045937]
Expert with Clustering (EWC) is a hierarchical approach that incorporates feedback from both recommended and non-recommended options to accelerate user preference learning. EWC outperforms both supervised learning and traditional contextual bandit approaches. This work lays the foundation for future research in Nah Bandit, providing a robust framework for more effective recommendation systems.
arXiv Detail & Related papers (2024-08-15T03:01:02Z)
Retrieval Augmentation via User Interest Clustering [57.63883506013693]
Industrial recommender systems are sensitive to the patterns of user-item engagement. We propose a novel approach that efficiently constructs user interest and facilitates low computational cost inference. Our approach has been deployed in multiple products at Meta, facilitating short-form video related recommendation.
arXiv Detail & Related papers (2024-08-07T16:35:10Z)
A Machine Learning-Based Framework for Clustering Residential Electricity Load Profiles to Enhance Demand Response Programs [0.0]
We present a novel machine learning based framework in order to achieve optimal load profiling through a real case study. In this paper, we present a novel machine learning based framework in order to achieve optimal load profiling through a real case study.
arXiv Detail & Related papers (2023-10-31T11:23:26Z)
Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data. The first strategy performs a local neighborhood sampling to reduce the dataset size in each without violating neighborhood relationships. A second strategy leverages a novel Re-Ranking technique, which has a lower time upper bound complexity and reduces the memory complexity from O(n2) to O(kn) with k n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z)
ClusterSeq: Enhancing Sequential Recommender Systems with Clustering based Meta-Learning [3.168790535780547]
ClusterSeq is a Meta-Learning Clustering-Based Sequential Recommender System. It exploits dynamic information in the user sequence to enhance item prediction accuracy, even in the absence of side information. Our proposed approach achieves a substantial improvement of 16-39% in Mean Reciprocal Rank (MRR)
arXiv Detail & Related papers (2023-07-25T18:53:24Z)
Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems [38.75457258877731]
We present a framework for benchmarking the degree of manipulations of recommendation algorithms. We find that a high online click-through rate does not necessarily mean a better understanding of user initial preference. We advocate that future recommendation algorithm studies should be treated as an optimization problem with constrained user preference manipulations.
arXiv Detail & Related papers (2022-10-11T17:56:55Z)
Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting. We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability of the user to click on an item, has become increasingly significant in recommender systems. Recent deep learning models with the ability to automatically extract the user interest from his/her behaviors have achieved great success. We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
Meta Clustering Learning for Large-scale Unsupervised Person Re-identification [124.54749810371986]
We propose a "small data for big task" paradigm dubbed Meta Clustering Learning (MCL) MCL only pseudo-labels a subset of the entire unlabeled data via clustering to save computing for the first-phase training. Our method significantly saves computational cost while achieving a comparable or even better performance compared to prior works.
arXiv Detail & Related papers (2021-11-19T04:10:18Z)
Learning with Multiclass AUC: Theory and Algorithms [141.63211412386283]
Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems. In this paper, we start an early trial to consider the problem of learning multiclass scoring functions via optimizing multiclass AUC metrics.
arXiv Detail & Related papers (2021-07-28T05:18:10Z)
An Efficient Framework for Clustered Federated Learning [26.24231986590374]
We address the problem of federated learning (FL) where users are distributed into clusters. We propose the Iterative Federated Clustering Algorithm (IFCA) We show that our algorithm is efficient in non- partitioned problems such as neural networks.
arXiv Detail & Related papers (2020-06-07T08:48:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.