Harnessing Data from Clustered LQR Systems: Personalized and Collaborative Policy Optimization
- URL: http://arxiv.org/abs/2511.17489v1
- Date: Fri, 21 Nov 2025 18:45:53 GMT
- Title: Harnessing Data from Clustered LQR Systems: Personalized and Collaborative Policy Optimization
- Authors: Vinay Kanakeri, Shivam Bajaj, Ashwin Verma, Vijay Gupta, Aritra Mitra,
- Abstract summary: We show how clustering can be used in data-driven control to learn personalized policies.<n>Our work is the first to reveal how clustering can be used in data-driven control to learn personalized policies.
- Score: 4.770545528276722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is known that reinforcement learning (RL) is data-hungry. To improve sample-efficiency of RL, it has been proposed that the learning algorithm utilize data from 'approximately similar' processes. However, since the process models are unknown, identifying which other processes are similar poses a challenge. In this work, we study this problem in the context of the benchmark Linear Quadratic Regulator (LQR) setting. Specifically, we consider a setting with multiple agents, each corresponding to a copy of a linear process to be controlled. The agents' local processes can be partitioned into clusters based on similarities in dynamics and tasks. Combining ideas from sequential elimination and zeroth-order policy optimization, we propose a new algorithm that performs simultaneous clustering and learning to output a personalized policy (controller) for each cluster. Under a suitable notion of cluster separation that captures differences in closed-loop performance across systems, we prove that our approach guarantees correct clustering with high probability. Furthermore, we show that the sub-optimality gap of the policy learned for each cluster scales inversely with the size of the cluster, with no additional bias, unlike in prior works on collaborative learning-based control. Our work is the first to reveal how clustering can be used in data-driven control to learn personalized policies that enjoy statistical gains from collaboration but do not suffer sub-optimality due to inclusion of data from dissimilar processes. From a distributed implementation perspective, our method is attractive as it incurs only a mild logarithmic communication overhead.
Related papers
- RCC-PFL: Robust Client Clustering under Noisy Labels in Personalized Federated Learning [8.37314799155978]
We propose a label-agnostic data similarity-based clustering algorithm, coined RCC-PFL.<n>We show that it outperforms multiple baselines in terms of average accuracy and variance reduction.
arXiv Detail & Related papers (2025-03-25T17:50:54Z) - Data Similarity-Based One-Shot Clustering for Multi-Task Hierarchical Federated Learning [8.37314799155978]
We propose a one-shot clustering algorithm that can effectively identify and group users based on their data similarity.
Our proposed algorithm not only enhances the clustering process, but also overcomes challenges related to privacy concerns, communication overhead, and the need for prior knowledge about learning models or loss function behaviors.
arXiv Detail & Related papers (2024-10-03T17:51:21Z) - End-to-end Learnable Clustering for Intent Learning in Recommendation [54.157784572994316]
We propose a novel intent learning method termed underlineELCRec.
It unifies behavior representation learning into an underlineEnd-to-end underlineLearnable underlineClustering framework.
We deploy this method on the industrial recommendation system with 130 million page views and achieve promising results.
arXiv Detail & Related papers (2024-01-11T15:22:55Z) - A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
arXiv Detail & Related papers (2023-12-17T04:41:07Z) - Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering.
In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework.
In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z) - Hard Regularization to Prevent Deep Online Clustering Collapse without
Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach the collapsed solution where the encoder maps all inputs to the same point and all are put into a single cluster.
We propose a method that does not require data augmentation, and that, differently from existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z) - Dynamic Clustering and Cluster Contrastive Learning for Unsupervised
Person Re-identification [29.167783500369442]
Unsupervised Re-ID methods aim at learning robust and discriminative features from unlabeled data.
We propose a dynamic clustering and cluster contrastive learning (DCCC) method.
Experiments on several widely used public datasets validate the effectiveness of our proposed DCCC.
arXiv Detail & Related papers (2023-03-13T01:56:53Z) - A One-shot Framework for Distributed Clustered Learning in Heterogeneous
Environments [54.172993875654015]
The paper proposes a family of communication efficient methods for distributed learning in heterogeneous environments.
One-shot approach, based on local computations at the users and a clustering based aggregation step at the server is shown to provide strong learning guarantees.
For strongly convex problems it is shown that, as long as the number of data points per user is above a threshold, the proposed approach achieves order-optimal mean-squared error rates in terms of the sample size.
arXiv Detail & Related papers (2022-09-22T09:04:10Z) - Personalized Federated Learning via Convex Clustering [72.15857783681658]
We propose a family of algorithms for personalized federated learning with locally convex user costs.
The proposed framework is based on a generalization of convex clustering in which the differences between different users' models are penalized.
arXiv Detail & Related papers (2022-02-01T19:25:31Z) - Very Compact Clusters with Structural Regularization via Similarity and
Connectivity [3.779514860341336]
We propose an end-to-end deep clustering algorithm, i.e., Very Compact Clusters (VCC) for the general datasets.
Our proposed approach achieves better clustering performance over most of the state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z) - You Never Cluster Alone [150.94921340034688]
We extend the mainstream contrastive learning paradigm to a cluster-level scheme, where all the data subjected to the same cluster contribute to a unified representation.
We define a set of categorical variables as clustering assignment confidence, which links the instance-level learning track with the cluster-level one.
By reparametrizing the assignment variables, TCC is trained end-to-end, requiring no alternating steps.
arXiv Detail & Related papers (2021-06-03T14:59:59Z) - Unsupervised Visual Representation Learning by Online Constrained
K-Means [44.38989920488318]
Cluster discrimination is an effective pretext task for unsupervised representation learning.
We propose a novel clustering-based pretext task with online textbfConstrained textbfK-mtextbfeans (textbfCoKe)
Our online assignment method has a theoretical guarantee to approach the global optimum.
arXiv Detail & Related papers (2021-05-24T20:38:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.