Related papers: C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems

C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems

URL: http://arxiv.org/abs/2510.02215v2
Date: Fri, 03 Oct 2025 15:45:09 GMT
Title: C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems
Authors: Mertcan Cokbas, Ziteng Liu, Zeyi Tao, Elder Veliz, Qin Huang, Ellie Wen, Huayu Li, Qiang Jin, Murat Duman, Benjamin Au, Guy Lebanon, Sagar Chordia, Chengkai Zhang,
Abstract summary: We show how the attention mechanism can play a key role in factorization machines for shared embedding selection.<n>We propose to address this challenge by analyzing the substructures in the dataset and exposing those with strong distributional contrast through auxiliary learning.<n>This approach customizes the learning process of attention layers to preserve mutual information with minority cohorts while improving global performance.
Score: 7.548682352355034
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Training large-scale recommendation models under a single global objective implicitly assumes homogeneity across user populations. However, real-world data are composites of heterogeneous cohorts with distinct conditional distributions. As models increase in scale and complexity and as more data is used for training, they become dominated by central distribution patterns, neglecting head and tail regions. This imbalance limits the model's learning ability and can result in inactive attention weights or dead neurons. In this paper, we reveal how the attention mechanism can play a key role in factorization machines for shared embedding selection, and propose to address this challenge by analyzing the substructures in the dataset and exposing those with strong distributional contrast through auxiliary learning. Unlike previous research, which heuristically applies weighted labels or multi-task heads to mitigate such biases, we leverage partially conflicting auxiliary labels to regularize the shared representation. This approach customizes the learning process of attention layers to preserve mutual information with minority cohorts while improving global performance. We evaluated C2AL on massive production datasets with billions of data points each for six SOTA models. Experiments show that the factorization machine is able to capture fine-grained user-ad interactions using the proposed method, achieving up to a 0.16% reduction in normalized entropy overall and delivering gains exceeding 0.30% on targeted minority cohorts.

Related papers

Enhancing Federated Learning Through Secure Cluster-Weighted Client Aggregation [4.869042695112397]
Federated learning (FL) has emerged as a promising paradigm in machine learning.<n>In FL, a global model is trained iteratively on local datasets residing on individual devices.<n>This paper introduces a novel FL framework, ClusterGuardFL, that employs dissimilarity scores, k-means clustering, and reconciliation confidence scores to dynamically assign weights to client updates.
arXiv Detail & Related papers (2025-03-29T04:29:24Z)
You Are Your Own Best Teacher: Achieving Centralized-level Performance in Federated Learning under Heterogeneous and Long-tailed Data [54.56492110703343]
Data heterogeneity, stemming from local non-IID data and global long-tailed distributions, is a major challenge in federated learning (FL)<n>We propose FedYoYo to improve representation learning by distilling knowledge between weakly and strongly augmented local samples.<n>We show FedYoYo achieves state-of-the-art results, even surpassing centralized logit adjustment methods by 5.4% under global long-tailed settings.
arXiv Detail & Related papers (2025-03-10T04:57:20Z)
Hybrid-Regularized Magnitude Pruning for Robust Federated Learning under Covariate Shift [2.298932494750101]
We show that inconsistencies in client-side training distributions substantially degrade the performance of federated learning models.<n>We propose a novel FL framework using a combination of pruning and regularisation of clients' training to improve the sparsity, redundancy, and robustness of neural connections.
arXiv Detail & Related papers (2024-12-19T16:22:37Z)
Reducing Spurious Correlation for Federated Domain Generalization [15.864230656989854]
In open-world scenarios, global models may struggle to predict well on entirely new domain data captured by certain media. Existing methods still rely on strong statistical correlations between samples and labels to address this issue. We introduce FedCD, an overall optimization framework at both the local and global levels.
arXiv Detail & Related papers (2024-07-27T05:06:31Z)
Federated cINN Clustering for Accurate Clustered Federated Learning [33.72494731516968]
Federated Learning (FL) presents an innovative approach to privacy-preserving distributed machine learning. We propose the Federated cINN Clustering Algorithm (FCCA) to robustly cluster clients into different groups.
arXiv Detail & Related papers (2023-09-04T10:47:52Z)
Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers. We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes. We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
Evaluating and Incentivizing Diverse Data Contributions in Collaborative Learning [89.21177894013225]
For a federated learning model to perform well, it is crucial to have a diverse and representative dataset. We show that the statistical criterion used to quantify the diversity of the data, as well as the choice of the federated learning algorithm used, has a significant effect on the resulting equilibrium. We leverage this to design simple optimal federated learning mechanisms that encourage data collectors to contribute data representative of the global population.
arXiv Detail & Related papers (2023-06-08T23:38:25Z)
Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively. The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model. In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
Auxo: Efficient Federated Learning via Scalable Client Clustering [22.323057948281644]
Federated learning (FL) enables edge devices to collaboratively train ML models without revealing their raw data to a logically centralized server. We propose Auxo to gradually identify clients with statistically similar data distributions (cohorts) in large-scale, low-availability, and resource-constrained FL populations. We show Auxo boosts various existing FL solutions in terms of final accuracy (2.1% - 8.2%), convergence time (up to 2.2x), and model bias (4.8% - 53.8%)
arXiv Detail & Related papers (2022-10-29T17:36:51Z)
DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data. We propose a general framework to solve the above two challenges simultaneously. We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation [0.8466401378239363]
We propose a generalized affine conditioning framework to learn and account for cohort biases across multi-source datasets. We show that our cohort bias adaptation method improves performance of the network on pooled datasets.
arXiv Detail & Related papers (2021-08-02T08:32:57Z)
Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning. It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers. Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
arXiv Detail & Related papers (2020-05-03T09:14:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.