Crocodile: Cross Experts Covariance for Disentangled Learning in Multi-Domain Recommendation
- URL: http://arxiv.org/abs/2405.12706v2
- Date: Wed, 20 Nov 2024 09:35:09 GMT
- Title: Crocodile: Cross Experts Covariance for Disentangled Learning in Multi-Domain Recommendation
- Authors: Zhutian Lin, Junwei Pan, Haibin Yu, Xi Xiao, Ximei Wang, Zhixiang Feng, Shifeng Wen, Shudong Huang, Dapeng Liu, Lei Xiao
- Abstract summary: We propose a novel Cross-experts Covariance Loss for Disentangled Learning model (Crocodile).
It employs multiple embedding tables to make the model domain-aware at the embedding level, which holds most of the parameters in the model.
Crocodile achieves a 0.72% CTR lift and a 0.73% GMV lift on a primary advertising scenario.
- Score: 23.010588147317623
- Abstract: Multi-domain learning (MDL) has become a prominent topic in enhancing the quality of personalized services. It is critical to learn commonalities between domains while preserving the distinct characteristics of each one, which leads to a challenging dilemma in MDL. On the one hand, a model needs to leverage domain-aware modules such as experts or embeddings to preserve each domain's distinctiveness. On the other hand, real-world datasets often exhibit long-tailed distributions across domains, where some domains may lack sufficient samples to effectively train their specific modules. Unfortunately, nearly all existing work falls short of resolving this dilemma. To this end, we propose a novel Cross-experts Covariance Loss for Disentangled Learning model (Crocodile), which employs multiple embedding tables to make the model domain-aware at the embedding level, where most of the model's parameters reside, together with a covariance loss upon these embeddings to disentangle them, enabling the model to capture diverse user interests across domains. Empirical analysis demonstrates that our method successfully addresses both challenges and outperforms all state-of-the-art methods on public datasets. During online A/B testing on Tencent's advertising platform, Crocodile achieves a 0.72% CTR lift and a 0.73% GMV lift on a primary advertising scenario.
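A minimal PyTorch sketch of the two ingredients the abstract describes: per-domain embedding tables, and a covariance penalty that decorrelates them. The module names, shapes, and the exact form of the loss are illustrative assumptions, not the paper's implementation.
```python
import torch
import torch.nn as nn

class DomainAwareEmbeddings(nn.Module):
    """One embedding table per domain, so the domain-aware capacity sits
    in the embeddings (illustrative sizes; not the paper's code)."""
    def __init__(self, num_ids: int, dim: int, num_domains: int):
        super().__init__()
        self.tables = nn.ModuleList(
            nn.Embedding(num_ids, dim) for _ in range(num_domains)
        )

    def forward(self, ids: torch.Tensor) -> torch.Tensor:
        # Look up the same ids in every table: (num_domains, batch, dim).
        return torch.stack([table(ids) for table in self.tables])

def cross_table_covariance_loss(embs: torch.Tensor) -> torch.Tensor:
    """Penalize covariance between different tables' embeddings so each
    table captures a distinct interest factor. A guess at the loss's
    spirit, not its published form."""
    num_tables = embs.shape[0]
    centered = embs - embs.mean(dim=1, keepdim=True)   # center per table
    flat = centered.reshape(num_tables, -1)            # (tables, batch*dim)
    cov = flat @ flat.t() / flat.shape[1]              # (tables, tables)
    off_diag = cov - torch.diag(torch.diag(cov))       # keep only cross terms
    return off_diag.pow(2).sum()

emb = DomainAwareEmbeddings(num_ids=1000, dim=16, num_domains=3)
e = emb(torch.randint(0, 1000, (32,)))
aux = cross_table_covariance_loss(e)  # add to the CTR loss with a small weight
```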
Related papers
- FISC: Federated Domain Generalization via Interpolative Style Transfer and Contrastive Learning [5.584498171854557]
Federated Learning (FL) shows promise in preserving privacy and enabling collaborative learning.
We introduce FISC, a novel FL domain generalization paradigm that handles more complex domain distributions across clients.
Our method achieves accuracy improvements ranging from 3.64% to 57.22% on unseen domains.
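The summary does not spell out FISC's transfer mechanism; as a hedged illustration, interpolative style transfer is often realized AdaIN-style by interpolating channel-wise feature statistics between two domains, as in this sketch (all names are hypothetical):
```python
import torch

def interpolate_style(content: torch.Tensor, other: torch.Tensor,
                      alpha: float = 0.5, eps: float = 1e-5) -> torch.Tensor:
    """Re-style `content` features (N, C, H, W) with statistics interpolated
    between its own and another domain's, AdaIN-style. A stand-in for the
    interpolative transfer named above, not FISC's actual operator."""
    mu_c = content.mean(dim=(2, 3), keepdim=True)
    sd_c = content.std(dim=(2, 3), keepdim=True) + eps
    mu_o = other.mean(dim=(2, 3), keepdim=True)
    sd_o = other.std(dim=(2, 3), keepdim=True) + eps
    mu = (1 - alpha) * mu_c + alpha * mu_o   # interpolated style mean
    sd = (1 - alpha) * sd_c + alpha * sd_o   # interpolated style scale
    return (content - mu_c) / sd_c * sd + mu
```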
arXiv Detail & Related papers (2024-10-30T00:50:23Z)
- MLoRA: Multi-Domain Low-Rank Adaptive Network for CTR Prediction [18.524017579108044]
We propose a Multi-domain Low-Rank Adaptive network (MLoRA) for CTR prediction, where we introduce a specialized LoRA module for each domain.
Experimental results demonstrate our MLoRA approach achieves a significant improvement compared with state-of-the-art baselines.
The code of our MLoRA is publicly available.
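A rough PyTorch sketch of the stated idea: a shared layer plus one low-rank LoRA adapter per domain. Ranks, initialization, and wiring are assumptions, not MLoRA's actual code.
```python
import torch
import torch.nn as nn

class MultiDomainLoRALinear(nn.Module):
    """A shared linear layer plus one low-rank (A, B) adapter per domain,
    in the spirit of a per-domain LoRA module (sizes are assumptions)."""
    def __init__(self, d_in: int, d_out: int, num_domains: int, rank: int = 4):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out)
        for p in self.shared.parameters():   # shared part stays frozen
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(num_domains, d_in, rank) * 0.01)
        # B is zero-initialized so every adapter starts as a no-op,
        # following standard LoRA practice.
        self.B = nn.Parameter(torch.zeros(num_domains, rank, d_out))

    def forward(self, x: torch.Tensor, domain: int) -> torch.Tensor:
        delta = x @ self.A[domain] @ self.B[domain]   # domain-specific update
        return self.shared(x) + delta
```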
arXiv Detail & Related papers (2024-08-14T05:53:02Z)
- Decoupled Training: Return of Frustratingly Easy Multi-Domain Learning [20.17925272562433]
Multi-domain learning aims to train a model with minimal average risk across multiple overlapping but non-identical domains.
We propose Decoupled Training (D-Train) as a frustratingly easy, hyperparameter-free multi-domain learning method.
D-Train is a tri-phase general-to-specific training strategy that first pre-trains on all domains to warm up a root model, then post-trains on each domain by splitting into multi-heads, and finally fine-tunes the heads by fixing the backbone.
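A schematic of the tri-phase schedule as described; the loaders, heads, optimizers, and head initialization are placeholders, not the paper's setup.
```python
import torch
import torch.nn as nn

def d_train(backbone: nn.Module, root_head: nn.Module, heads: list,
            all_loader, domain_loaders, loss_fn, epochs=(1, 1, 1)):
    """Skeleton of D-Train's general-to-specific schedule (illustrative)."""
    # Phase 1: pre-train on the union of all domains to warm up a root model.
    opt = torch.optim.Adam([*backbone.parameters(), *root_head.parameters()])
    for _ in range(epochs[0]):
        for x, y in all_loader:
            opt.zero_grad()
            loss_fn(root_head(backbone(x)), y).backward()
            opt.step()
    # Phase 2: split into multi-heads (here: copies of the root head) and
    # post-train on each domain.
    for head, loader in zip(heads, domain_loaders):
        head.load_state_dict(root_head.state_dict())
        opt = torch.optim.Adam([*backbone.parameters(), *head.parameters()])
        for _ in range(epochs[1]):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(head(backbone(x)), y).backward()
                opt.step()
    # Phase 3: freeze the backbone and fine-tune only the heads.
    for p in backbone.parameters():
        p.requires_grad_(False)
    for head, loader in zip(heads, domain_loaders):
        opt = torch.optim.Adam(head.parameters())
        for _ in range(epochs[2]):
            for x, y in loader:
                opt.zero_grad()
                loss_fn(head(backbone(x)), y).backward()
                opt.step()
```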
arXiv Detail & Related papers (2023-09-19T04:06:41Z)
- Multi-Domain Long-Tailed Learning by Augmenting Disentangled Representations [80.76164484820818]
There is an inescapable long-tailed class-imbalance issue in many real-world classification problems.
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
The proposed method, TALLY, builds upon a selective balanced sampling strategy and achieves this by mixing the semantic representation of one example with the domain-associated nuisances of another.
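A sketch of the recombination step described above, assuming the semantic and nuisance factors have already been disentangled into additive components; TALLY's actual composition may differ.
```python
import torch

def tally_mix(semantic: torch.Tensor, nuisance: torch.Tensor) -> torch.Tensor:
    """Pair each example's semantic representation with another example's
    domain-associated nuisance. Factors are recombined here by addition;
    the real composition depends on how they were disentangled.
    semantic, nuisance: (batch, dim) tensors from the same batch."""
    perm = torch.randperm(semantic.shape[0])
    augmented = semantic + nuisance[perm]   # semantics + a foreign domain's nuisance
    return augmented                        # class labels follow `semantic`'s order
```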
arXiv Detail & Related papers (2022-10-25T21:54:26Z)
- ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning [95.78635058475439]
Cross-Domain Few-Shot Learning aims at addressing the Few-Shot Learning problem across different domains.
This paper contributes a novel Multi-Expert Domain Decompositional Network (ME-D2N).
We present a novel domain decomposition module that learns to decompose the student model into two domain-related sub-parts.
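The decomposition module is only named here; one hedged reading is a learned per-unit gate that splits a student layer's output into two domain-related sub-parts, as sketched below (purely illustrative).
```python
import torch
import torch.nn as nn

class DecomposedLayer(nn.Module):
    """A student layer whose units are softly split into two domain-related
    sub-parts by a learned gate. A rough reading of the decomposition
    module, not ME-D2N's actual design."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.fc = nn.Linear(d_in, d_out)
        self.gate = nn.Parameter(torch.zeros(d_out))  # per-unit domain assignment

    def forward(self, x: torch.Tensor):
        h = self.fc(x)
        g = torch.sigmoid(self.gate)  # ~1: first-domain part, ~0: second-domain part
        return g * h, (1 - g) * h     # the two domain-related sub-parts
```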
arXiv Detail & Related papers (2022-10-11T09:24:47Z)
- Forget Less, Count Better: A Domain-Incremental Self-Distillation Learning Benchmark for Lifelong Crowd Counting [51.44987756859706]
Off-the-shelf methods have drawbacks when handling multiple domains.
Lifelong Crowd Counting aims to alleviate catastrophic forgetting and improve generalization ability.
arXiv Detail & Related papers (2022-05-06T15:37:56Z)
- META: Mimicking Embedding via oThers' Aggregation for Generalizable Person Re-identification [68.39849081353704]
Domain generalizable (DG) person re-identification (ReID) aims to test across unseen domains without access to the target domain data at training time.
This paper presents a new approach called Mimicking Embedding via oThers' Aggregation (META) for DG ReID.
arXiv Detail & Related papers (2021-12-16T08:06:50Z)
- Generalizable Person Re-identification with Relevance-aware Mixture of Experts [45.13716166680772]
We propose a novel method called the relevance-aware mixture of experts (RaMoE).
RaMoE uses an effective voting-based mixture mechanism to dynamically leverage source domains' diverse characteristics to improve the model's generalization.
Since the target domains are unseen during training, we propose a novel learning-to-learn algorithm, combined with our relation alignment loss, to update the voting network.
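A minimal sketch of a voting-based mixture in the spirit described: a voting network scores each source-domain expert's relevance and mixes their outputs. All shapes and modules here are assumptions.
```python
import torch
import torch.nn as nn

class VotingMixture(nn.Module):
    """One expert per source domain, combined by a voting network that
    scores each expert's relevance to the input."""
    def __init__(self, d_in: int, d_out: int, num_experts: int):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Linear(d_in, d_out) for _ in range(num_experts)
        )
        self.vote = nn.Linear(d_in, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        votes = torch.softmax(self.vote(x), dim=-1)           # (batch, E)
        outs = torch.stack([e(x) for e in self.experts], 1)   # (batch, E, d_out)
        return (votes.unsqueeze(-1) * outs).sum(dim=1)        # relevance-weighted mix
```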
arXiv Detail & Related papers (2021-05-19T14:19:34Z)
- Inferring Latent Domains for Unsupervised Deep Domain Adaptation [54.963823285456925]
Unsupervised Domain Adaptation (UDA) refers to the problem of learning a model in a target domain where labeled data are not available.
This paper introduces a novel deep architecture which addresses the problem of UDA by automatically discovering latent domains in visual datasets.
We evaluate our approach on publicly available benchmarks, showing that it outperforms state-of-the-art domain adaptation methods.
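Such an architecture needs a branch that softly assigns samples to latent domains without domain labels; a generic stand-in (not the paper's module) might look like this:
```python
import torch
import torch.nn as nn

class LatentDomainAssigner(nn.Module):
    """Softly assign each sample to one of K latent domains from its
    features alone, with no domain labels (a generic stand-in for the
    domain-discovery branch)."""
    def __init__(self, d_in: int, num_latent_domains: int):
        super().__init__()
        self.score = nn.Linear(d_in, num_latent_domains)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # (batch, K) memberships, usable to weight domain-specific layers.
        return torch.softmax(self.score(feats), dim=-1)
```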
arXiv Detail & Related papers (2021-03-25T14:33:33Z)
- Learning to Combine: Knowledge Aggregation for Multi-Source Domain Adaptation [56.694330303488435]
We propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework.
In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize information propagation among semantically adjacent representations.
Our approach outperforms existing methods by a remarkable margin.
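A simplified sketch of propagation over a prototype graph as described: edge weights come from prototype similarity, and each prototype aggregates its semantically adjacent neighbors. The real framework's graph construction and updates are richer.
```python
import torch

def propagate_prototypes(protos: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """One message-passing step over a graph of per-domain class prototypes.
    protos: (num_domains * num_classes, dim)."""
    sim = protos @ protos.t()                # pairwise similarity
    adj = torch.softmax(sim / tau, dim=-1)   # row-normalized edge weights
    return adj @ protos                      # aggregate adjacent prototypes
```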
arXiv Detail & Related papers (2020-07-17T07:52:44Z)