MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization
- URL: http://arxiv.org/abs/2506.07563v3
- Date: Wed, 11 Jun 2025 07:55:32 GMT
- Title: MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization
- Authors: Ken Yaggel, Eyal German, Aviel Ben Siman Tov
- Abstract summary: MoE-MLoRA is a mixture-of-experts framework where each expert is first trained independently to specialize in its domain. We evaluate MoE-MLoRA across eight CTR models on Movielens and Taobao.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Personalized recommendation systems must adapt to user interactions across different domains. Traditional approaches like MLoRA apply a single adaptation per domain but lack flexibility in handling diverse user behaviors. To address this, we propose MoE-MLoRA, a mixture-of-experts framework where each expert is first trained independently to specialize in its domain before a gating network is trained to weight their contributions dynamically. We evaluate MoE-MLoRA across eight CTR models on Movielens and Taobao, showing that it improves performance on large-scale, dynamic datasets (+1.45 Weighted-AUC in Taobao-20) but offers limited benefits on structured datasets with low domain diversity and sparsity. Further analysis of the number of experts per domain reveals that larger ensembles do not always improve performance, indicating the need for model-aware tuning. Our findings highlight the potential of expert-based architectures for multi-domain recommendation systems, demonstrating that task-aware specialization and adaptive gating can enhance predictive accuracy in complex environments. The implementation and code are available in our GitHub repository.
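A minimal sketch of the two-stage idea described in the abstract, assuming a PyTorch-style CTR backbone: one LoRA expert per domain wrapped around a frozen shared layer, trained independently in stage one, then combined by a gating network trained in stage two. The layer shapes, the domain-conditioned gate, and all names below are illustrative assumptions, not the authors' released implementation (see their GitHub repository for that).

```python
# Hypothetical sketch of MoE-MLoRA's two stages: per-domain LoRA experts
# around a frozen shared layer, then a gating network that weights them.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRAExpert(nn.Module):
    """Low-rank adapter: adds scale * B(A(x)) on top of a frozen base layer."""

    def __init__(self, in_dim, out_dim, rank=8, alpha=16.0):
        super().__init__()
        self.A = nn.Parameter(torch.randn(rank, in_dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_dim, rank))  # zero-init: adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return F.linear(F.linear(x, self.A), self.B) * self.scale


class MoEMLoRALayer(nn.Module):
    """Frozen shared layer + one LoRA expert per domain + a domain-conditioned gate."""

    def __init__(self, in_dim, out_dim, num_domains, rank=8):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        self.base.requires_grad_(False)  # the pretrained backbone layer stays frozen
        self.experts = nn.ModuleList(
            [LoRAExpert(in_dim, out_dim, rank) for _ in range(num_domains)]
        )
        # Gate conditioned on the domain id (an assumption; it could also take
        # input features). Outputs one weight per expert.
        self.gate = nn.Sequential(
            nn.Embedding(num_domains, 4 * num_domains),
            nn.ReLU(),
            nn.Linear(4 * num_domains, num_domains),
        )

    def forward(self, x, domain_id, stage="gating"):
        base_out = self.base(x)
        if stage == "expert":
            # Stage 1: each expert sees only batches from its own domain.
            return base_out + self.experts[domain_id](x)
        # Stage 2: experts are kept fixed; the gate learns to weight their outputs.
        weights = torch.softmax(
            self.gate(torch.tensor([domain_id])), dim=-1
        ).squeeze(0)                                                    # (num_experts,)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, out, E)
        return base_out + (expert_out * weights).sum(dim=-1)


layer = MoEMLoRALayer(in_dim=64, out_dim=32, num_domains=5)
x = torch.randn(16, 64)
y_stage1 = layer(x, domain_id=2, stage="expert")  # domain-specialized adaptation
y_stage2 = layer(x, domain_id=2, stage="gating")  # gated mixture of all experts
```

In stage two one would also freeze the expert parameters so that only the gate is optimized; the paper plugs this pattern into eight different CTR backbones.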
Related papers
- Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models [5.466962214217334]
Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks, such as named entity recognition (NER). We propose the SaM framework, which dynamically Selects and Merges expert models at inference time.
arXiv Detail & Related papers (2025-06-28T08:28:52Z) - Large Language Model Empowered Recommendation Meets All-domain Continual Pre-Training [60.38082979765664]
CPRec is an All-domain Continual Pre-Training framework for Recommendation. It holistically aligns LLMs with universal user behaviors through the continual pre-training paradigm. We conduct experiments on five real-world datasets from two distinct platforms.
arXiv Detail & Related papers (2025-04-11T20:01:25Z) - Adaptive Domain Scaling for Personalized Sequential Modeling in Recommenders [19.237606439035364]
We present the Adaptive Domain Scaling (ADS) model, which comprehensively enhances the personalization capability in target-aware sequence modeling. ADS comprises two major modules: personalized sequence representation generation (PSRG) and personalized candidate representation generation (PCRG). Experiments are performed on both a public dataset and two billion-scale industrial datasets, and the extensive results verify the high effectiveness and compatibility of ADS.
arXiv Detail & Related papers (2025-02-08T11:05:22Z) - LFME: A Simple Framework for Learning from Multiple Experts in Domain Generalization [61.16890890570814]
Domain generalization (DG) methods aim to maintain good performance in an unseen target domain by using training data from multiple source domains.
This work introduces a simple yet effective framework, dubbed learning from multiple experts (LFME), that aims to make the target model an expert in all source domains to improve DG.
arXiv Detail & Related papers (2024-10-22T13:44:10Z) - Scalable Multi-Domain Adaptation of Language Models using Modular Experts [10.393155077703653]
MoDE is a mixture-of-experts architecture that augments a general PLM with modular, domain-specialized experts.
MoDE matches the target performance of full-parameter fine-tuning while achieving 1.65% better retention performance.
arXiv Detail & Related papers (2024-10-14T06:02:56Z) - MLoRA: Multi-Domain Low-Rank Adaptive Network for CTR Prediction [18.524017579108044]
We propose a Multi-domain Low-Rank Adaptive network (MLoRA) for CTR prediction, where we introduce a specialized LoRA module for each domain.
Experimental results demonstrate our MLoRA approach achieves a significant improvement compared with state-of-the-art baselines.
The code of our MLoRA is publicly available.
arXiv Detail & Related papers (2024-08-14T05:53:02Z) - Flexible and Adaptable Summarization via Expertise Separation [59.26639426529827]
A proficient summarization model should exhibit both flexibility and adaptability.
We propose MoeSumm, a Mixture-of-Expert Summarization architecture.
Our model's distinct separation of general and domain-specific summarization abilities grants it notable flexibility and adaptability.
arXiv Detail & Related papers (2024-06-08T05:31:19Z) - M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework [32.68911775382326]
M3oE is an adaptive Multi-domain Multi-task Mixture-of-Experts recommendation framework.
We leverage three mixture-of-experts modules to learn common, domain-aspect, and task-aspect user preferences.
We design a two-level fusion mechanism for precise control over feature extraction and fusion across diverse domains and tasks.
arXiv Detail & Related papers (2024-04-29T06:59:30Z) - Harder Tasks Need More Experts: Dynamic Routing in MoE Models [58.18526590138739]
We introduce a novel dynamic expert selection framework for Mixture of Experts (MoE) models.
Our method dynamically selects experts based on the confidence level in expert selection for each input.
arXiv Detail & Related papers (2024-03-12T13:41:15Z) - Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts [74.40198929049959]
Large multi-modal models (LMMs) exhibit remarkable performance across numerous tasks.
However, generalist LMMs often suffer from performance degradation when tuned over a large collection of tasks.
We propose Omni-SMoLA, an architecture that uses the Soft MoE approach to mix many multimodal low-rank experts.
arXiv Detail & Related papers (2023-12-01T23:04:27Z) - Multiple Expert Brainstorming for Domain Adaptive Person Re-identification [140.3998019639158]
We propose a multiple expert brainstorming network (MEB-Net) for domain adaptive person re-ID.
MEB-Net adopts a mutual learning strategy, where multiple networks with different architectures are pre-trained within a source domain.
Experiments on large-scale datasets demonstrate the superior performance of MEB-Net over state-of-the-art methods.
arXiv Detail & Related papers (2020-07-03T08:16:19Z)