Automatic Expert Selection for Multi-Scenario and Multi-Task Search
- URL: http://arxiv.org/abs/2205.14321v1
- Date: Sat, 28 May 2022 03:41:25 GMT
- Title: Automatic Expert Selection for Multi-Scenario and Multi-Task Search
- Authors: Xinyu Zou, Zhi Hu, Yiming Zhao, Xuchu Ding, Zhongyi Liu, Chenliang Li,
Aixin Sun
- Abstract summary: We propose a novel Automatic Expert Selection framework for Multi-scenario and Multi-task search, named AESM2.
Experiments over two real-world large-scale datasets demonstrate the effectiveness of AESM2 over a battery of strong baselines.
Online A/B test also shows substantial performance gain on multiple metrics.
- Score: 41.47107282896807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-scenario learning (MSL) enables a service provider to cater for users'
fine-grained demands by separating services for different user sectors, e.g.,
by user's geographical region. Under each scenario, there is a need to optimize
multiple task-specific targets, e.g., click-through rate and conversion rate,
known as multi-task learning (MTL). Recent solutions for MSL and MTL are mostly
based on the multi-gate mixture-of-experts (MMoE) architecture. The MMoE
structure is typically static, and its design requires domain-specific
knowledge, making it less effective at handling both MSL and MTL. In this paper, we propose a
novel Automatic Expert Selection framework for Multi-scenario and Multi-task
search, named AESM^{2}. AESM^{2} integrates both MSL and MTL into a unified
framework with an automatic structure learning. Specifically, AESM^{2} stacks
multi-task layers over multi-scenario layers. This hierarchical design enables
us to flexibly establish intrinsic connections between different scenarios, and
at the same time also supports high-level feature extraction for different
tasks. At each multi-scenario/multi-task layer, a novel expert selection
algorithm is proposed to automatically identify scenario-/task-specific and
shared experts for each input. Experiments over two real-world large-scale
datasets demonstrate the effectiveness of AESM^{2} over a battery of strong
baselines. Online A/B test also shows substantial performance gain on multiple
metrics. Currently, AESM^{2} has been deployed online for serving major
traffic.
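The hierarchical design described above can be illustrated with a minimal NumPy sketch: each layer gates over a pool of experts, keeps only the top-k per input (a stand-in for the paper's automatic expert selection), and task layers are stacked over a scenario layer. All dimensions, expert counts, and task names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class ExpertSelectionLayer:
    """MMoE-style layer: a gate scores all experts per input and only the
    top-k experts are kept, loosely mirroring automatic expert selection."""
    def __init__(self, in_dim, hidden_dim, n_experts, k, seed=0):
        rng = np.random.default_rng(seed)
        self.experts = [rng.standard_normal((in_dim, hidden_dim)) * 0.1
                        for _ in range(n_experts)]
        self.gate = rng.standard_normal((in_dim, n_experts)) * 0.1
        self.k = k

    def __call__(self, x):
        scores = softmax(x @ self.gate)              # (batch, n_experts)
        # keep the k highest-scoring experts per input; zero out the rest
        kth = np.sort(scores, axis=-1)[:, -self.k][:, None]
        w = scores * (scores >= kth)
        w = w / w.sum(axis=-1, keepdims=True)        # renormalize weights
        outs = np.stack([np.tanh(x @ E) for E in self.experts], axis=1)
        return np.einsum('be,beh->bh', w, outs)      # weighted expert mix

# Hierarchical stacking: task-specific layers on top of a scenario layer.
scenario_layer = ExpertSelectionLayer(8, 16, n_experts=4, k=2)
task_layers = {t: ExpertSelectionLayer(16, 16, n_experts=4, k=2, seed=i)
               for i, t in enumerate(['ctr', 'cvr'])}

x = np.random.default_rng(1).standard_normal((5, 8))
shared = scenario_layer(x)                           # scenario-aware features
outputs = {t: layer(shared) for t, layer in task_layers.items()}
```

In the paper, the selection is learned jointly with the rest of the network and distinguishes scenario-/task-specific from shared experts; this sketch only shows the top-k gating mechanics on random weights.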
Related papers
- AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - Giving each task what it needs -- leveraging structured sparsity for tailored multi-task learning [4.462334751640166]
In the Multi-task Learning (MTL) framework, every task demands distinct feature representations, ranging from low-level to high-level attributes.
This work introduces Layer-d Multi-Task models that utilize structured sparsity to refine feature selection for individual tasks and enhance the performance of all tasks in a multi-task scenario.
arXiv Detail & Related papers (2024-06-05T08:23:38Z) - A Framework to Implement 1+N Multi-task Fine-tuning Pattern in LLMs
Using the CGC-LORA Algorithm [7.521690071464451]
We propose a unified framework that implements a 1 + N multi-task fine-tuning pattern in large language models (LLMs).
Our work aims to take advantage of both the MTL (i.e., CGC) and PEFT (i.e., LoRA) schemes.
arXiv Detail & Related papers (2024-01-22T07:58:31Z) - Task-Based MoE for Multitask Multilingual Machine Translation [58.20896429151824]
The mixture-of-experts (MoE) architecture has proven to be a powerful method for training deep models on diverse tasks in many applications.
In this work, we design a novel method that incorporates task information into MoE models at different granular levels with shared dynamic task-based adapters.
arXiv Detail & Related papers (2023-08-30T05:41:29Z) - Knowledge Assembly: Semi-Supervised Multi-Task Learning from Multiple
Datasets with Disjoint Labels [8.816979799419107]
Multi-Task Learning (MTL) is a suitable approach, but it usually requires datasets labeled for all tasks.
We propose a method that can leverage datasets labeled for only some of the tasks in the MTL framework.
Our work, Knowledge Assembly (KA), learns multiple tasks from disjoint datasets by leveraging the unlabeled data in a semi-supervised manner.
arXiv Detail & Related papers (2023-06-15T04:05:03Z) - HiNet: Novel Multi-Scenario & Multi-Task Learning with Hierarchical Information Extraction [50.40732146978222]
Multi-scenario & multi-task learning has been widely applied to many recommendation systems in industrial applications.
We propose a Hierarchical information extraction Network (HiNet) for multi-scenario and multi-task recommendation.
HiNet achieves a new state-of-the-art performance and significantly outperforms existing solutions.
arXiv Detail & Related papers (2023-03-10T17:24:41Z) - M$^3$ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task
Learning with Model-Accelerator Co-design [95.41238363769892]
Multi-task learning (MTL) encapsulates multiple learned tasks in a single model and often lets those tasks learn better jointly.
Current MTL regimes have to activate nearly the entire model even to execute a single task.
We present a model-accelerator co-design framework to enable efficient on-device MTL.
arXiv Detail & Related papers (2022-10-26T15:40:24Z) - Heterogeneous Multi-task Learning with Expert Diversity [15.714385295889944]
We introduce an approach to induce more diversity among experts, thus creating representations more suitable for highly imbalanced and heterogeneous MTL settings.
We validate our method on three MTL benchmark datasets, including Medical Information Mart for Intensive Care (MIMIC-III) and PubChem BioAssay (PCBA)
arXiv Detail & Related papers (2021-06-20T01:30:37Z) - Controllable Pareto Multi-Task Learning [55.945680594691076]
A multi-task learning system aims at solving multiple related tasks at the same time.
With a fixed model capacity, the tasks may conflict with one another, and the system usually has to make a trade-off when learning all of them together.
This work proposes a novel controllable multi-task learning framework, to enable the system to make real-time trade-off control among different tasks with a single model.
arXiv Detail & Related papers (2020-10-13T11:53:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.