MPFormer: Adaptive Framework for Industrial Multi-Task Personalized Sequential Retriever
- URL: http://arxiv.org/abs/2508.20400v1
- Date: Thu, 28 Aug 2025 03:53:55 GMT
- Title: MPFormer: Adaptive Framework for Industrial Multi-Task Personalized Sequential Retriever
- Authors: Yijia Sun, Shanshan Huang, Linxiao Che, Haitao Lu, Qiang Luo, Kun Gai, Guorui Zhou,
- Abstract summary: MPFormer is a dynamic multi-task Transformer framework for industrial recommendation systems. It is successfully integrated into the Kuaishou short video recommendation system, serving over 400 million daily active users.
- Score: 22.507173183511153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern industrial recommendation systems encounter a core challenge of multi-stage optimization misalignment: a significant semantic gap exists between the multi-objective optimization paradigm widely used in the ranking phase and the single-objective modeling in the retrieval phase. Although the mainstream industry solution achieves multi-objective coverage through parallel multi-path single-objective retrieval, this approach leads to linear growth of training and serving resources with the number of objectives and has inherent limitations in handling loosely coupled objectives. This paper proposes MPFormer, a dynamic multi-task Transformer framework, which systematically addresses these issues through three innovative mechanisms. First, an objective-conditioned Transformer jointly encodes user behavior sequences and multi-task semantics through learnable attention modulation; second, personalized target weights enable dynamic adjustment of retrieval results; finally, user personalization information is incorporated into token representations and the Transformer structure to further enhance the model's representational ability. The framework has been successfully integrated into the Kuaishou short video recommendation system, stably serving over 400 million daily active users, and significantly improves user daily engagement and system operational efficiency. Practical deployment verification shows that, compared with traditional solutions, it effectively streamlines the iterative paradigm of multi-objective retrieval while maintaining service response speed, providing a scalable multi-objective solution for industrial recommendation systems.
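The abstract's first mechanism, an objective-conditioned Transformer with "learnable attention modulation", can be illustrated with a minimal sketch. The paper does not publish its exact formulation; the function below assumes one plausible reading, in which a learnable per-objective bias is added to the attention logits over a user behavior sequence. All names (`objective_conditioned_attention`, `objective_bias`) are illustrative, not from the paper.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def objective_conditioned_attention(query, keys, values, objective_bias):
    # Dot-product attention over a behavior sequence, with a per-objective
    # additive bias modulating the attention logits -- a hypothetical sketch
    # of "learnable attention modulation", not the paper's implementation.
    logits = [sum(q * k for q, k in zip(query, key)) + b
              for key, b in zip(keys, objective_bias)]
    weights = softmax(logits)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```

With a zero bias this reduces to standard attention pooling; a large bias on one position shifts the pooled representation toward that item, which is how a per-objective bias vector could steer one shared encoder toward different objectives.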
Related papers
- Refer-Agent: A Collaborative Multi-Agent System with Reasoning and Reflection for Referring Video Object Segmentation [50.22481337087162]
Referring Video Object Segmentation (RVOS) aims to segment objects in videos based on textual queries. Refer-Agent is a collaborative multi-agent system with alternating reasoning-reflection mechanisms.
arXiv Detail & Related papers (2026-02-03T14:48:12Z) - SMOG: Scalable Meta-Learning for Multi-Objective Bayesian Optimization [2.318371621318972]
We propose a scalable and modular meta-learning model based on a multi-output Gaussian process that explicitly learns correlations between objectives. SMOG supports hierarchical, parallel training: meta-task Gaussian processes are fit once and then cached, achieving linear scaling with the number of meta-tasks.
arXiv Detail & Related papers (2026-01-29T18:51:58Z) - In-Context Multi-Objective Optimization [24.738414334054358]
We present TAMO, a fully amortized, universal policy for multi-objective black-box optimization.
arXiv Detail & Related papers (2025-12-11T20:56:42Z) - Multi-Agent Pointer Transformer: Seq-to-Seq Reinforcement Learning for Multi-Vehicle Dynamic Pickup-Delivery Problems [17.3780399150554]
This paper proposes an end-to-end centralized decision-making framework based on sequence-to-sequence modeling, named the Multi-Agent Pointer Transformer (MAPT). MAPT significantly outperforms existing baseline methods in performance and offers substantial computational time advantages over classical operations research methods.
arXiv Detail & Related papers (2025-11-21T17:32:10Z) - A Unified Multi-Task Learning Framework for Generative Auto-Bidding with Validation-Aligned Optimization [51.27959658504722]
Multi-task learning offers a principled framework to train these tasks jointly through shared representations. Existing multi-task optimization strategies are primarily guided by training dynamics and often generalize poorly in volatile bidding environments. We present Validation-Aligned Multi-task Optimization (VAMO), which adaptively assigns task weights based on the alignment between per-task training gradients and a held-out validation gradient.
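The VAMO summary describes weighting tasks by how well their training gradients align with a held-out validation gradient. A minimal sketch of that idea, assuming cosine similarity as the alignment measure and a softmax over alignments (the paper's exact weighting rule may differ; `validation_aligned_weights` and `temperature` are illustrative names):

```python
import math

def cosine(u, v):
    # Cosine similarity between two gradient vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def validation_aligned_weights(task_grads, val_grad, temperature=1.0):
    # Weight each task by the softmax of its training gradient's cosine
    # alignment with the held-out validation gradient: tasks whose updates
    # point the same way as the validation objective get larger weights.
    sims = [cosine(g, val_grad) / temperature for g in task_grads]
    m = max(sims)
    exps = [math.exp(s - m) for s in sims]
    z = sum(exps)
    return [e / z for e in exps]
```

A task whose gradient opposes the validation gradient is down-weighted rather than removed, so training remains joint while being steered by held-out performance.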
arXiv Detail & Related papers (2025-10-09T03:59:51Z) - Multi-Modal Manipulation via Multi-Modal Policy Consensus [62.49978559936122]
We propose a new approach to integrate diverse sensory modalities for robotic manipulation. Our method factorizes the policy into a set of diffusion models, each specialized for a single representation. We evaluate our approach on simulated manipulation tasks in RLBench, as well as real-world tasks such as occluded object picking, in-hand spoon reorientation, and puzzle insertion.
arXiv Detail & Related papers (2025-09-27T19:43:04Z) - OneCAT: Decoder-Only Auto-Regressive Model for Unified Understanding and Generation [91.45421429922506]
OneCAT is a unified multimodal model that seamlessly integrates understanding, generation, and editing. Our framework eliminates the need for external components such as Vision Transformers (ViT) or a vision tokenizer during inference.
arXiv Detail & Related papers (2025-09-03T17:29:50Z) - Towards Agentic AI for Multimodal-Guided Video Object Segmentation [14.877182670778284]
Referring-based Video Object Segmentation is a multimodal problem that requires producing fine-grained segmentation results guided by external cues. Recent advances in vision-language foundation models open a promising direction toward training-free approaches. We propose Multi-Modal Agent, a novel agentic system designed to solve this task in a more flexible and adaptive manner.
arXiv Detail & Related papers (2025-08-14T12:11:15Z) - ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding [71.654781631463]
ReAgent-V is a novel agentic video understanding framework. It integrates efficient frame selection with real-time reward generation during inference. Extensive experiments on 12 datasets demonstrate significant gains in generalization and reasoning.
arXiv Detail & Related papers (2025-06-02T04:23:21Z) - CARIn: Constraint-Aware and Responsive Inference on Heterogeneous Devices for Single- and Multi-DNN Workloads [4.556037016746581]
This article addresses the challenges inherent in optimising the execution of deep neural networks (DNNs) on mobile devices.
We introduce CARIn, a novel framework designed for the optimised deployment of both single- and multi-DNN applications.
We observe a substantial enhancement in the fair treatment of the problem's objectives, reaching 1.92x when compared to single-model designs and up to 10.69x in contrast to the state-of-the-art OODIn framework.
arXiv Detail & Related papers (2024-09-02T09:18:11Z) - Pareto Front Approximation for Multi-Objective Session-Based Recommender Systems [0.0]
MultiTRON is an approach that adapts Pareto front approximation techniques to multi-objective session-based recommender systems. Our approach optimizes trade-offs between key metrics such as click-through and conversion rates by training on sampled preference vectors. We validate the model's performance through extensive offline and online evaluation.
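"Training on sampled preference vectors" typically means drawing a random weighting over objectives each step and optimizing the correspondingly scalarized loss, so one model covers the whole trade-off curve. A sketch under that assumption (linear scalarization and uniform simplex sampling are common choices, not necessarily MultiTRON's exact ones):

```python
import math
import random

def sample_preference_vector(num_objectives, rng=random):
    # Draw a point uniformly from the probability simplex via exponential
    # samples; each coordinate weights one objective (e.g. clicks, orders).
    draws = [-math.log(rng.random()) for _ in range(num_objectives)]
    total = sum(draws)
    return [d / total for d in draws]

def scalarized_loss(losses, preference):
    # Linear scalarization of per-objective losses under the sampled
    # preference vector; minimizing this across many sampled preferences
    # trains a single model that can serve any point on the trade-off.
    return sum(w * l for w, l in zip(preference, losses))
```

At inference time, fixing the preference vector selects one operating point (e.g. click-through versus conversion) without retraining.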
arXiv Detail & Related papers (2024-07-23T20:38:23Z) - LiMAML: Personalization of Deep Recommender Models via Meta Learning [13.69036196446634]
We introduce an innovative meta-learning solution tailored to the personalization of models for individual members and other entities.
We leverage the Model-Agnostic Meta Learning (MAML) algorithm to adapt per-task sub-networks using recent user interaction data.
Our approach has enabled the deployment of a range of highly personalized AI models across diverse LinkedIn applications.
arXiv Detail & Related papers (2024-02-23T22:06:36Z) - Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment [46.44464839353993]
We introduce Rewards-in-Context (RiC), which conditions the response of a foundation model on multiple rewards in its prompt context.
RiC only requires supervised fine-tuning of a single foundation model and supports dynamic adjustment for user preferences during inference time.
arXiv Detail & Related papers (2024-02-15T18:58:31Z) - MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation [61.45986275328629]
We propose MISSRec, a multi-modal pre-training and transfer learning framework for sequential recommendation.
On the user side, we design a Transformer-based encoder-decoder model, where the contextual encoder learns to capture the sequence-level multi-modal user interests.
On the candidate item side, we adopt a dynamic fusion module to produce user-adaptive item representation.
arXiv Detail & Related papers (2023-08-22T04:06:56Z) - Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z) - Scalable Video Object Segmentation with Identification Mechanism [125.4229430216776]
This paper explores the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS). We present two innovative approaches, Associating Objects with Transformers (AOT) and Associating Objects with Scalable Transformers (AOST).
Our approaches surpass the state-of-the-art competitors and display exceptional efficiency and scalability consistently across all six benchmarks.
arXiv Detail & Related papers (2022-03-22T03:33:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.