MTFM: A Scalable and Alignment-free Foundation Model for Industrial Recommendation in Meituan
- URL: http://arxiv.org/abs/2602.11235v2
- Date: Fri, 13 Feb 2026 09:38:24 GMT
- Title: MTFM: A Scalable and Alignment-free Foundation Model for Industrial Recommendation in Meituan
- Authors: Xin Song, Zhilin Guan, Ruidong Han, Binghao Tang, Tianwen Chen, Bing Li, Zihao Li, Han Zhang, Fei Jiang, Qing Wang, Zikang Xu, Fengyi Li, Chunzhen Jing, Lei Yu, Wei Lin
- Abstract summary: We propose MTFM (Meituan Foundation Model for Recommendation), a transformer-based framework that addresses these challenges. Instead of pre-aligning inputs, MTFM transforms cross-domain data into heterogeneous tokens, capturing multi-scenario knowledge in an alignment-free manner.
- Score: 28.96814648857816
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Industrial recommendation systems typically involve multiple scenarios, yet existing cross-domain (CDR) and multi-scenario (MSR) methods often require prohibitive resources and strict input alignment, limiting their extensibility. We propose MTFM (Meituan Foundation Model for Recommendation), a transformer-based framework that addresses these challenges. Instead of pre-aligning inputs, MTFM transforms cross-domain data into heterogeneous tokens, capturing multi-scenario knowledge in an alignment-free manner. To enhance efficiency, we first introduce a multi-scenario user-level sample aggregation that significantly improves training throughput by reducing the total number of instances. We further integrate Grouped-Query Attention and a customized Hybrid Target Attention to minimize memory usage and computational complexity. Furthermore, we implement various system-level optimizations, such as kernel fusion and the elimination of CPU-GPU blocking, to further increase both training and inference throughput. Offline and online experiments validate the effectiveness of MTFM, demonstrating that significant performance gains are achieved by scaling both model capacity and multi-scenario training data.
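The abstract attributes part of MTFM's efficiency to Grouped-Query Attention (GQA), in which several query heads share a single key/value head, shrinking KV memory roughly by the ratio of query heads to KV heads. As a rough illustration only, this is a minimal NumPy sketch of generic GQA, not the paper's implementation; all names and shapes below are hypothetical:

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Generic grouped-query attention (illustrative sketch).

    q: (n_q_heads, seq, d)   query projections, one per query head
    k, v: (n_kv_heads, seq, d)   shared key/value projections
    Each contiguous group of n_q_heads // n_kv_heads query heads
    reads from the same KV head.
    """
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    assert n_q_heads % n_kv_heads == 0, "query heads must divide evenly into groups"
    group_size = n_q_heads // n_kv_heads

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group_size                      # shared KV head for this query head
        scores = q[h] @ k[kv].T / np.sqrt(d)      # (seq, seq) scaled dot products
        scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]                  # attention-weighted values
    return out
```

With `n_kv_heads` equal to `n_q_heads` this reduces to standard multi-head attention, and with a single KV head to multi-query attention. MTFM additionally pairs GQA with a customized Hybrid Target Attention, which is not modeled here.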
Related papers
- FeDecider: An LLM-Based Framework for Federated Cross-Domain Recommendation [75.50721642765994]
Large language model (LLM)-based recommendation models have demonstrated impressive performance. We propose an LLM-based framework for federated cross-domain recommendation, FeDecider. Extensive experiments across diverse datasets validate the effectiveness of our proposed FeDecider.
arXiv Detail & Related papers (2026-02-17T21:42:28Z) - MTDrive: Multi-turn Interactive Reinforcement Learning for Autonomous Driving [12.330414519761524]
Trajectory planning is a core task in autonomous driving. MLLMs with Reinforcement Learning have shown promise in addressing "long-tail" scenarios. We present MTDrive, a multi-turn framework that enables MLLMs to iteratively refine trajectories based on environmental feedback.
arXiv Detail & Related papers (2026-01-30T12:47:55Z) - Adapformer: Adaptive Channel Management for Multivariate Time Series Forecasting [49.40321003932633]
Adapformer is an advanced Transformer-based framework that merges the benefits of CI and CD methodologies through effective channel management. Adapformer achieves superior performance over existing models, enhancing both predictive accuracy and computational efficiency.
arXiv Detail & Related papers (2025-11-18T16:24:05Z) - FindRec: Stein-Guided Entropic Flow for Multi-Modal Sequential Recommendation [57.577843653775]
We propose FindRec (Flexible unified information disentanglement for multi-modal sequential Recommendation). A Stein kernel-based Integrated Information Coordination Module (IICM) theoretically guarantees distribution consistency between multimodal features and ID streams. A cross-modal expert routing mechanism adaptively filters and combines multimodal features based on their contextual relevance.
arXiv Detail & Related papers (2025-07-07T04:09:45Z) - Task-Oriented Multimodal Token Transmission in Resource-Constrained Multiuser Networks [19.42660454288912]
We propose a task-oriented multimodal token transmission scheme for efficient multimodal information fusion and utilization. To improve the efficiency of token transmission, we design a two-stage training algorithm, including cross-modal alignment and task-oriented fine-tuning. We jointly optimize bandwidth, power allocation, and token length across users by using an alternating optimization method.
arXiv Detail & Related papers (2025-05-06T14:17:05Z) - Single Domain Generalization with Model-aware Parametric Batch-wise Mixup [22.709796153794507]
Single Domain Generalization remains a formidable challenge in the field of machine learning. We propose a novel data augmentation approach, named Model-aware Parametric Batch-wise Mixup. By exploiting inter-feature correlations, the parameterized mixup generator introduces additional versatility in combining features across a batch of instances.
arXiv Detail & Related papers (2025-02-22T03:45:18Z) - Multi-task Domain Adaptation for Computation Offloading in Edge-intelligence Networks [34.934911340540545]
This paper introduces a new approach, termed Multi-Task Domain Adaptation (MTDA). The proposed MTDA model incorporates a teacher-student architecture that allows continuous adaptation without necessitating access to the source domain data during inference. It is observed that the proposed MTDA model maintains high performance across various scenarios, demonstrating its potential for practical deployment in emerging MEC applications.
arXiv Detail & Related papers (2025-01-02T13:20:29Z) - Prioritizing Modalities: Flexible Importance Scheduling in Federated Multimodal Learning [5.421492821020181]
Federated Learning (FL) is a distributed machine learning approach that enables devices to collaboratively train models without sharing their local data.
Applying FL to real-world data presents challenges, particularly as most existing FL research focuses on unimodal data.
We propose FlexMod, a novel approach to enhance computational efficiency in MFL by adaptively allocating training resources for each modality encoder.
arXiv Detail & Related papers (2024-08-13T01:14:27Z) - Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that a straightforward width scaling of Transformer is a simpler and surprisingly more efficient approach in practice, and reaches the same performance level as SMoE.
arXiv Detail & Related papers (2024-07-01T09:45:22Z) - Retentive Decision Transformer with Adaptive Masking for Reinforcement Learning based Recommendation Systems [17.750449033873036]
Reinforcement Learning-based Recommender Systems (RLRS) have shown promise across a spectrum of applications.
Yet, they grapple with challenges, notably in crafting reward functions and harnessing large pre-existing datasets.
Recent advancements in offline RLRS provide a solution for how to address these two challenges.
arXiv Detail & Related papers (2024-03-26T12:08:58Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.