LiMAML: Personalization of Deep Recommender Models via Meta Learning
- URL: http://arxiv.org/abs/2403.00803v1
- Date: Fri, 23 Feb 2024 22:06:36 GMT
- Title: LiMAML: Personalization of Deep Recommender Models via Meta Learning
- Authors: Ruofan Wang, Prakruthi Prabhakar, Gaurav Srivastava, Tianqi Wang,
Zeinab S. Jalali, Varun Bharill, Yunbo Ouyang, Aastha Nigam, Divya
Venugopalan, Aman Gupta, Fedor Borisyuk, Sathiya Keerthi, Ajith Muralidharan
- Abstract summary: We introduce an innovative meta-learning solution tailored to the personalization of models for individual members and other entities.
We leverage the Model-Agnostic Meta Learning (MAML) algorithm to adapt per-task sub-networks using recent user interaction data.
Our approach has enabled the deployment of a range of highly personalized AI models across diverse LinkedIn applications.
- Score: 13.69036196446634
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the realm of recommender systems, the ubiquitous adoption of deep neural
networks has emerged as a dominant paradigm for modeling diverse business
objectives. As user bases continue to expand, personalization and frequent
model updates have become paramount to delivering relevant and refreshed
experiences to a diverse array of members.
In this work, we introduce an innovative meta-learning solution tailored to the
personalization of models for individual members and other entities, coupled
with the frequent updates based on the latest user interaction signals.
Specifically, we leverage the Model-Agnostic Meta Learning (MAML) algorithm to
adapt per-task sub-networks using recent user interaction data. Given the near
infeasibility of productionizing original MAML-based models in online
recommendation systems, we propose an efficient strategy to operationalize
meta-learned sub-networks in production, which involves transforming them into
fixed-sized vectors, termed meta embeddings, thereby enabling the seamless
deployment of models with hundreds of billions of parameters for online
serving. Through extensive experimentation on production data drawn from
various applications at LinkedIn, we demonstrate that the proposed solution
consistently outperforms the baseline models of those applications, including
strong baselines such as a wide-and-deep ID-based personalization approach.
Our approach has enabled the deployment of a range of highly personalized AI
models across diverse LinkedIn applications, leading to substantial
improvements in business metrics as well as refreshed experience for our
members.
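As a concrete illustration of the two ideas in the abstract, the sketch below shows a first-order, MAML-style inner loop adapting a toy one-layer sub-network on one member's recent interactions, then flattening the adapted weights into a fixed-size vector in the spirit of the paper's meta embeddings. The single-layer architecture and all names (GLOBAL_DIM, INNER_LR, INNER_STEPS, adapt_subnetwork, to_meta_embedding) are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (not the paper's implementation): first-order MAML-style
# adaptation of a toy one-layer sub-network on one member's recent
# interactions, followed by flattening into a fixed-size "meta embedding".
# GLOBAL_DIM, INNER_LR, INNER_STEPS, and the function names are assumptions.
import torch
import torch.nn.functional as F

GLOBAL_DIM = 32     # input feature dimension (assumed)
INNER_LR = 0.1      # inner-loop learning rate (assumed)
INNER_STEPS = 3     # number of adaptation steps (assumed)

def adapt_subnetwork(meta_weights: torch.Tensor,
                     features: torch.Tensor,
                     labels: torch.Tensor) -> torch.Tensor:
    """Run a few inner-loop gradient steps on one member's recent data."""
    w = meta_weights.detach().clone().requires_grad_(True)
    for _ in range(INNER_STEPS):
        logits = features @ w
        loss = F.binary_cross_entropy_with_logits(logits, labels)
        (grad,) = torch.autograd.grad(loss, w)
        # First-order update: detach so w stays a leaf for the next step.
        w = (w - INNER_LR * grad).detach().requires_grad_(True)
    return w.detach()

def to_meta_embedding(adapted_weights: torch.Tensor) -> torch.Tensor:
    """Flatten the adapted sub-network into a fixed-size vector that a
    feature store can hold and a serving model can consume."""
    return adapted_weights.reshape(-1)

# Usage: adapt from 20 recent interactions for one member.
meta_w = torch.zeros(GLOBAL_DIM)             # meta-learned initialization
feats = torch.randn(20, GLOBAL_DIM)          # recent interaction features
clicks = torch.randint(0, 2, (20,)).float()  # click / no-click labels
embedding = to_meta_embedding(adapt_subnetwork(meta_w, feats, clicks))
print(embedding.shape)                       # torch.Size([32])
```

The flattening step captures the operational point the abstract makes: online serving only needs to look up a per-member vector, rather than running MAML adaptation in the request path.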
Related papers
- Enhancing User Intent for Recommendation Systems via Large Language Models [0.0]
DUIP is a novel framework that combines LSTM networks with Large Language Models (LLMs) to dynamically capture user intent and generate personalized item recommendations.
Our findings suggest that DUIP is a promising approach for next-generation recommendation systems, with potential for further improvements in cross-modal recommendations and scalability.
arXiv Detail & Related papers (2025-01-18T20:35:03Z)
- Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains [114.76612918465948]
Large language models (LLMs) have achieved remarkable performance in recent years but are fundamentally limited by the underlying training data.
We propose a complementary approach towards self-improvement where finetuning is applied to a multiagent society of language models.
arXiv Detail & Related papers (2025-01-10T04:35:46Z)
- Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
- From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons [85.99268361356832]
We introduce a process for adapting an MLLM into a Generalist Embodied Agent (GEA).
GEA is a single unified model capable of grounding itself across varied domains through a multi-embodiment action tokenizer.
Our findings reveal the importance of training with cross-domain data and online RL for building generalist agents.
arXiv Detail & Related papers (2024-12-11T15:06:25Z)
- QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou [23.818456863262494]
Inspired by two challenges in downstream task usage, we introduce a quantitative multi-modal framework to customize specialized, trainable multi-modal information for different downstream models.
arXiv Detail & Related papers (2024-11-18T17:08:35Z)
- HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models [28.993221775758702]
Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability.
This paper marks a significant advance toward more flexible and comprehensive model merging techniques.
We train policy and value networks using offline sampling of weight vectors, which are then employed for the online optimization of merging strategies.
arXiv Detail & Related papers (2024-09-27T16:31:31Z)
- Pareto Front Approximation for Multi-Objective Session-Based Recommender Systems [0.0]
MultiTRON adapts Pareto front approximation techniques to multi-objective session-based recommender systems.
Our approach optimizes trade-offs between key metrics such as click-through and conversion rates by training on sampled preference vectors.
We validate the model's performance through extensive offline and online evaluation.
arXiv Detail & Related papers (2024-07-23T20:38:23Z)
- Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond [87.1712108247199]
Our goal is to establish a Unified paradigm for Multi-modal Personalization systems (UniMP).
We develop a generic, personalized generative framework that can handle a wide range of personalized needs.
Our methodology enhances the capabilities of foundational language models for personalized tasks.
arXiv Detail & Related papers (2024-03-15T20:21:31Z)
- Model Composition for Multimodal Large Language Models [71.5729418523411]
We propose a new paradigm of composing existing MLLMs into a single model that retains the modal understanding capabilities of each original model.
Our basic implementation, NaiveMC, demonstrates the effectiveness of this paradigm by reusing modality encoders and merging LLM parameters.
arXiv Detail & Related papers (2024-02-20T06:38:10Z)
- Scaling User Modeling: Large-scale Online User Representations for Ads Personalization in Meta [25.100342052990793]
Scaling User Modeling (SUM) is a framework widely deployed in Meta's ads ranking system.
To date, SUM has been launched to hundreds of ads ranking models in Meta, processing hundreds of billions of user requests daily.
arXiv Detail & Related papers (2023-11-16T03:47:48Z)
- Learning to Generalize Unseen Domains via Memory-based Multi-Source Meta-Learning for Person Re-Identification [59.326456778057384]
We propose the Memory-based Multi-Source Meta-Learning framework to train a generalizable model for unseen domains.
We also present a meta batch normalization layer (MetaBN) to diversify meta-test features.
Experiments demonstrate that our M$^3$L can effectively enhance the generalization ability of the model for unseen domains.
arXiv Detail & Related papers (2020-12-01T11:38:16Z)