FACE: A General Framework for Mapping Collaborative Filtering Embeddings into LLM Tokens
- URL: http://arxiv.org/abs/2510.15729v1
- Date: Fri, 17 Oct 2025 15:19:54 GMT
- Title: FACE: A General Framework for Mapping Collaborative Filtering Embeddings into LLM Tokens
- Authors: Chao Wang, Yixin Song, Jinhui Ye, Chuan Qin, Dazhong Shen, Lingfeng Liu, Xiang Wang, Yanyong Zhang
- Abstract summary: Large language models (LLMs) have been explored for integration with collaborative filtering (CF)-based recommendation systems. A key challenge is that LLMs struggle to interpret the latent, non-semantic embeddings produced by CF approaches. We propose FACE, a general interpretable framework that maps CF embeddings into pre-trained LLM tokens.
- Score: 28.971310672971914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, large language models (LLMs) have been explored for integration with collaborative filtering (CF)-based recommendation systems, which are crucial for personalizing user experiences. However, a key challenge is that LLMs struggle to interpret the latent, non-semantic embeddings produced by CF approaches, limiting recommendation effectiveness and downstream applications. To address this, we propose FACE, a general interpretable framework that maps CF embeddings into pre-trained LLM tokens. Specifically, we introduce a disentangled projection module to decompose CF embeddings into concept-specific vectors, followed by a quantized autoencoder that converts the continuous embeddings into LLM tokens (descriptors). We then design a contrastive alignment objective to ensure that the tokens align with corresponding textual signals. The model-agnostic FACE framework thus achieves semantic alignment without fine-tuning LLMs and enhances recommendation performance by leveraging their pre-trained capabilities. Empirical results on three real-world recommendation datasets demonstrate performance improvements across benchmark models, and interpretability studies confirm that the descriptors are semantically meaningful. Code is available at https://github.com/YixinRoll/FACE.
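To make the pipeline in the abstract concrete, here is a minimal PyTorch sketch of the three stages it names: a disentangled projection into concept-specific vectors, quantization against a frozen LLM token-embedding table, and a contrastive alignment loss. All module names, shapes, and hyperparameters are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a FACE-style pipeline, per the abstract:
# (1) disentangle a CF embedding into concept-specific vectors,
# (2) vector-quantize each concept vector against the frozen LLM token
#     embedding table so every concept maps to a token ("descriptor"),
# (3) align descriptors with paired text via a contrastive (InfoNCE) loss.
# Names and hyperparameters below are assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledProjection(nn.Module):
    """Split one CF embedding into K concept-specific vectors."""
    def __init__(self, cf_dim: int, token_dim: int, num_concepts: int):
        super().__init__()
        self.heads = nn.ModuleList(
            [nn.Linear(cf_dim, token_dim) for _ in range(num_concepts)]
        )

    def forward(self, cf_emb: torch.Tensor) -> torch.Tensor:
        # cf_emb: (batch, cf_dim) -> (batch, K, token_dim)
        return torch.stack([head(cf_emb) for head in self.heads], dim=1)

def quantize_to_tokens(concepts, token_table):
    """Snap each concept vector to its nearest LLM token embedding.

    concepts: (batch, K, d); token_table: (vocab, d), the frozen LLM
    input embeddings. Returns token ids (batch, K) and a straight-through
    quantized tensor so gradients still reach the projection heads.
    """
    batch, k, d = concepts.shape
    flat = concepts.reshape(batch * k, d)
    ids = torch.cdist(flat, token_table).argmin(dim=-1)      # (batch*K,)
    quantized = token_table[ids].reshape(batch, k, d)
    # Straight-through estimator: forward uses the token embedding,
    # backward passes gradients to the continuous concept vectors.
    quantized = concepts + (quantized - concepts).detach()
    return ids.reshape(batch, k), quantized

def contrastive_alignment(descriptors, text_emb, temperature=0.07):
    """InfoNCE loss pulling pooled descriptors toward paired text embeddings."""
    desc = F.normalize(descriptors.mean(dim=1), dim=-1)      # (batch, d)
    text = F.normalize(text_emb, dim=-1)                     # (batch, d)
    logits = desc @ text.t() / temperature                   # (batch, batch)
    labels = torch.arange(desc.size(0), device=desc.device)
    return F.cross_entropy(logits, labels)

# Usage with stand-in shapes: 4 users, 8 concepts, a 1000-token vocabulary.
proj = DisentangledProjection(cf_dim=64, token_dim=128, num_concepts=8)
token_table = torch.randn(1000, 128)          # stand-in for the frozen table
ids, desc = quantize_to_tokens(proj(torch.randn(4, 64)), token_table)
loss = contrastive_alignment(desc, torch.randn(4, 128))
print(ids.shape, loss.item())                 # torch.Size([4, 8]) and a scalar
```

Note that in this sketch only the projection heads receive gradients; the LLM and its token table stay frozen, consistent with the abstract's claim that no LLM fine-tuning is needed.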
Related papers
- Token-level Collaborative Alignment for LLM-based Generative Recommendation [34.778534684670895]
Token-level Collaborative Alignment for Recommendation (TCA4Rec) is a model-agnostic and plug-and-play framework. We show that TCA4Rec consistently improves recommendation performance across a broad spectrum of CF models and LLM-based recommender systems.
arXiv Detail & Related papers (2026-01-26T13:05:02Z)
- ReaLM: Residual Quantization Bridging Knowledge Graph Embeddings and Large Language Models [18.720486146234077]
Large Language Models (LLMs) have emerged as a powerful paradigm for Knowledge Graph Completion (KGC). We propose ReaLM, a novel and effective framework that bridges the gap between KG embeddings and LLM tokenization. We show that ReaLM achieves state-of-the-art performance, confirming its effectiveness in aligning structured knowledge with large-scale language models.
arXiv Detail & Related papers (2025-10-10T04:36:13Z)
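ReaLM's title names residual quantization as the bridge between KG embeddings and LLM tokenization. Below is a generic residual-quantization sketch in PyTorch; the codebook sizes and shapes are assumptions for illustration, not ReaLM's actual configuration.

```python
# Generic residual quantization: encode an embedding as a short sequence of
# codebook indices, each stage quantizing the residual left by the previous
# one. Illustrates the technique named in ReaLM's title, not its actual code.
import torch

def residual_quantize(x, codebooks):
    """x: (batch, dim); codebooks: list of (codebook_size, dim) tensors.
    Returns per-stage indices (batch, num_stages) and the reconstruction."""
    residual = x
    recon = torch.zeros_like(x)
    indices = []
    for codebook in codebooks:
        dists = torch.cdist(residual, codebook)   # (batch, codebook_size)
        idx = dists.argmin(dim=-1)                # nearest code per example
        chosen = codebook[idx]                    # (batch, dim)
        recon = recon + chosen
        residual = residual - chosen
        indices.append(idx)
    return torch.stack(indices, dim=1), recon

# Usage: three stages of 256 codes give a 3-token code per KG embedding.
codebooks = [torch.randn(256, 64) for _ in range(3)]
emb = torch.randn(8, 64)
codes, approx = residual_quantize(emb, codebooks)
print(codes.shape, (emb - approx).norm(dim=-1).mean())
```

Each added stage shrinks the reconstruction error, which is why a handful of discrete indices can stand in for a continuous embedding.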
- FuDoBa: Fusing Document and Knowledge Graph-based Representations with Bayesian Optimisation [43.56253799373878]
We introduce FuDoBa, a Bayesian optimisation-based method that integrates LLM-based embeddings with domain-specific structured knowledge. This fusion produces low-dimensional, task-relevant representations while reducing training complexity and yielding interpretable early-fusion weights. We demonstrate the effectiveness of our approach on six datasets in two domains, showing that the resulting representations perform on par with, or surpass, those produced solely by proprietary LLM-based embedding baselines.
arXiv Detail & Related papers (2025-07-09T07:49:55Z)
- When Transformers Meet Recommenders: Integrating Self-Attentive Sequential Recommendation with Fine-Tuned LLMs [0.0]
SASRecLLM is a novel framework that integrates SASRec as a collaborative encoder with an LLM fine-tuned using Low-Rank Adaptation (LoRA). Experiments on multiple datasets demonstrate that SASRecLLM achieves robust and consistent improvements over strong baselines in both cold-start and warm-start scenarios.
arXiv Detail & Related papers (2025-07-08T07:26:55Z)
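The SASRecLLM summary names LoRA as the fine-tuning mechanism. The snippet below shows the standard Hugging Face peft recipe for attaching LoRA adapters to a causal LLM; the model name and target modules are placeholder assumptions, and the coupling to the SASRec encoder is omitted.

```python
# Sketch of attaching LoRA adapters to a causal LLM, the fine-tuning recipe
# named in the SASRecLLM summary. Model name and target_modules are
# placeholder assumptions; the SASRec-to-LLM coupling itself is omitted.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora_cfg = LoraConfig(
    r=8,                                  # low-rank adapter dimension
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```

The appeal of this recipe for recommendation is that the base LLM weights stay frozen, so only a small adapter has to be trained per domain.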
- LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation [49.78419076215196]
Sequential recommendation aims to predict users' future interactions by modeling collaborative filtering (CF) signals from the historical behaviors of similar users or items. Traditional sequential recommenders rely on ID-based embeddings, which capture CF signals through high-order co-occurrence patterns. Recent advances in large language models (LLMs) have motivated text-based recommendation approaches that derive item representations from textual descriptions. We argue that an ideal embedding model should seamlessly integrate CF signals with rich semantic representations to improve both in-domain and out-of-domain recommendation performance.
arXiv Detail & Related papers (2025-06-16T13:27:06Z)
- MLLM-Guided VLM Fine-Tuning with Joint Inference for Zero-Shot Composed Image Retrieval [50.062817677022586]
Zero-Shot Composed Image Retrieval (ZS-CIR) methods typically train adapters that convert reference images into pseudo-text tokens. We propose MLLM-Guided VLM Fine-Tuning with Joint Inference (MVFT-JI) to construct two complementary training tasks using only unlabeled images.
arXiv Detail & Related papers (2025-05-26T08:56:59Z)
- Training Large Recommendation Models via Graph-Language Token Alignment [53.3142545812349]
We propose a novel framework to train Large Recommendation models via Graph-Language Token Alignment (GLTA). By aligning item and user nodes from the interaction graph with pretrained LLM tokens, GLTA effectively leverages the reasoning abilities of LLMs. Furthermore, we introduce Graph-Language Logits Matching (GLLM) to optimize token alignment for end-to-end item prediction.
arXiv Detail & Related papers (2025-02-26T02:19:10Z)
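The GLTA summary describes aligning graph nodes with pretrained LLM tokens. As a rough illustration of that idea (not GLTA's actual method), the sketch below projects node embeddings into the token-embedding space and retrieves the nearest tokens by cosine similarity; all dimensions are placeholder assumptions.

```python
# Generic graph-to-token alignment sketch: project interaction-graph node
# embeddings into the LLM's token-embedding space and read off the nearest
# pretrained tokens. Illustrative only; dimensions are placeholders.
import torch
import torch.nn.functional as F

def align_nodes_to_tokens(node_emb, projector, token_table, top_k=3):
    """node_emb: (N, graph_dim); projector maps graph_dim -> token_dim;
    token_table: (vocab, token_dim). Returns top-k token ids per node."""
    projected = F.normalize(projector(node_emb), dim=-1)
    tokens = F.normalize(token_table, dim=-1)
    sims = projected @ tokens.t()            # cosine similarity (N, vocab)
    return sims.topk(top_k, dim=-1).indices

projector = torch.nn.Linear(64, 256)         # graph dim -> token dim
node_emb = torch.randn(10, 64)               # stand-in node embeddings
token_table = torch.randn(1000, 256)         # stand-in LLM token table
print(align_nodes_to_tokens(node_emb, projector, token_table).shape)
```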
- DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System [83.34921966305804]
Large language models (LLMs) have demonstrated remarkable performance in recommender systems. We propose a novel plug-and-play alignment framework for LLMs and collaborative models. Our method is superior to existing state-of-the-art algorithms.
arXiv Detail & Related papers (2024-08-15T15:56:23Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs). We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
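The summary above describes learning pluggable virtual tokens for RAG while leaving the LLM itself untouched. Here is a minimal sketch of that pattern; the shapes and the prepend-only design are assumptions about the general technique, not the paper's exact method.

```python
# Sketch of pluggable virtual tokens: the LLM stays frozen and only a few
# new embedding vectors, prepended to the input, are trained. Shapes and
# the prepend-only design are assumptions, not the paper's exact method.
import torch
import torch.nn as nn

class VirtualTokenPrefix(nn.Module):
    def __init__(self, num_virtual: int, hidden_dim: int):
        super().__init__()
        # The only trainable parameters: num_virtual embedding vectors.
        self.prefix = nn.Parameter(torch.randn(num_virtual, hidden_dim) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq, hidden) from the frozen LLM's embedding
        # layer; prepend the learned virtual tokens to every sequence.
        batch = input_embeds.size(0)
        prefix = self.prefix.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prefix, input_embeds], dim=1)

prefix = VirtualTokenPrefix(num_virtual=1, hidden_dim=256)  # "one token"
embeds = torch.randn(4, 12, 256)      # stand-in for frozen LLM embeddings
print(prefix(embeds).shape)           # torch.Size([4, 13, 256])
```

Because only the prefix parameters train, such tokens can be swapped in and out per task, which is presumably what "pluggable" refers to.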
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
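ASTxplainer's core step, per the summary, is aligning token predictions with AST nodes. The sketch below shows one way to do that with Python's built-in ast module, mapping a token's (line, column) position to the innermost covering node; it is an illustration of the idea, not the paper's actual tooling.

```python
# Sketch of token-to-AST alignment, the core idea in the ASTxplainer
# summary: map a predicted token's (line, column) position to the innermost
# AST node whose source span covers it. Python's ast module stands in for
# the paper's actual tooling.
import ast

def innermost_node(tree: ast.AST, line: int, col: int):
    """Return the smallest AST node whose source span covers (line, col)."""
    best = None
    for node in ast.walk(tree):
        if getattr(node, "lineno", None) is None:
            continue  # nodes like ast.Load carry no position info
        start = (node.lineno, node.col_offset)
        end = (node.end_lineno, node.end_col_offset)
        if start <= (line, col) < end:
            # A later start means a more deeply nested (smaller) node.
            if best is None or start >= (best.lineno, best.col_offset):
                best = node
    return best

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)
# Suppose the model's next-token prediction lands at line 2, column 11 ('a').
node = innermost_node(tree, line=2, col=11)
print(type(node).__name__)  # -> Name
```

Aggregating prediction confidence per node type along these lines is what lets such a method report which syntactic structures a code LLM handles well or poorly.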