Token-level Collaborative Alignment for LLM-based Generative Recommendation
- URL: http://arxiv.org/abs/2601.18457v1
- Date: Mon, 26 Jan 2026 13:05:02 GMT
- Title: Token-level Collaborative Alignment for LLM-based Generative Recommendation
- Authors: Fake Lin, Binbin Hu, Zhi Zheng, Xi Zhu, Ziqi Liu, Zhiqiang Zhang, Jun Zhou, Tong Xu
- Abstract summary: Token-level Collaborative Alignment for Recommendation (TCA4Rec) is a model-agnostic and plug-and-play framework. We show that TCA4Rec consistently improves recommendation performance across a broad spectrum of CF models and LLM-based recommender systems.
- Score: 34.778534684670895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have demonstrated strong potential for generative recommendation by leveraging rich semantic knowledge. However, existing LLM-based recommender systems struggle to effectively incorporate collaborative filtering (CF) signals, due to a fundamental mismatch between item-level preference modeling in CF and token-level next-token prediction (NTP) optimization in LLMs. Prior approaches typically treat CF as contextual hints or representation bias, and resort to multi-stage training to reduce the discrepancy between the behavioral and semantic spaces, leaving CF unable to explicitly regulate LLM generation. In this work, we propose Token-level Collaborative Alignment for Recommendation (TCA4Rec), a model-agnostic and plug-and-play framework that establishes an explicit optimization-level interface between CF supervision and LLM generation. TCA4Rec consists of (i) a Collaborative Tokenizer, which projects raw item-level CF logits into token-level distributions aligned with the LLM token space, and (ii) Soft Label Alignment, which integrates these CF-informed distributions with one-hot supervision to optimize a soft NTP objective. This design preserves the generative nature of LLM training while enabling collaborative alignment with the essential user preferences captured by CF models. We highlight that TCA4Rec is compatible with arbitrary traditional CF models and generalizes across a wide range of decoder-based LLM recommender architectures. Moreover, it provides an explicit mechanism to balance behavioral alignment and semantic fluency, yielding generative recommendations that are both accurate and controllable. Extensive experiments demonstrate that TCA4Rec consistently improves recommendation performance across a broad spectrum of CF models and LLM-based recommender systems.
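The abstract describes the two components concretely enough to sketch the Soft Label Alignment step. Below is a minimal PyTorch sketch, assuming the Collaborative Tokenizer has already produced a token-level distribution `cf_token_dist`; the function name and the mixing weight `alpha` are illustrative assumptions, not the paper's notation.

```python
import torch
import torch.nn.functional as F

def soft_ntp_loss(llm_logits, target_ids, cf_token_dist, alpha=0.3):
    """Soft next-token-prediction loss mixing one-hot supervision with a
    CF-informed token distribution (hypothetical formulation).

    llm_logits:    (batch, seq, vocab) raw LLM output logits
    target_ids:    (batch, seq) ground-truth next-token ids
    cf_token_dist: (batch, seq, vocab) Collaborative Tokenizer output
    alpha:         weight on the CF signal; alpha=0 recovers standard NTP
    """
    one_hot = F.one_hot(target_ids, num_classes=llm_logits.size(-1)).float()
    # Soft labels: convex mix of hard targets and the CF-informed distribution.
    soft_target = (1.0 - alpha) * one_hot + alpha * cf_token_dist
    log_probs = F.log_softmax(llm_logits, dim=-1)
    # Cross-entropy against the soft target, averaged over batch and positions.
    return -(soft_target * log_probs).sum(dim=-1).mean()

# Tiny smoke test with random tensors (batch=2, seq=5, vocab=100).
logits = torch.randn(2, 5, 100)
targets = torch.randint(0, 100, (2, 5))
cf_dist = torch.softmax(torch.randn(2, 5, 100), dim=-1)
print(soft_ntp_loss(logits, targets, cf_dist))
```

Tuning `alpha` would be one plausible realization of the abstract's explicit mechanism for balancing behavioral alignment against semantic fluency.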
Related papers
- FACE: A General Framework for Mapping Collaborative Filtering Embeddings into LLM Tokens [28.971310672971914]
Large language models (LLMs) have been explored for integration with collaborative filtering (CF)-based recommendation systems. A key challenge is that LLMs struggle to interpret the latent, non-semantic embeddings produced by CF approaches. We propose FACE, a general interpretable framework that maps CF embeddings into pre-trained LLM tokens. A toy illustration of such an embedding-to-token mapping follows this entry.
arXiv Detail & Related papers (2025-10-17T15:19:54Z)
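The FACE summary above does not specify the mapping mechanism; one natural baseline is to project CF embeddings into the LLM's embedding space and snap each to its nearest pre-trained token. The sketch below illustrates only that baseline idea; the projection, dimensions, and nearest-neighbor rule are assumptions, not FACE's actual method.

```python
import torch
import torch.nn.functional as F

def map_cf_to_tokens(cf_emb, token_emb_table, proj):
    """Project CF item embeddings into the LLM embedding space, then snap
    each to its most cosine-similar pre-trained token.

    cf_emb:          (n_items, d_cf) item embeddings from a CF model
    token_emb_table: (vocab, d_llm) the LLM's input embedding matrix
    proj:            learned linear map from d_cf to d_llm
    """
    mapped = proj(cf_emb)  # (n_items, d_llm)
    sims = F.normalize(mapped, dim=-1) @ F.normalize(token_emb_table, dim=-1).T
    return sims.argmax(dim=-1)  # (n_items,) token ids

proj = torch.nn.Linear(64, 768, bias=False)  # hypothetical dimensions
cf_emb = torch.randn(10, 64)                 # 10 items, 64-d CF embeddings
vocab_emb = torch.randn(5000, 768)           # stand-in for an LLM vocab table
token_ids = map_cf_to_tokens(cf_emb, vocab_emb, proj)
```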
- Beyond Semantic Understanding: Preserving Collaborative Frequency Components in LLM-based Recommendation [7.014265360936046]
FreLLM4Rec is an approach designed to balance semantic and collaborative information from a spectral perspective. Experiments on four benchmark datasets demonstrate that FreLLM4Rec successfully mitigates collaborative signal attenuation. A minimal frequency-filtering sketch follows this entry.
arXiv Detail & Related papers (2025-08-14T03:33:02Z)
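As a rough illustration of the spectral perspective, the sketch below low-pass filters a user's sequence of item embeddings with an FFT, preserving the slow-varying components in which collaborative signals may concentrate. The cutoff rule and `keep_ratio` are assumptions; FreLLM4Rec's actual filtering is not described in the summary above.

```python
import torch

def lowpass_sequence(embs, keep_ratio=0.25):
    """Low-pass filter a sequence of item embeddings along the time axis.

    embs:       (seq_len, dim) embeddings of a user's interaction history
    keep_ratio: fraction of low-frequency components to retain
    """
    freq = torch.fft.rfft(embs, dim=0)             # (seq_len // 2 + 1, dim)
    cutoff = max(1, int(freq.size(0) * keep_ratio))
    freq[cutoff:] = 0                              # zero out high frequencies
    return torch.fft.irfft(freq, n=embs.size(0), dim=0)

history = torch.randn(16, 32)  # 16 interactions, 32-d item embeddings
smoothed = lowpass_sequence(history)
```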
- Federated Learning-Enabled Hybrid Language Models for Communication-Efficient Token Transmission [87.68447072141402]
Hybrid Language Models (HLMs) combine the low-latency efficiency of Small Language Models (SLMs) on edge devices with the high accuracy of Large Language Models (LLMs) on centralized servers. We propose FedHLM, a communication-efficient HLM framework that integrates uncertainty-aware inference with Federated Learning (FL). A toy uncertainty-routing sketch follows this entry.
arXiv Detail & Related papers (2025-06-30T02:56:11Z)
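A simple way to realize uncertainty-aware inference in such an edge-server setup is to resolve a token locally when the SLM's predictive entropy is low and escalate it otherwise. The sketch below shows that generic routing rule; the threshold value and vocabulary size are assumptions, not FedHLM's design.

```python
import torch

def route_token(slm_logits, entropy_threshold=2.0):
    """Keep the edge SLM's prediction when it is confident; otherwise flag
    the token for escalation to the server-side LLM.

    slm_logits: (vocab,) next-token logits from the on-device model
    Returns (token_id or None, escalate_flag).
    """
    probs = torch.softmax(slm_logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum()
    if entropy.item() > entropy_threshold:
        return None, True                 # uncertain: transmit to the server
    return int(probs.argmax()), False     # confident: resolve on device

token, escalate = route_token(torch.randn(32000))  # hypothetical vocab size
```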
- LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation [49.78419076215196]
Sequential recommendation aims to predict users' future interactions by modeling collaborative filtering (CF) signals from the historical behaviors of similar users or items. Traditional sequential recommenders rely on ID-based embeddings, which capture CF signals through high-order co-occurrence patterns. Recent advances in large language models (LLMs) have motivated text-based recommendation approaches that derive item representations from textual descriptions. We argue that an ideal embedding model should seamlessly integrate CF signals with rich semantic representations to improve both in-domain and out-of-domain recommendation performance. A minimal scoring sketch with text-derived embeddings follows this entry.
arXiv Detail & Related papers (2025-06-16T13:27:06Z)
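Once an LLM-based encoder has turned item descriptions into embeddings, sequential scoring can be as simple as pooling the history and ranking candidates by cosine similarity. The sketch below uses random tensors as stand-ins for LLM-derived embeddings and mean pooling as an assumed aggregator; LLM2Rec's actual architecture is not specified in the summary above.

```python
import torch
import torch.nn.functional as F

def score_candidates(history_emb, candidate_emb):
    """Rank candidate items for a user from text-derived item embeddings.

    history_emb:   (seq_len, dim) embeddings of the user's interacted items
    candidate_emb: (n_cand, dim) embeddings of candidate items
    """
    user_vec = F.normalize(history_emb.mean(dim=0), dim=-1)  # mean pooling
    return F.normalize(candidate_emb, dim=-1) @ user_vec     # cosine scores

# Stand-ins for embeddings an LLM encoder would produce from item text.
history = torch.randn(8, 128)
candidates = torch.randn(100, 128)
top5 = score_candidates(history, candidates).topk(5).indices
```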
- LLMInit: A Free Lunch from Large Language Models for Selective Initialization of Recommendation [34.227734210743904]
Collaborative filtering models have shown strong performance in capturing user-item interactions for recommendation systems. The emergence of large language models (LLMs) like GPT and LLaMA presents new possibilities for enhancing recommendation performance. A toy embedding-initialization sketch follows this entry.
arXiv Detail & Related papers (2025-03-03T18:41:59Z)
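One cheap way to transfer LLM knowledge at initialization time is to project text embeddings of item descriptions down to the CF model's dimension and copy them into the embedding table. The sketch below shows that idea with a random projection; which embeddings are selected and how they are projected in LLMInit are assumptions here.

```python
import torch

def init_cf_embeddings(llm_item_emb, cf_dim):
    """Initialize a CF item-embedding table from LLM-derived text embeddings
    via a random linear projection (one simple initialization variant).

    llm_item_emb: (n_items, d_llm) text embeddings of item descriptions
    cf_dim:       target CF embedding dimension (usually much smaller)
    """
    d_llm = llm_item_emb.size(1)
    proj = torch.randn(d_llm, cf_dim) / d_llm ** 0.5   # scaled random projection
    table = torch.nn.Embedding(llm_item_emb.size(0), cf_dim)
    with torch.no_grad():
        table.weight.copy_(llm_item_emb @ proj)
    return table

item_table = init_cf_embeddings(torch.randn(1000, 768), cf_dim=64)
```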
- LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative Filtering [0.07793154724386657]
We present a flexible framework designed to enhance collaborative filtering (CF) models by seamlessly integrating LLM-generated features.
Our framework injects these features into an intermediate layer of any CF model, allowing the model to reconstruct and leverage the embeddings internally.
Our framework is built for easy integration and modification, providing researchers and developers with a powerful tool for extending CF model capabilities. A toy injection-plus-reconstruction sketch follows this entry.
arXiv Detail & Related papers (2024-11-01T13:09:30Z)
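The summary above says LLM-generated features are injected into an intermediate layer and reconstructed internally. The toy model below realizes one version of that: a hidden layer scores user-item pairs while an auxiliary head is trained to reconstruct the LLM feature. All names and dimensions are illustrative, not LLM-KT's actual architecture.

```python
import torch
import torch.nn as nn

class CFWithKT(nn.Module):
    """Toy CF scorer whose hidden layer is also trained to reconstruct
    LLM-generated item features (auxiliary knowledge-transfer loss)."""

    def __init__(self, emb_dim=64, hidden=128, llm_dim=768):
        super().__init__()
        self.encoder = nn.Linear(emb_dim * 2, hidden)   # user + item -> hidden
        self.scorer = nn.Linear(hidden, 1)
        self.reconstruct = nn.Linear(hidden, llm_dim)   # hidden -> LLM feature

    def forward(self, user_emb, item_emb, llm_feat):
        h = torch.relu(self.encoder(torch.cat([user_emb, item_emb], dim=-1)))
        score = self.scorer(h).squeeze(-1)
        # Auxiliary loss pushes the hidden state to carry the LLM knowledge.
        kt_loss = nn.functional.mse_loss(self.reconstruct(h), llm_feat)
        return score, kt_loss

model = CFWithKT()
score, kt = model(torch.randn(4, 64), torch.randn(4, 64), torch.randn(4, 768))
```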
- DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System [83.34921966305804]
Large language models (LLMs) have demonstrated remarkable performance in recommender systems. We propose a novel plug-and-play alignment framework for LLMs and collaborative models. Our method is superior to existing state-of-the-art algorithms.
arXiv Detail & Related papers (2024-08-15T15:56:23Z)
- LLM4MSR: An LLM-Enhanced Paradigm for Multi-Scenario Recommendation [52.55639178180821]
Multi-scenario recommendation (MSR), which uses the data from all scenarios to simultaneously improve their recommendation performance, has attracted much attention. Existing methods tend to integrate insufficient scenario knowledge and neglect learning personalized cross-scenario preferences, leading to sub-optimal performance. We propose LLM4MSR, a large language model (LLM)-enhanced paradigm that fills these gaps.
arXiv Detail & Related papers (2024-06-18T11:59:36Z)
- Faithful Explanations of Black-box NLP Models Using LLM-generated Counterfactuals [67.64770842323966]
Causal explanations of predictions of NLP systems are essential to ensure safety and establish trust.
Existing methods often fall short of explaining model predictions effectively or efficiently.
We propose two approaches for counterfactual (CF) approximation.
arXiv Detail & Related papers (2023-10-01T07:31:04Z)
- FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs, and introduces our package FS-LLM as its main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings. A minimal sketch of the federated aggregation step follows this entry.
arXiv Detail & Related papers (2023-09-01T09:40:36Z)
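When only parameter-efficient modules (e.g., LoRA matrices) are communicated, the server-side aggregation can reduce to averaging the clients' adapter weights. The sketch below shows that generic FedAvg step; it is an assumption-level illustration, not FS-LLM's actual API.

```python
import torch

def fedavg(adapter_states):
    """Average client adapter weights key by key (generic FedAvg step).

    adapter_states: list of state dicts, one per client, with matching
                    keys and tensor shapes
    """
    return {key: torch.stack([s[key] for s in adapter_states]).mean(dim=0)
            for key in adapter_states[0]}

# Three clients each hold a tiny adapter (stand-in for LoRA A/B matrices).
clients = [{"lora_A": torch.randn(8, 768), "lora_B": torch.randn(768, 8)}
           for _ in range(3)]
global_adapter = fedavg(clients)
```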