LLM-Aligned Geographic Item Tokenization for Local-Life Recommendation
- URL: http://arxiv.org/abs/2511.14221v1
- Date: Tue, 18 Nov 2025 07:54:32 GMT
- Title: LLM-Aligned Geographic Item Tokenization for Local-Life Recommendation
- Authors: Hao Jiang, Guoquan Wang, Donglin Zhou, Sheng Yu, Yang Zeng, Wencong Zeng, Kun Gai, Guorui Zhou,
- Abstract summary: LGSID is an LLM-Hierarchical Geographic Item Tokenization Framework for Local-life Recommendation.<n>We introduce a novel G-DPO algorithm that uses pre-trained reward model to inject generalized spatial knowledge and collaborative signals into LLMs.<n>Experiments on real-world Kuaishou industry datasets show that LGSID consistently outperforms state-of-the-art discriminative and generative recommendation models.
- Score: 21.430438767288106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in Large Language Models (LLMs) have enhanced text-based recommendation by enriching traditional ID-based methods with semantic generalization capabilities. Text-based methods typically encode item textual information via prompt design and generate discrete semantic IDs through item tokenization. However, in domain-specific tasks such as local-life services, simply injecting location information into prompts fails to capture fine-grained spatial characteristics and real-world distance awareness among items. To address this, we propose LGSID, an LLM-Aligned Geographic Item Tokenization Framework for Local-life Recommendation. This framework consists of two key components: (1) RL-based Geographic LLM Alignment, and (2) Hierarchical Geographic Item Tokenization. In the RL-based alignment module, we initially train a list-wise reward model to capture real-world spatial relationships among items. We then introduce a novel G-DPO algorithm that uses pre-trained reward model to inject generalized spatial knowledge and collaborative signals into LLMs while preserving their semantic understanding. Furthermore, we propose a hierarchical geographic item tokenization strategy, where primary tokens are derived from discrete spatial and content attributes, and residual tokens are refined using the aligned LLM's geographic representation vectors. Extensive experiments on real-world Kuaishou industry datasets show that LGSID consistently outperforms state-of-the-art discriminative and generative recommendation models. Ablation studies, visualizations, and case studies further validate its effectiveness.
Related papers
- Unleashing the Native Recommendation Potential: LLM-Based Generative Recommendation via Structured Term Identifiers [51.64398574262054]
This paper introduces Term IDs (TIDs), defined as a set of semantically rich and standardized textual keywords, to serve as robust item identifiers.<n>We propose GRLM, a novel framework centered on TIDs, to convert item's metadata into standardized TIDs and utilize Integrative Instruction Fine-tuning to collaboratively optimize term internalization and sequential recommendation.
arXiv Detail & Related papers (2026-01-11T07:53:20Z) - Reasoning Over Space: Enabling Geographic Reasoning for LLM-Based Generative Next POI Recommendation [8.829656404389178]
Reasoning Over Space (ROS) is a framework that utilizes geography as a vital decision variable within the reasoning process.<n> ROS introduces a Hierarchical Spatial Semantic ID (SID) that discretizes coarse-to-fine locality and POI semantics into compositional tokens.<n>We further align the model with real world geography via spatial-guided Reinforcement Learning (RL)
arXiv Detail & Related papers (2026-01-08T03:46:03Z) - Spatial Preference Rewarding for MLLMs Spatial Understanding [92.25703021388142]
Multimodal large language models (MLLMs) have demonstrated promising spatial understanding capabilities.<n>Despite their successes, MLLMs still fall short in fine-grained spatial perception abilities.<n>We propose a Spatial Preference Rewarding(SPR) approach that enhances MLLMs' spatial capabilities.
arXiv Detail & Related papers (2025-10-16T07:16:18Z) - RecBase: Generative Foundation Model Pretraining for Zero-Shot Recommendation [78.01030342481246]
RecBase is a domain-agnostic foundational model pretrained with a recommendation-oriented objective.<n>We introduce a unified item tokenizer that encodes items into hierarchical concept identifiers.<n>Our model matches or surpasses the performance of LLM baselines up to 7B parameters in zero-shot and cross-domain recommendation tasks.
arXiv Detail & Related papers (2025-09-03T08:33:43Z) - Automated Label Placement on Maps via Large Language Models [3.7553323195283697]
We introduce a new paradigm for automatic label placement (ALP) that formulates the task as a data editing problem.<n>To support this direction, we curate MAPLE, the first known benchmarking dataset for evaluating ALP on real-world maps.<n>We evaluate four open-source LLMs on MAPLE, analyzing both overall performance and generalization across different types of landmarks.
arXiv Detail & Related papers (2025-07-29T18:00:22Z) - Beyond the Surface: Uncovering Implicit Locations with LLMs for Personalized Local News [0.2749898166276854]
This paper explores Large Language Models (LLMs) for local article classification in Taboola's "Homepage For You" system.<n>LLMs offer new possibilities while raising concerns about accuracy and explainability.<n>A scalable pipeline integrating LLM-based location classification boosted local article distribution by 27%.
arXiv Detail & Related papers (2025-02-20T15:55:52Z) - Grid-Based Projection of Spatial Data into Knowledge Graphs [1.3723120574076129]
We show how efficiently the spatial characteristics of real-world entities can be encoded within knowledge graphs.
We introduce a novel methodology for representing street networks in knowledge graphs, diverging from the conventional practice of individually capturing each street segment.
arXiv Detail & Related papers (2024-11-04T17:35:41Z) - DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation [8.422110274212503]
Weakly supervised semantic segmentation approaches typically rely on class activation maps (CAMs) for initial seed generation.
We introduce DALNet, which leverages text embeddings to enhance the comprehensive understanding and precise localization of objects across different levels of granularity.
Our approach, in particular, allows for more efficient end-to-end process as a single-stage method.
arXiv Detail & Related papers (2024-09-24T06:51:49Z) - Swarm Intelligence in Geo-Localization: A Multi-Agent Large Vision-Language Model Collaborative Framework [51.26566634946208]
We introduce smileGeo, a novel visual geo-localization framework.
By inter-agent communication, smileGeo integrates the inherent knowledge of these agents with additional retrieved information.
Results show that our approach significantly outperforms current state-of-the-art methods.
arXiv Detail & Related papers (2024-08-21T03:31:30Z) - Language Representations Can be What Recommenders Need: Findings and Potentials [57.90679739598295]
We show that item representations, when linearly mapped from advanced LM representations, yield superior recommendation performance.<n>This outcome suggests the possible homomorphism between the advanced language representation space and an effective item representation space for recommendation.<n>Our findings highlight the connection between language modeling and behavior modeling, which can inspire both natural language processing and recommender system communities.
arXiv Detail & Related papers (2024-07-07T17:05:24Z) - Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL)
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves comparable performance with SOTA as well as being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z) - Learnable Item Tokenization for Generative Recommendation [113.80559032128065]
We propose LETTER (a LEarnable Tokenizer for generaTivE Recommendation), which integrates hierarchical semantics, collaborative signals, and code assignment diversity.<n> LETTER incorporates Residual Quantized VAE for semantic regularization, a contrastive alignment loss for collaborative regularization, and a diversity loss to mitigate code assignment bias.
arXiv Detail & Related papers (2024-05-12T15:49:38Z) - Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-02-19T17:46:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.