Multi-Grained Patch Training for Efficient LLM-based Recommendation
- URL: http://arxiv.org/abs/2501.15087v2
- Date: Mon, 19 May 2025 03:34:25 GMT
- Title: Multi-Grained Patch Training for Efficient LLM-based Recommendation
- Authors: Jiayi Liao, Ruobing Xie, Sihang Li, Xiang Wang, Xingwu Sun, Zhanhui Kang, Xiangnan He
- Abstract summary: Large Language Models (LLMs) have emerged as a new paradigm for recommendation by converting interacted item history into language modeling. We propose PatchRec, a multi-grained patch training method consisting of two stages: Patch Pre-training, which familiarizes LLMs with aggregated embeddings -- patches, and Patch Fine-tuning, which enables LLMs to capture time-aware significance in interaction history.
- Score: 40.5721110129484
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have emerged as a new paradigm for recommendation by converting interacted item history into language modeling. However, constrained by the limited context length of LLMs, existing approaches have to truncate the item history in the prompt, focusing only on recent interactions and sacrificing the ability to model long-term history. To enable LLMs to model long histories, we pursue a concise embedding representation for items and sessions. In the LLM embedding space, we construct an item's embedding by aggregating its textual token embeddings; similarly, we construct a session's embedding by aggregating its item embeddings. While efficient, this approach poses two challenges: it ignores the temporal significance of user interactions, and LLMs do not natively interpret our custom embeddings. To overcome these, we propose PatchRec, a multi-grained patch training method consisting of two stages: (1) Patch Pre-training, which familiarizes LLMs with aggregated embeddings -- patches, and (2) Patch Fine-tuning, which enables LLMs to capture time-aware significance in interaction history. Extensive experiments show that PatchRec effectively models longer behavior histories with improved efficiency. This work facilitates the practical use of LLMs for modeling long behavior histories. Code is available at https://github.com/ljy0ustc/PatchRec.
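To make the construction concrete, here is a minimal sketch of the hierarchical aggregation the abstract describes, assuming mean pooling as the aggregation operator (the abstract only says "aggregating", so the pooling choice, class name, and shapes are illustrative rather than the authors' implementation):

```python
import torch
import torch.nn as nn

class PatchBuilder(nn.Module):
    """Builds item- and session-level patches in the LLM embedding space."""

    def __init__(self, token_embedding: nn.Embedding):
        super().__init__()
        self.tok_emb = token_embedding  # the LLM's own input embedding table

    def item_patch(self, title_token_ids: torch.LongTensor) -> torch.Tensor:
        # One item -> one patch: aggregate its textual token embeddings.
        return self.tok_emb(title_token_ids).mean(dim=0)        # (d_model,)

    def session_patch(self, session: list[torch.LongTensor]) -> torch.Tensor:
        # One session -> one patch: aggregate its item patches.
        return torch.stack([self.item_patch(t) for t in session]).mean(dim=0)
```

A history of n items then costs n patch positions instead of n full titles' worth of tokens, which is what lets a prompt cover a much longer behavior history within the same context window.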
Related papers
- Towards Event Extraction with Massive Types: LLM-based Collaborative Annotation and Partitioning Extraction [66.73721939417507]
We propose a collaborative annotation method based on Large Language Models (LLMs). We also propose an LLM-based Partitioning EE method called LLM-PEE. The results show that LLM-PEE outperforms the state-of-the-art methods by 5.4 in event detection and 6.1 in argument extraction.
arXiv Detail & Related papers (2025-03-04T13:53:43Z) - New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration [49.180693704510006]
Referring Expression Comprehension (REC) is a cross-modal task that evaluates the interplay of language understanding, image comprehension, and language-to-image grounding. It serves as an essential testing ground for Multimodal Large Language Models (MLLMs).
arXiv Detail & Related papers (2025-02-27T13:58:44Z) - Towards a Unified Paradigm: Integrating Recommendation Systems as a New Language in Large Models [33.02146794292383]
We introduce a new concept, "Integrating Recommendation Systems as a New Language in Large Models" (RSLLM). RSLLM uses a unique prompting method that combines ID-based item embeddings from conventional recommendation models with textual item features. It treats users' sequential behaviors as a distinct language and aligns the ID embeddings with the LLM's input space using a projector.
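As a rough illustration of the alignment step, a projector in this style might look like the following; the two-layer MLP shape, dimensions, and names are assumptions, not details from the paper:

```python
import torch.nn as nn

class IDProjector(nn.Module):
    """Maps ID-based item embeddings (dim d_rec) into the LLM input space (d_llm)."""

    def __init__(self, d_rec: int, d_llm: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_rec, d_llm),
            nn.GELU(),
            nn.Linear(d_llm, d_llm),
        )

    def forward(self, id_emb):          # (batch, d_rec)
        # Output can be spliced into the prompt's token-embedding sequence.
        return self.net(id_emb)         # (batch, d_llm)
```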
arXiv Detail & Related papers (2024-12-22T09:08:46Z) - Two are better than one: Context window extension with multi-grained self-injection [111.1376461868317]
SharedLLM is a novel approach grounded in the design philosophy of multi-grained context compression and query-aware information retrieval.
We introduce a specialized tree-style data structure to efficiently encode, store and retrieve multi-grained contextual information for text chunks.
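A minimal sketch of such a tree-style store, under the assumption that each node keeps a compressed representation of its span and retrieval descends only into query-relevant branches (the node fields and the relevance test are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class ChunkNode:
    text: str
    summary: list                                  # compressed representation
    children: list["ChunkNode"] = field(default_factory=list)

def retrieve(node: ChunkNode, is_relevant, out: list):
    # Coarse-to-fine: expand a node only when the query finds it relevant.
    if node.children and is_relevant(node):
        for child in node.children:
            retrieve(child, is_relevant, out)
    else:
        out.append(node.summary)                   # coarsest sufficient grain
```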
arXiv Detail & Related papers (2024-10-25T06:08:59Z) - Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
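A minimal sketch of that loop, assuming a HuggingFace-style causal LM that accepts `inputs_embeds` and `labels`; the mask, optimizer, and step count are illustrative choices:

```python
import torch

def tune_context(model, input_embeds, labels, tune_mask, steps=50, lr=1e-3):
    # tune_mask: 1 where a context token may be modified, 0 elsewhere
    # (broadcastable to input_embeds' shape).
    delta = torch.zeros_like(input_embeds, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        out = model(inputs_embeds=input_embeds + delta * tune_mask, labels=labels)
        out.loss.backward()           # descend on the loss (inverse of an attack)
        opt.step()
        opt.zero_grad()
    return (input_embeds + delta * tune_mask).detach()
```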
arXiv Detail & Related papers (2024-10-22T17:45:47Z) - Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models [79.70436109672599]
We derive non-vacuous generalization bounds for large language models as large as LLaMA2-70B.
Our work achieves the first non-vacuous bounds for models that are deployed in practice and generate high-quality text.
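For orientation, the standard compression-based Occam bound that this line of work builds on looks as follows; this is a textbook form, not the paper's exact result:

```latex
% With probability at least 1 - \delta over m i.i.d. samples,
\[
  R(h) \;\le\; \hat{R}(h) + \sqrt{\frac{C(h)\,\ln 2 + \ln(1/\delta)}{2m}},
\]
% where C(h) is the compressed description length of h in bits.
```

Treating individual tokens rather than whole documents as the data points makes m vastly larger, which is the intuition for why such bounds can remain non-vacuous even at 70B scale.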
arXiv Detail & Related papers (2024-07-25T16:13:58Z) - Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding [54.532578213126065]
Most document understanding methods preserve all tokens within sub-images and treat them equally.
This neglects their different informativeness and leads to a significant increase in the number of image tokens.
We propose Token-level Correlation-guided Compression, a parameter-free and plug-and-play methodology to optimize token processing.
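One parameter-free way to realize correlation-guided compression is a greedy merge of highly similar tokens; the threshold, running-mean merge, and cosine measure below are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def compress_tokens(tokens: torch.Tensor, thresh: float = 0.9) -> torch.Tensor:
    # tokens: (n, d) image-token features; order-preserving greedy merge.
    kept, counts = [tokens[0]], [1]
    for t in tokens[1:]:
        sims = torch.stack([F.cosine_similarity(t, k, dim=0) for k in kept])
        j = int(sims.argmax())
        if sims[j] > thresh:
            counts[j] += 1
            kept[j] = kept[j] + (t - kept[j]) / counts[j]   # fold into running mean
        else:
            kept.append(t)
            counts.append(1)
    return torch.stack(kept)          # (m, d) with m <= n
```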
arXiv Detail & Related papers (2024-07-19T16:11:15Z) - Beyond the Turn-Based Game: Enabling Real-Time Conversations with Duplex Models [66.24055500785657]
Traditional turn-based chat systems prevent users from verbally interacting with the system while it is generating responses.
To overcome these limitations, we adapt existing LLMs to listen to users while generating output and to provide users with instant feedback.
We build a dataset consisting of alternating time slices of queries and responses, covering the typical feedback types that arise in instantaneous interactions.
arXiv Detail & Related papers (2024-06-22T03:20:10Z) - Text-like Encoding of Collaborative Information in Large Language Models for Recommendation [58.87865271693269]
We introduce BinLLM, a novel method to seamlessly integrate collaborative information with Large Language Models for Recommendation (LLMRec).
BinLLM converts collaborative embeddings from external models into binary sequences.
BinLLM provides options to compress the binary sequence using dot-decimal notation to avoid excessively long lengths.
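The two encodings the summary mentions are easy to picture; in this sketch the sign-threshold binarization and the 8-bit grouping are assumptions:

```python
import numpy as np

def to_binary_string(emb: np.ndarray) -> str:
    # Binarize a collaborative embedding into a 0/1 text sequence.
    return "".join(str(b) for b in (emb > 0).astype(int))

def to_dot_decimal(bits: str) -> str:
    # Compress: render each 8-bit group as a decimal number, IPv4-style.
    return ".".join(str(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

bits = to_binary_string(np.random.randn(32))
print(bits)                   # e.g. "10110100011010001111000000000111"
print(to_dot_decimal(bits))   # e.g. "180.104.240.7" -- far shorter in the prompt
```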
arXiv Detail & Related papers (2024-06-05T12:45:25Z) - Hidden in Plain Sight: Exploring Chat History Tampering in Interactive Language Models [12.920884182101142]
Large Language Models (LLMs) have become prevalent in real-world applications, exhibiting impressive text generation performance.
To behave interactively, LLM-based chat systems must integrate prior chat history as context into their inputs, following a pre-defined structure.
This paper introduces a systematic methodology to inject user-supplied history into LLM conversations without any prior knowledge of the target model.
arXiv Detail & Related papers (2024-05-30T16:36:47Z) - Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation [50.19602159938368]
Large language models (LLMs) are revolutionizing conversational recommender systems.
We propose a Reindex-Then-Adapt (RTA) framework, which converts multi-token item titles into single tokens within LLMs.
Our framework demonstrates improved accuracy metrics across three different conversational recommendation datasets.
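A hedged sketch of the reindexing idea using the HuggingFace API: register one new token per item and initialize its embedding from the title's original token embeddings (the mean initialization and token naming scheme are assumptions), then adapt it on interaction data:

```python
import torch

def add_item_token(model, tokenizer, title: str) -> int:
    # Tokenize the multi-token title with the original vocabulary.
    tok_ids = tokenizer(title, add_special_tokens=False)["input_ids"]
    new_id = len(tokenizer)
    tokenizer.add_tokens([f"<item_{new_id}>"])      # hypothetical naming scheme
    model.resize_token_embeddings(len(tokenizer))
    with torch.no_grad():
        emb = model.get_input_embeddings().weight
        emb[new_id] = emb[tok_ids].mean(dim=0)      # aggregate-init, then adapt
    return new_id
```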
arXiv Detail & Related papers (2024-05-20T15:37:55Z) - Extract, Define, Canonicalize: An LLM-based Framework for Knowledge Graph Construction [12.455647753787442]
We propose a three-phase framework named Extract-Define-Canonicalize (EDC).
EDC is flexible in that it can be applied both when a pre-defined target schema is available and when it is not.
We demonstrate EDC is able to extract high-quality triplets without any parameter tuning and with significantly larger schemas compared to prior works.
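A stub of the three phases, with placeholder prompts and a generic `llm` callable (none of this is the authors' implementation; it only shows how the schema-optional canonicalization can hang together):

```python
def edc(text: str, llm, target_schema: list | None = None) -> str:
    # Phase 1 -- Extract: open-vocabulary triplets from the input text.
    triplets = llm(f"Extract (subject, relation, object) triplets:\n{text}")
    # Phase 2 -- Define: natural-language definitions of each extracted relation.
    definitions = llm(f"Define each relation used above:\n{triplets}")
    # Phase 3 -- Canonicalize: map relations onto the target schema if one
    # exists, otherwise onto a schema induced from the definitions.
    schema = target_schema or definitions
    return llm(f"Rewrite the triplets using only relations from:\n{schema}\n{triplets}")
```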
arXiv Detail & Related papers (2024-04-05T02:53:51Z) - LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking [10.671747198171136]
We propose a two-stage framework using large language models for ranking-based recommendation (LlamaRec).
In particular, we use small-scale sequential recommenders to retrieve candidates based on the user interaction history.
Across datasets, LlamaRec consistently achieves superior performance in both recommendation quality and efficiency.
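In outline, such a two-stage pipeline can be sketched as below; `retriever.top_k` and `llm.label_logits` are hypothetical interfaces standing in for the small sequential recommender and the LLM's verbalizer-style scoring:

```python
def recommend(history: list[str], retriever, llm, k: int = 20) -> list[str]:
    # Stage 1: a small sequential recommender shortlists k candidates.
    candidates = retriever.top_k(history, k)
    labels = [chr(ord("A") + i) for i in range(len(candidates))]
    prompt = (
        "User history: " + ", ".join(history) + "\n"
        + "\n".join(f"({l}) {c}" for l, c in zip(labels, candidates))
        + "\nBest next item:"
    )
    # Stage 2: rank candidates by the LLM's logits over their index labels,
    # so ranking costs one forward pass rather than k generations.
    scores = llm.label_logits(prompt, labels)
    return [c for _, c in sorted(zip(scores, candidates), reverse=True)]
```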
arXiv Detail & Related papers (2023-10-25T06:23:48Z) - LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths.
LLMs can consistently outperform the SotA when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z) - LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models [22.06402870816756]
Large language models (LLMs) have been applied in various applications due to their astonishing capabilities.
This paper presents LLMLingua, a coarse-to-fine prompt compression method that involves a budget controller to maintain semantic integrity.
We show that the proposed approach yields state-of-the-art performance and allows for up to 20x compression with little performance loss.
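As a deliberately simplified stand-in for this idea (not the LLMLingua implementation), one can hold a token budget and drop the tokens a small causal LM finds most predictable:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def compress(prompt: str, keep_ratio: float = 0.5) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits[0, :-1]                 # next-token predictions
    nll = torch.nn.functional.cross_entropy(
        logits, ids[0, 1:], reduction="none")           # per-token surprisal
    budget = int(keep_ratio * nll.numel())
    keep = nll.topk(budget).indices.sort().values       # keep surprising tokens
    return tok.decode(ids[0, 1:][keep])
```

The actual method is more careful: its budget controller allocates different compression ratios to instructions, demonstrations, and the question before the fine-grained token-level pass.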
arXiv Detail & Related papers (2023-10-09T14:10:21Z) - Revisiting Large Language Models as Zero-shot Relation Extractors [8.953462875381888]
Relation extraction (RE) consistently involves a certain degree of labeled or unlabeled data, even under the zero-shot setting.
Recent studies have shown that large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt.
This work explores LLMs as zero-shot relation extractors.
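In its simplest form, zero-shot RE by prompting needs nothing beyond a template; `llm` below is a placeholder completion function and the template wording is illustrative:

```python
def extract_relation(llm, sentence: str, head: str, tail: str,
                     relations: list[str]) -> str:
    prompt = (
        f"Sentence: {sentence}\n"
        f"What is the relation between '{head}' and '{tail}'?\n"
        f"Choose one of: {', '.join(relations)}.\n"
        "Answer:"
    )
    return llm(prompt).strip()   # no labeled or unlabeled RE data required
```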
arXiv Detail & Related papers (2023-10-08T06:17:39Z) - Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embedding performance.
Our approach involves adapting the previous prompt-based representation method for autoregressive models.
When scaling model size, we find that scaling beyond tens of billions of parameters harms performance on semantic textual similarity tasks.
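The prompt-based representation idea can be sketched as follows for an autoregressive model; the template wording is an assumption modeled on common practice, and `gpt2` merely stands in for a larger model:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

def embed(sentence: str) -> torch.Tensor:
    # Wrap the sentence so the next token would have to summarize it.
    prompt = f'This sentence : "{sentence}" means in one word: "'
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state    # (1, seq, d)
    return hidden[0, -1]                              # last-token state as embedding
```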
arXiv Detail & Related papers (2023-07-31T13:26:03Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
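A toy of structural (as opposed to element-wise) pruning on one coupled pair of layers; the importance score and keep ratio are illustrative, and real dependency groups in a transformer are larger:

```python
import torch
import torch.nn as nn

def prune_mlp(fc1: nn.Linear, fc2: nn.Linear, keep_ratio: float = 0.75):
    # Hidden unit i couples row i of fc1 with column i of fc2: remove both together.
    score = fc1.weight.norm(dim=1) * fc2.weight.norm(dim=0)
    idx = score.topk(int(keep_ratio * score.numel())).indices.sort().values
    new1 = nn.Linear(fc1.in_features, len(idx), bias=fc1.bias is not None)
    new2 = nn.Linear(len(idx), fc2.out_features, bias=fc2.bias is not None)
    with torch.no_grad():
        new1.weight.copy_(fc1.weight[idx])
        new2.weight.copy_(fc2.weight[:, idx])
        if fc1.bias is not None:
            new1.bias.copy_(fc1.bias[idx])
        if fc2.bias is not None:
            new2.bias.copy_(fc2.bias)
    return new1, new2
```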
arXiv Detail & Related papers (2023-05-19T12:10:53Z)