Zero-Shot Recommendation as Language Modeling
- URL: http://arxiv.org/abs/2112.04184v1
- Date: Wed, 8 Dec 2021 09:16:03 GMT
- Title: Zero-Shot Recommendation as Language Modeling
- Authors: Damien Sileo, Wout Vossen, Robbe Raymaekers
- Abstract summary: We propose a framework for recommendation with off-the-shelf pretrained language models (LM).
We construct a textual prompt to estimate the affinity between $u$ and $m$ with LM likelihood.
We motivate our idea with a corpus analysis, evaluate several prompt structures, and we compare LM-based recommendation with standard matrix factorization trained on different data regimes.
- Score: 1.0312968200748118
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recommendation is the task of ranking items (e.g. movies or products)
according to individual user needs. Current systems rely on collaborative
filtering and content-based techniques, which both require structured training
data. We propose a framework for recommendation with off-the-shelf pretrained
language models (LM) that use only unstructured text corpora as training data.
If a user $u$ liked \textit{Matrix} and \textit{Inception}, we construct a
textual prompt, e.g. \textit{"Movies like Matrix, Inception, ${<}m{>}$"} to
estimate the affinity between $u$ and $m$ with LM likelihood. We motivate our
idea with a corpus analysis, evaluate several prompt structures, and we compare
LM-based recommendation with standard matrix factorization trained on different
data regimes. The code for our experiments is publicly available
(https://colab.research.google.com/drive/1f1mlZ-FGaLGdo5rPzxf3vemKllbh2esT?usp=sharing).
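The prompt-likelihood idea from the abstract can be sketched as follows. Here `log_likelihood` is a placeholder for any off-the-shelf LM scorer (e.g. the summed token log-probabilities of the text); the function names and prompt template are illustrative assumptions, not the authors' exact implementation:

```python
def rank_candidates(liked, candidates, log_likelihood):
    """Rank candidate items by the LM likelihood of a completion prompt.

    liked: titles the user u already liked.
    candidates: candidate titles m to score.
    log_likelihood: callable mapping a text string to its LM log-likelihood,
        used as a proxy for the affinity between u and m.
    """
    # Build a prompt like "Movies like Matrix, Inception, <m>"
    prompt = "Movies like " + ", ".join(liked) + ", "
    scored = [(m, log_likelihood(prompt + m)) for m in candidates]
    # Higher likelihood of the completed prompt = higher estimated affinity
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice `log_likelihood` would wrap a pretrained LM (e.g. GPT-2 via the `transformers` library), scoring each candidate completion without any recommendation-specific training.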
Related papers
- PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval [76.50690734636477]
We propose PromptReps, which combines the advantages of both categories: no need for training and the ability to retrieve from the whole corpus.
The retrieval system harnesses both dense text embedding and sparse bag-of-words representations.
arXiv Detail & Related papers (2024-04-29T04:51:30Z)
- LAMPAT: Low-Rank Adaption for Multilingual Paraphrasing Using Adversarial Training [19.173992333194683]
Paraphrases are texts that convey the same meaning while using different words or sentence structures.
Previous studies have leveraged the knowledge from the machine translation field, forming a paraphrase through zero-shot machine translation in the same language.
We propose the first unsupervised multilingual paraphrasing model, LAMPAT, for which a monolingual dataset is sufficient to generate human-like and diverse sentences.
arXiv Detail & Related papers (2024-01-09T04:19:16Z)
- Tuna: Instruction Tuning using Feedback from Large Language Models [74.04950416204551]
We propose finetuning an instruction-tuned large language model using our novel \textit{probabilistic ranking} and \textit{contextual ranking} approaches.
Probabilistic ranking enables the instruction-tuned model to inherit the relative rankings of high-quality and low-quality responses from the teacher LLM.
On the other hand, learning with contextual ranking allows the model to refine its own response distribution using the contextual understanding ability of stronger LLMs.
arXiv Detail & Related papers (2023-10-20T09:55:06Z)
- Retrieval-Pretrained Transformer: Long-range Language Modeling with Self-retrieval [51.437420003471615]
We propose the Retrieval-Pretrained Transformer (RPT), an architecture and training procedure for jointly training a retrieval-augmented LM from scratch.
RPT improves retrieval quality and subsequently perplexity across the board compared to strong baselines.
arXiv Detail & Related papers (2023-06-23T10:18:02Z)
- Generating EDU Extracts for Plan-Guided Summary Re-Ranking [77.7752504102925]
Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach.
We design a novel method to generate candidates for re-ranking that addresses these issues.
We show large relevance improvements over previously published methods on widely used single document news article corpora.
arXiv Detail & Related papers (2023-05-28T17:22:04Z)
- Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study [44.39031420687302]
Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks.
We try to understand this by designing a benchmark to evaluate the structural understanding capabilities of LLMs.
We propose \textit{self-augmentation} for effective structural prompting, such as critical value / range identification.
arXiv Detail & Related papers (2023-05-22T14:23:46Z)
- TextBox 2.0: A Text Generation Library with Pre-trained Language Models [72.49946755856935]
This paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs).
To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets.
We also implement $4$ efficient training strategies and provide $4$ generation objectives for pre-training new PLMs from scratch.
arXiv Detail & Related papers (2022-12-26T03:50:36Z)
- You can't pick your neighbors, or can you? When and how to rely on retrieval in the $k$NN-LM [65.74934004876914]
Retrieval-enhanced language models (LMs) condition their predictions on text retrieved from large external datastores.
One such approach, the $k$NN-LM, interpolates any existing LM's predictions with the output of a $k$-nearest neighbors model.
We empirically measure the effectiveness of our approach on two English language modeling datasets.
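The standard $k$NN-LM interpolation mentioned above can be sketched in a few lines. The function name and variables are illustrative; `lam` is the interpolation weight on the retrieval component:

```python
import numpy as np

def knn_lm_interpolate(p_lm, p_knn, lam):
    """Interpolate a base LM's next-token distribution with a kNN distribution.

    p_lm, p_knn: next-token probability vectors over the same vocabulary.
    lam: weight on the kNN component; lam = 0 recovers the base LM.
    """
    p_lm = np.asarray(p_lm, dtype=float)
    p_knn = np.asarray(p_knn, dtype=float)
    # Convex combination of the two distributions, still a valid distribution
    return lam * p_knn + (1.0 - lam) * p_lm
```

Since both inputs are probability distributions and the combination is convex, the output sums to one without renormalization.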
arXiv Detail & Related papers (2022-10-28T02:57:40Z)
- A Spectral Approach to Item Response Theory [6.5268245109828005]
We propose a \emph{new} item estimation algorithm for the Rasch model.
The core of our algorithm is the computation of the stationary distribution of a Markov chain defined on an item-item graph.
Experiments on synthetic and real-life datasets show that our algorithm is scalable, accurate, and competitive with the most commonly used methods in the literature.
arXiv Detail & Related papers (2022-10-09T18:57:08Z)
- CUE Vectors: Modular Training of Language Models Conditioned on Diverse Contextual Signals [11.310756148007753]
We propose a framework to modularize the training of neural language models that use diverse forms of sentence-external context (including metadata).
Our approach, contextual universal embeddings (CUE), trains LMs on one set of context, such as date and author, and adapts to novel metadata types, such as article title, or previous sentence.
We validate the CUE framework on a NYTimes text corpus with multiple metadata types, for which the LM perplexity can be lowered from 36.6 to 27.4 by conditioning on context.
arXiv Detail & Related papers (2022-03-16T17:37:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.