Zero-Shot Recommendation as Language Modeling
- URL: http://arxiv.org/abs/2112.04184v1
- Date: Wed, 8 Dec 2021 09:16:03 GMT
- Title: Zero-Shot Recommendation as Language Modeling
- Authors: Damien Sileo, Wout Vossen, Robbe Raymaekers
- Abstract summary: We propose a framework for recommendation with off-the-shelf pretrained language models (LM).
We construct a textual prompt to estimate the affinity between $u$ and $m$ with LM likelihood.
We motivate our idea with a corpus analysis, evaluate several prompt structures, and we compare LM-based recommendation with standard matrix factorization trained on different data regimes.
- Score: 1.0312968200748118
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recommendation is the task of ranking items (e.g. movies or products)
according to individual user needs. Current systems rely on collaborative
filtering and content-based techniques, which both require structured training
data. We propose a framework for recommendation with off-the-shelf pretrained
language models (LM) that use only unstructured text corpora as training data.
If a user $u$ liked \textit{Matrix} and \textit{Inception}, we construct a
textual prompt, e.g. \textit{"Movies like Matrix, Inception, ${<}m{>}$"} to
estimate the affinity between $u$ and $m$ with LM likelihood. We motivate our
idea with a corpus analysis, evaluate several prompt structures, and we compare
LM-based recommendation with standard matrix factorization trained on different
data regimes. The code for our experiments is publicly available
(https://colab.research.google.com/drive/1f1mlZ-FGaLGdo5rPzxf3vemKllbh2esT?usp=sharing).
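The prompt-likelihood idea from the abstract can be sketched as follows. Here `log_likelihood` is a placeholder for any off-the-shelf LM scorer (e.g. the summed token log-probabilities of the text); the function names and prompt template are illustrative assumptions, not the authors' exact implementation:

```python
def rank_candidates(liked, candidates, log_likelihood):
    """Rank candidate items by the LM likelihood of a completion prompt.

    liked: titles the user u already liked.
    candidates: candidate titles m to score.
    log_likelihood: callable mapping a text string to its LM log-likelihood,
        used as a proxy for the affinity between u and m.
    """
    # Build a prompt like "Movies like Matrix, Inception, <m>"
    prompt = "Movies like " + ", ".join(liked) + ", "
    scored = [(m, log_likelihood(prompt + m)) for m in candidates]
    # Higher likelihood of the completed prompt = higher estimated affinity
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In practice `log_likelihood` would wrap a pretrained LM (e.g. GPT-2 via the `transformers` library), scoring each candidate completion without any recommendation-specific training.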
Related papers
- PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval [76.50690734636477]
We propose PromptReps, which combines the advantages of both categories: no need for training and the ability to retrieve from the whole corpus.
The retrieval system harnesses both dense text embedding and sparse bag-of-words representations.
arXiv Detail & Related papers (2024-04-29T04:51:30Z)
- LAMPAT: Low-Rank Adaption for Multilingual Paraphrasing Using Adversarial Training [19.173992333194683]
Paraphrases are texts that convey the same meaning while using different words or sentence structures.
Previous studies have leveraged the knowledge from the machine translation field, forming a paraphrase through zero-shot machine translation in the same language.
We propose the first unsupervised multilingual paraphrasing model, LAMPAT, for which a monolingual dataset is sufficient to generate human-like and diverse sentences.
arXiv Detail & Related papers (2024-01-09T04:19:16Z)
- Tuna: Instruction Tuning using Feedback from Large Language Models [74.04950416204551]
We propose finetuning an instruction-tuned large language model using our novel \textit{probabilistic ranking} and \textit{contextual ranking} approaches.
Probabilistic ranking enables the instruction-tuned model to inherit the relative rankings of high-quality and low-quality responses from the teacher LLM.
On the other hand, learning with contextual ranking allows the model to refine its own response distribution using the contextual understanding ability of stronger LLMs.
arXiv Detail & Related papers (2023-10-20T09:55:06Z)
- Retrieval-Pretrained Transformer: Long-range Language Modeling with Self-retrieval [51.437420003471615]
We propose the Retrieval-Pretrained Transformer (RPT), an architecture and training procedure for jointly training a retrieval-augmented LM from scratch.
RPT improves retrieval quality and subsequently perplexity across the board compared to strong baselines.
arXiv Detail & Related papers (2023-06-23T10:18:02Z)
- Generating EDU Extracts for Plan-Guided Summary Re-Ranking [77.7752504102925]
Two-step approaches, in which summary candidates are generated-then-reranked to return a single summary, can improve ROUGE scores over the standard single-step approach.
We design a novel method to generate candidates for re-ranking that addresses these issues.
We show large relevance improvements over previously published methods on widely used single document news article corpora.
arXiv Detail & Related papers (2023-05-28T17:22:04Z)
- Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study [44.39031420687302]
Large language models (LLMs) are becoming attractive as few-shot reasoners to solve Natural Language (NL)-related tasks.
We try to understand this by designing a benchmark to evaluate the structural understanding capabilities of LLMs.
We propose \textit{self-augmentation} for effective structural prompting, such as critical value / range identification.
arXiv Detail & Related papers (2023-05-22T14:23:46Z)
- TextBox 2.0: A Text Generation Library with Pre-trained Language Models [72.49946755856935]
This paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs).
To be comprehensive, our library covers $13$ common text generation tasks and their corresponding $83$ datasets.
We also implement $4$ efficient training strategies and provide $4$ generation objectives for pre-training new PLMs from scratch.
arXiv Detail & Related papers (2022-12-26T03:50:36Z)
- You can't pick your neighbors, or can you? When and how to rely on retrieval in the $k$NN-LM [65.74934004876914]
Retrieval-enhanced language models (LMs) condition their predictions on text retrieved from large external datastores.
One such approach, the $k$NN-LM, interpolates any existing LM's predictions with the output of a $k$-nearest neighbors model.
We empirically measure the effectiveness of our approach on two English language modeling datasets.
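The standard $k$NN-LM interpolation mentioned above can be sketched in a few lines. The function name and variables are illustrative; `lam` is the interpolation weight on the retrieval component:

```python
import numpy as np

def knn_lm_interpolate(p_lm, p_knn, lam):
    """Interpolate a base LM's next-token distribution with a kNN distribution.

    p_lm, p_knn: next-token probability vectors over the same vocabulary.
    lam: weight on the kNN component; lam = 0 recovers the base LM.
    """
    p_lm = np.asarray(p_lm, dtype=float)
    p_knn = np.asarray(p_knn, dtype=float)
    # Convex combination of the two distributions, still a valid distribution
    return lam * p_knn + (1.0 - lam) * p_lm
```

Since both inputs are probability distributions and the combination is convex, the output sums to one without renormalization.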
arXiv Detail & Related papers (2022-10-28T02:57:40Z)
- A Spectral Approach to Item Response Theory [6.5268245109828005]
We propose a \emph{new} item estimation algorithm for the Rasch model.
The core of our algorithm is the computation of the stationary distribution of a Markov chain defined on an item-item graph.
Experiments on synthetic and real-life datasets show that our algorithm is scalable, accurate, and competitive with the most commonly used methods in the literature.
arXiv Detail & Related papers (2022-10-09T18:57:08Z)
- CUE Vectors: Modular Training of Language Models Conditioned on Diverse Contextual Signals [11.310756148007753]
We propose a framework to modularize the training of neural language models that use diverse forms of sentence-external context (including metadata).
Our approach, contextual universal embeddings (CUE), trains LMs on one set of context, such as date and author, and adapts to novel metadata types, such as article title, or previous sentence.
We validate the CUE framework on a NYTimes text corpus with multiple metadata types, for which the LM perplexity can be lowered from 36.6 to 27.4 by conditioning on context.
arXiv Detail & Related papers (2022-03-16T17:37:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.