Related papers: Revealing Potential Biases in LLM-Based Recommender Systems in the Cold Start Setting

Revealing Potential Biases in LLM-Based Recommender Systems in the Cold Start Setting

URL: http://arxiv.org/abs/2508.20401v2
Date: Fri, 05 Sep 2025 18:12:25 GMT
Title: Revealing Potential Biases in LLM-Based Recommender Systems in the Cold Start Setting
Authors: Alexandre Andre, Gauthier Roy, Eva Dyer, Kai Wang,
Abstract summary: Large Language Models (LLMs) are increasingly used for recommendation tasks due to their general-purpose capabilities.<n>We introduce a benchmark specifically designed to evaluate fairness in zero-context recommendation.<n>Our modular pipeline supports recommendation domains and sensitive attributes, enabling systematic and flexible audits of any open-source LLM.
Score: 41.964130989754516
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) are increasingly used for recommendation tasks due to their general-purpose capabilities. While LLMs perform well in rich-context settings, their behavior in cold-start scenarios, where only limited signals such as age, gender, or language are available, raises fairness concerns because they may rely on societal biases encoded during pretraining. We introduce a benchmark specifically designed to evaluate fairness in zero-context recommendation. Our modular pipeline supports configurable recommendation domains and sensitive attributes, enabling systematic and flexible audits of any open-source LLM. Through evaluations of state-of-the-art models (Gemma 3 and Llama 3.2), we uncover consistent biases across recommendation domains (music, movies, and colleges) including gendered and cultural stereotypes. We also reveal a non-linear relationship between model size and fairness, highlighting the need for nuanced analysis.

Related papers

Evaluating Position Bias in Large Language Model Recommendations [3.430780143519032]
Large Language Models (LLMs) are being increasingly explored as general-purpose tools for recommendation tasks.<n>We show that LLM-based recommendation models suffer from position bias, where the order of candidate items in a prompt can disproportionately influence the recommendations produced by LLMs.<n>We introduce a new prompting strategy to mitigate the position bias of LLM recommendation models called Ranking via Iterative SElection.
arXiv Detail & Related papers (2025-08-04T03:30:26Z)
LLM2Rec: Large Language Models Are Powerful Embedding Models for Sequential Recommendation [49.78419076215196]
Sequential recommendation aims to predict users' future interactions by modeling collaborative filtering (CF) signals from historical behaviors of similar users or items.<n>Traditional sequential recommenders rely on ID-based embeddings, which capture CF signals through high-order co-occurrence patterns.<n>Recent advances in large language models (LLMs) have motivated text-based recommendation approaches that derive item representations from textual descriptions.<n>We argue that an ideal embedding model should seamlessly integrate CF signals with rich semantic representations to improve both in-domain and out-of-domain recommendation performance.
arXiv Detail & Related papers (2025-06-16T13:27:06Z)
LLM is Knowledge Graph Reasoner: LLM's Intuition-aware Knowledge Graph Reasoning for Cold-start Sequential Recommendation [47.34949656215159]
Large Language Models (LLMs) can be considered databases with a wealth of knowledge learned from the web data.<n>We propose a LLM's Intuition-aware Knowledge graph Reasoning model (LIKR)<n>Our model outperforms state-of-the-art recommendation methods in cold-start sequential recommendation scenarios.
arXiv Detail & Related papers (2024-12-17T01:52:15Z)
A Normative Framework for Benchmarking Consumer Fairness in Large Language Model Recommender System [9.470545149911072]
This paper proposes a normative framework to benchmark consumer fairness in LLM-powered recommender systems. We argue that this gap can lead to arbitrary conclusions about fairness. Experiments on the MovieLens dataset on consumer fairness reveal fairness deviations in age-based recommendations.
arXiv Detail & Related papers (2024-05-03T16:25:27Z)
Tapping the Potential of Large Language Models as Recommender Systems: A Comprehensive Framework and Empirical Analysis [91.5632751731927]
Large Language Models such as ChatGPT have showcased remarkable abilities in solving general tasks.<n>We propose a general framework for utilizing LLMs in recommendation tasks, focusing on the capabilities of LLMs as recommenders.<n>We analyze the impact of public availability, tuning strategies, model architecture, parameter scale, and context length on recommendation results.
arXiv Detail & Related papers (2024-01-10T08:28:56Z)
How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
Large language models (LLM) have shown impressive general intelligence and human-like capabilities. We conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
arXiv Detail & Related papers (2023-06-09T11:31:50Z)
A Survey on Large Language Models for Recommendation [77.91673633328148]
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) This survey presents a taxonomy that categorizes these models into two major paradigms, respectively Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec)
arXiv Detail & Related papers (2023-05-31T13:51:26Z)
Is ChatGPT Fair for Recommendation? Evaluating Fairness in Large Language Model Recommendation [52.62492168507781]
We propose a novel benchmark called Fairness of Recommendation via LLM (FaiRLLM) This benchmark comprises carefully crafted metrics and a dataset that accounts for eight sensitive attributes. By utilizing our FaiRLLM benchmark, we conducted an evaluation of ChatGPT and discovered that it still exhibits unfairness to some sensitive attributes when generating recommendations.
arXiv Detail & Related papers (2023-05-12T16:54:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.