Related papers: Beyond Single Labels: Improving Conversational Recommendation through LLM-Powered Data Augmentation

URL: http://arxiv.org/abs/2508.05657v1
Date: Wed, 30 Jul 2025 08:20:54 GMT
Title: Beyond Single Labels: Improving Conversational Recommendation through LLM-Powered Data Augmentation
Authors: Haozhe Xu, Xiaohua Wang, Changze Lv, Xiaoqing Zheng
Abstract summary: Conversational recommender systems (CRSs) enhance recommendation quality by engaging users in multi-turn dialogues. CRSs often face the false negative issue, where items that a user might like are incorrectly labeled as negative during training, leading to suboptimal recommendations. We propose a novel data augmentation framework that first leverages an LLM-based semantic retriever to identify diverse and semantically relevant items.
Score: 18.01518720663732
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Conversational recommender systems (CRSs) enhance recommendation quality by engaging users in multi-turn dialogues, capturing nuanced preferences through natural language interactions. However, these systems often face the false negative issue, where items that a user might like are incorrectly labeled as negative during training, leading to suboptimal recommendations. Expanding the label set through data augmentation presents an intuitive solution but faces the challenge of balancing two key aspects: ensuring semantic relevance and preserving the collaborative information inherent in CRS datasets. To address these issues, we propose a novel data augmentation framework that first leverages an LLM-based semantic retriever to identify diverse and semantically relevant items, which are then filtered by a relevance scorer to remove noisy candidates. Building on this, we introduce a two-stage training strategy balancing semantic relevance and collaborative information. Extensive experiments on two benchmark datasets and user simulators demonstrate significant and consistent performance improvements across various recommenders, highlighting the effectiveness of our approach in advancing CRS performance.
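
The retrieve-then-filter idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy vectors stand in for LLM-derived embeddings, cosine similarity stands in for both the semantic retriever and the relevance scorer, and the names `augment_labels`, `top_k`, and `min_score` are hypothetical. The two-stage training strategy is not covered here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment_labels(dialogue_vec, item_vecs, gold_item, top_k=3, min_score=0.8):
    """Return the gold label plus retrieved items that pass a relevance filter.

    Hypothetical helper: similarity to the dialogue vector is an assumed
    stand-in for the paper's LLM-based retriever and relevance scorer.
    """
    # Step 1 (retriever stand-in): rank every non-gold item by similarity.
    ranked = sorted(
        (
            (cosine(dialogue_vec, vec), item)
            for item, vec in item_vecs.items()
            if item != gold_item
        ),
        reverse=True,
    )
    candidates = ranked[:top_k]
    # Step 2 (scorer stand-in): drop candidates below the relevance threshold.
    extra = [item for score, item in candidates if score >= min_score]
    return [gold_item] + extra


# Toy item vectors, invented purely for illustration.
ITEM_VECS = {
    "Alien": [0.9, 0.1, 0.0],
    "Aliens": [0.8, 0.2, 0.1],
    "Blade Runner": [0.7, 0.1, 0.6],
    "Notting Hill": [0.0, 0.9, 0.1],
}
dialogue = [1.0, 0.1, 0.0]
labels = augment_labels(dialogue, ITEM_VECS, "Alien")
print(labels)
```

In this toy setup the single gold label is expanded with the one candidate that clears the threshold, while loosely related and unrelated items are left out of the augmented label set.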

Related papers

Multi-agents based User Values Mining for Recommendation [52.26100802380767]
We propose a zero-shot multi-LLM collaborative framework for effective and accurate user value extraction. We apply text summarization techniques to condense item content while preserving essential meaning. To mitigate hallucinations, we introduce two specialized agent roles: evaluators and supervisors.
arXiv Detail & Related papers (2025-05-02T04:01:31Z)
Search-Based Interaction For Conversation Recommendation via Generative Reward Model Based Simulated User [117.82681846559909]
Conversational recommendation systems (CRSs) use multi-turn interaction to capture user preferences and provide personalized recommendations. We propose a generative reward model based simulated user, named GRSU, for automatic interaction with CRSs.
arXiv Detail & Related papers (2025-04-29T06:37:30Z)
From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System [49.57258257916805]
Large Language Models (LLMs) demonstrate strong zero-shot recommendation capabilities. Practical applications often favor smaller, internally managed recommender models due to scalability, interpretability, and data privacy constraints. We propose an active data augmentation framework that synthesizes conversational training data by leveraging black-box LLMs guided by active learning techniques.
arXiv Detail & Related papers (2025-04-21T23:05:47Z)
AdaptRec: A Self-Adaptive Framework for Sequential Recommendations with Large Language Models [10.52052172996229]
AdaptRec is a self-adaptive framework that leverages Large Language Models for sequential recommendations by incorporating explicit collaborative signals. We develop a User-Contextualized Recommendation Prompt that translates their behavior sequences into natural language, explicitly integrating this information into the recommendation process. Experiments demonstrate AdaptRec's superior performance, with significant improvements in HitRatio@1 scores of 7.13%, 18.16%, and 10.41% across real-world datasets.
arXiv Detail & Related papers (2025-04-06T00:30:50Z)
Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation [17.18176550968383]
We propose a novel approach named Semantic Retrieval Augmented Contrastive Learning (SRA-CL), which leverages semantic information to improve the reliability of contrastive samples. SRA-CL comprises two main components: (1) Cross-Sequence Contrastive Learning via User Semantic Retrieval, which utilizes large language models (LLMs) to understand diverse user preferences and retrieve semantically similar users to form reliable positive samples through a learnable sample method; and (2) Intra-Sequence Contrastive Learning via Item Semantic Retrieval, which employs LLMs to comprehend items and retrieve similar items to perform semantic-based item substitution.
arXiv Detail & Related papers (2025-03-06T07:25:19Z)
A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 unique instruction-following prompts. With our synthetic prompts, we use two preference dataset curation methods: rejection sampling (RS) and Monte Carlo Tree Search (MCTS). Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements. High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
arXiv Detail & Related papers (2024-12-18T15:38:39Z)
C2-CRS: Coarse-to-Fine Contrastive Learning for Conversational Recommender System [47.18484863699936]
We propose a novel contrastive learning framework to improve data semantic fusion for conversational recommender systems. In our approach, we first extract and represent multi-grained semantic units from different data signals, and then align the associated multi-type semantic units in a coarse-to-fine way. Experiments on two public CRS datasets have demonstrated the effectiveness of our approach in both recommendation and conversation tasks.
arXiv Detail & Related papers (2022-01-04T11:39:41Z)
Leveraging Historical Interaction Data for Improving Conversational Recommender System [105.90963882850265]
We propose a novel pre-training approach to integrate item- and attribute-based preference sequences. Experimental results on two real-world datasets have demonstrated the effectiveness of our approach.
arXiv Detail & Related papers (2020-08-19T03:43:50Z)
Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion [77.21442487537139]
Conversational recommender systems (CRS) aim to recommend high-quality items to users through interactive conversations. First, the conversation data itself lacks sufficient contextual information for accurately understanding users' preferences. Second, there is a semantic gap between natural language expression and item-level user preference.
arXiv Detail & Related papers (2020-07-08T11:14:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.