Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation
- URL: http://arxiv.org/abs/2301.11489v3
- Date: Sat, 18 Nov 2023 04:26:35 GMT
- Title: Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation
- Authors: Megan Leszczynski, Shu Zhang, Ravi Ganti, Krisztian Balog, Filip Radlinski, Fernando Pereira, Arun Tejasvi Chaganty
- Abstract summary: We present TalkTheWalk, which generates realistic, high-quality conversational data by leveraging domain expertise encoded in widely available curated item collections.
We generate over one million diverse playlist curation conversations in the music domain, nearly matching the quality of an existing but small human-collected dataset.
- Score: 62.019437228000776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recommender systems are ubiquitous yet often difficult for users to control
and to adjust when recommendation quality is poor. This has motivated conversational
recommender systems (CRSs), with control provided through natural language
feedback. However, as with most application domains, building robust CRSs
requires training data that reflects system usage, here
conversations with user utterances paired with items that cover a wide range of
preferences. This has proved challenging to collect scalably using conventional
methods. We address the question of whether it can be generated synthetically,
building on recent advances in natural language. We evaluate in the setting of
item set recommendation, noting the increasing attention to this task motivated
by use cases like music, news, and recipe recommendation. We present
TalkTheWalk, which synthesizes realistic high-quality conversational data by
leveraging domain expertise encoded in widely available curated item
collections, generating a sequence of hypothetical yet plausible item sets,
then using a language model to produce corresponding user utterances. We
generate over one million diverse playlist curation conversations in the music
domain, and show these contain consistent utterances with relevant item sets
nearly matching the quality of an existing but small human-collected dataset
for this task. We demonstrate the utility of the generated synthetic dataset on
a conversational item retrieval task and show that it improves over both
unsupervised baselines and systems trained on a real dataset.
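
The two-stage idea in the abstract (first produce a sequence of plausible item sets from curated collections, then produce an utterance for each step) can be illustrated with a toy sketch. This is not the authors' implementation: the playlist data, the `synthesize_conversation` function, and the template-based `template_utterance` stand-in for a language model are all hypothetical and chosen only to show the shape of the pipeline.

```python
import random
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy data: curated item collections (playlists) keyed by title.
PLAYLISTS: Dict[str, Set[str]] = {
    "mellow jazz": {"track_a", "track_b", "track_c"},
    "late night piano": {"track_c", "track_d", "track_e"},
    "upbeat funk": {"track_a", "track_f", "track_g"},
}


def template_utterance(title: str) -> str:
    """Assumed stand-in for a language model: fills a fixed template."""
    return f"Can you add some {title}?"


def synthesize_conversation(
    playlists: Dict[str, Set[str]],
    num_turns: int,
    rng: random.Random,
    utterance_fn: Callable[[str], str] = template_utterance,
) -> List[Tuple[str, List[str]]]:
    """Build a conversation as a list of (utterance, item_set) turns.

    Stage 1: at each turn, pick a curated playlist and merge its items
    into the running item set, giving a sequence of item sets.
    Stage 2: turn the chosen playlist title into a user utterance.
    """
    titles = sorted(playlists)  # sorted for reproducibility
    current: Set[str] = set()
    turns: List[Tuple[str, List[str]]] = []
    for _ in range(num_turns):
        title = rng.choice(titles)
        current = current | playlists[title]
        turns.append((utterance_fn(title), sorted(current)))
    return turns


if __name__ == "__main__":
    for utterance, items in synthesize_conversation(PLAYLISTS, 3, random.Random(0)):
        print(utterance, "->", items)
```

In this sketch the item set only grows from turn to turn; a fuller pipeline would presumably also model removals and refinements, and would replace the template with a real language model.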
Related papers
- Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition [48.527630771422935]
We propose a synthetic data generation pipeline for multi-speaker conversational ASR.
We conduct evaluation by fine-tuning the Whisper ASR model for telephone and distant conversational speech settings.
arXiv Detail & Related papers (2024-08-17T14:47:05Z)
- Retrieval-Augmented Conversational Recommendation with Prompt-based Semi-Structured Natural Language State Tracking [16.37636420517529]
Large language models (LLMs) let us unlock the commonsense connections between user preference utterances and complex language in user-generated reviews.
RA-Rec is a dialogue state tracking system for ConvRec, showcased with a video, open source GitHub repository, and interactive Google Colab notebook.
arXiv Detail & Related papers (2024-05-25T15:41:26Z)
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLMs) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- Parameter-Efficient Conversational Recommender System as a Language Processing Task [52.47087212618396]
Conversational recommender systems (CRS) aim to recommend relevant items to users by eliciting user preference through natural language conversation.
Prior work often utilizes external knowledge graphs for items' semantic information, a language model for dialogue generation, and a recommendation module for ranking relevant items.
In this paper, we represent items in natural language and formulate CRS as a natural language processing task.
arXiv Detail & Related papers (2024-01-25T14:07:34Z)
- AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets [56.052803235932686]
We propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues.
In doing so, we exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets.
arXiv Detail & Related papers (2023-06-16T05:27:14Z)
- Conversational Recommendation as Retrieval: A Simple, Strong Baseline [4.737923227003888]
Conversational recommendation systems (CRS) aim to recommend suitable items to users through natural language conversation.
Most CRS approaches do not effectively utilize the signal provided by these conversations.
We propose an alternative information retrieval (IR)-styled approach to the CRS item recommendation task.
arXiv Detail & Related papers (2023-05-23T06:21:31Z)
- Beyond Single Items: Exploring User Preferences in Item Sets with the Conversational Playlist Curation Dataset [20.42354123651454]
We call this task conversational item set curation.
We present a novel data collection methodology that efficiently collects realistic preferences about item sets in a conversational setting.
We show that it leads raters to express preferences that would not be otherwise expressed.
arXiv Detail & Related papers (2023-03-13T00:39:04Z)
- PLACES: Prompting Language Models for Social Conversation Synthesis [103.94325597273316]
We use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting.
We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations.
arXiv Detail & Related papers (2023-02-07T05:48:16Z)
- COLA: Improving Conversational Recommender Systems by Collaborative Augmentation [9.99763097964222]
We propose a collaborative augmentation (COLA) method to improve both item representation learning and user preference modeling.
We construct an interactive user-item graph from all conversations, which augments item representations with user-aware information.
To improve user preference modeling, we retrieve similar conversations from the training corpus, where the involved items and attributes that reflect the user's potential interests are used to augment the user representation.
arXiv Detail & Related papers (2022-12-15T12:37:28Z)