Hierarchical Conversational Preference Elicitation with Bandit Feedback
- URL: http://arxiv.org/abs/2209.06129v1
- Date: Tue, 6 Sep 2022 05:35:24 GMT
- Title: Hierarchical Conversational Preference Elicitation with Bandit Feedback
- Authors: Jinhang Zuo, Songwen Hu, Tong Yu, Shuai Li, Handong Zhao, Carlee Joe-Wong
- Abstract summary: We formulate a new conversational bandit problem that allows the recommender system to choose either a key-term or an item to recommend at each round.
We conduct a survey and analyze a real-world dataset to find that, unlike assumptions made in prior works, key-term rewards are mainly affected by rewards of representative items.
We propose two bandit algorithms, Hier-UCB and Hier-LinUCB, that leverage this observed relationship and the hierarchical structure between key-terms and items.
- Score: 36.507341041113825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The recent advances of conversational recommendations provide a promising way
to efficiently elicit users' preferences via conversational interactions. To
achieve this, the recommender system conducts conversations with users, asking
their preferences for different items or item categories. Most existing
conversational recommender systems for cold-start users utilize a multi-armed
bandit framework to learn users' preferences in an online manner. However, they
rely on a pre-defined conversation frequency for asking about item categories
instead of individual items, which may incur excessive conversational
interactions that hurt user experience. To enable more flexible questioning
about key-terms, we formulate a new conversational bandit problem that allows
the recommender system to choose either a key-term or an item to recommend at
each round and explicitly models the rewards of these actions. This motivates
us to handle a new exploration-exploitation (EE) trade-off between key-term
asking and item recommendation, which requires us to accurately model the
relationship between key-term and item rewards. We conduct a survey and analyze
a real-world dataset to find that, unlike assumptions made in prior works,
key-term rewards are mainly affected by rewards of representative items. We
propose two bandit algorithms, Hier-UCB and Hier-LinUCB, that leverage this
observed relationship and the hierarchical structure between key-terms and
items to efficiently learn which items to recommend. We theoretically prove
that our algorithm can reduce the regret bound's dependency on the total number
of items relative to previous work. We validate our proposed algorithms and regret
bound on both synthetic and real-world data.
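To make the key-term/item hierarchy described in the abstract concrete, here is a minimal sketch of a two-level selection rule built on plain UCB1. It is not the paper's Hier-UCB or Hier-LinUCB: the class `TwoLevelUCB`, the function `simulate`, the key-term and item names, and the Bernoulli reward means are all hypothetical, and the sketch assumes the key-term's reward is simply the reward of the item recommended under it.

```python
import math
import random


def ucb_index(mean, pulls, t):
    """UCB1 index; unpulled arms get +inf so each one is tried once."""
    if pulls == 0:
        return math.inf
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


class TwoLevelUCB:
    """Generic UCB1 over a key-term -> items hierarchy (illustrative only)."""

    def __init__(self, items_by_key_term):
        self.items_by_key_term = items_by_key_term
        # Each entry stores [empirical mean, number of pulls].
        self.key_stats = {k: [0.0, 0] for k in items_by_key_term}
        self.item_stats = {
            i: [0.0, 0] for items in items_by_key_term.values() for i in items
        }
        self.t = 0

    def select(self):
        """Pick the key-term with the highest index, then an item under it."""
        self.t += 1
        key = max(
            self.key_stats,
            key=lambda k: ucb_index(*self.key_stats[k], self.t),
        )
        item = max(
            self.items_by_key_term[key],
            key=lambda i: ucb_index(*self.item_stats[i], self.t),
        )
        return key, item

    def update(self, key, item, reward):
        """Incrementally update the running means at both levels."""
        for stats in (self.key_stats[key], self.item_stats[item]):
            stats[1] += 1
            stats[0] += (reward - stats[0]) / stats[1]


def simulate(rounds=2000, seed=0):
    """Run the sketch on made-up Bernoulli rewards; return pulls per item."""
    rng = random.Random(seed)
    # Hypothetical ground-truth reward probabilities (assumption).
    true_means = {"jazz_a": 0.2, "jazz_b": 0.3, "rock_a": 0.8, "rock_b": 0.4}
    hierarchy = {"jazz": ["jazz_a", "jazz_b"], "rock": ["rock_a", "rock_b"]}
    agent = TwoLevelUCB(hierarchy)
    counts = {i: 0 for i in true_means}
    for _ in range(rounds):
        key, item = agent.select()
        reward = 1.0 if rng.random() < true_means[item] else 0.0
        agent.update(key, item, reward)
        counts[item] += 1
    return counts


if __name__ == "__main__":
    print(simulate())
```

Because the first level only keeps statistics per key-term, the number of arms compared in each round is the number of key-terms plus the number of items under the chosen key-term, rather than the total number of items. This is only meant to illustrate why a hierarchy can help; the paper's algorithms and regret analysis should be consulted for the actual method.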
Related papers
- Conversational Dueling Bandits in Generalized Linear Models [45.99797764214125]
We introduce relative feedback-based conversations into conversational recommendation systems.
We propose a novel conversational dueling bandit algorithm called ConDuel.
We also demonstrate the potential to extend our algorithm to multinomial logit bandits with theoretical and experimental guarantees.
arXiv Detail & Related papers (2024-07-26T03:43:10Z)
- Modeling Multiple User Interests using Hierarchical Knowledge for Conversational Recommender System [13.545276171601769]
A conversational recommender system (CRS) is a practical application for item recommendation through natural language conversation.
We propose to model such multiple user interests in CRS.
We investigated its effects in experiments using the ReDial dataset and found that the proposed method can recommend a wider variety of items than that of the baseline CR-Walker.
arXiv Detail & Related papers (2023-03-01T08:15:48Z)
- Talk the Walk: Synthetic Data Generation for Conversational Music Recommendation [62.019437228000776]
We present TalkWalk, which generates realistic high-quality conversational data by leveraging encoded expertise in widely available item collections.
We generate over one million diverse conversations in a human-collected dataset.
arXiv Detail & Related papers (2023-01-27T01:54:16Z)
- COLA: Improving Conversational Recommender Systems by Collaborative Augmentation [9.99763097964222]
We propose a collaborative augmentation (COLA) method to improve both item representation learning and user preference modeling.
We construct an interactive user-item graph from all conversations, which augments item representations with user-aware information.
To improve user preference modeling, we retrieve similar conversations from the training corpus, where the involved items and attributes that reflect the user's potential interests are used to augment the user representation.
arXiv Detail & Related papers (2022-12-15T12:37:28Z)
- Comparison-based Conversational Recommender System with Relative Bandit Feedback [15.680698037463488]
We propose a novel comparison-based conversational recommender system.
We propose a new bandit algorithm, which we call RelativeConUCB.
The experiments on both synthetic and real-world datasets validate the advantage of our proposed method.
arXiv Detail & Related papers (2022-08-21T08:05:46Z)
- Soliciting User Preferences in Conversational Recommender Systems via Usage-related Questions [21.184555512370093]
We propose a novel approach to preference elicitation by asking implicit questions based on item usage.
First, we identify the sentences from a large review corpus that contain information about item usage.
Then, we generate implicit preference elicitation questions from those sentences using a neural text-to-text model.
arXiv Detail & Related papers (2021-11-26T12:23:14Z)
- Learning to Ask Appropriate Questions in Conversational Recommendation [49.31942688227828]
We propose the Knowledge-Based Question Generation System (KBQG), a novel framework for conversational recommendation.
KBQG models a user's preference in a finer granularity by identifying the most relevant relations from a structured knowledge graph.
Finally, accurate recommendations can be generated in fewer conversational turns.
arXiv Detail & Related papers (2021-05-11T03:58:10Z)
- Seamlessly Unifying Attributes and Items: Conversational Recommendation for Cold-Start Users [111.28351584726092]
We consider conversational recommendation for cold-start users, where a system can interactively both ask a user about attributes and recommend items.
Our Conversational Thompson Sampling (ConTS) model holistically solves all questions in conversational recommendation by choosing the arm with the maximal reward to play.
arXiv Detail & Related papers (2020-05-23T08:56:37Z)
- A Bayesian Approach to Conversational Recommendation Systems [60.12942570608859]
We present a conversational recommendation system based on a Bayesian approach.
A case study based on the application of this approach to stagend.com, an online platform for booking entertainers, is discussed.
arXiv Detail & Related papers (2020-02-12T15:59:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.