Fast Multi-Step Critiquing for VAE-based Recommender Systems
- URL: http://arxiv.org/abs/2105.00774v1
- Date: Mon, 3 May 2021 12:26:09 GMT
- Title: Fast Multi-Step Critiquing for VAE-based Recommender Systems
- Authors: Diego Antognini and Boi Faltings
- Abstract summary: We present M&Ms-VAE, a novel variational autoencoder for recommendation and explanation.
We train the model under a weak supervision scheme to simulate both fully and partially observed variables.
We then leverage the generalization ability of a trained M&Ms-VAE model to embed the user preference and the critique separately.
- Score: 27.207067974031805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have shown that providing personalized explanations alongside
recommendations increases trust and perceived quality. Furthermore, it gives
users an opportunity to refine the recommendations by critiquing parts of the
explanations. On one hand, current recommender systems model the
recommendation, explanation, and critiquing objectives jointly, but this
creates an inherent trade-off between their respective performance. On the
other hand, although recent latent linear critiquing approaches are built upon
an existing recommender system, they suffer from computational inefficiency at
inference due to the objective optimized at each conversation's turn. We
address these deficiencies with M&Ms-VAE, a novel variational autoencoder for
recommendation and explanation that is based on multimodal modeling
assumptions. We train the model under a weak supervision scheme to simulate
both fully and partially observed variables. Then, we leverage the
generalization ability of a trained M&Ms-VAE model to embed the user preference
and the critique separately. Our work's most important innovation is our
critiquing module, which is built upon and trained in a self-supervised manner
with a simple ranking objective. Experiments on four real-world datasets
demonstrate that among state-of-the-art models, our system is the first to
dominate or match the performance in terms of recommendation, explanation, and
multi-step critiquing. Moreover, M&Ms-VAE processes the critiques up to 25.6x
faster than the best baselines. Finally, we show that our model infers coherent
joint and cross generation, even under weak supervision, thanks to our
multimodal-based modeling and training scheme.
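The abstract's key idea is to embed the user's interaction history and the critique as two separate modalities and fuse them in latent space. A minimal toy sketch of that inference path is below; all names, dimensions, and the mean-based fusion rule are illustrative assumptions, not the paper's actual architecture or training objective.

```python
import numpy as np

rng = np.random.default_rng(0)

class Encoder:
    """Toy linear Gaussian encoder: x -> (mu, logvar)."""
    def __init__(self, in_dim, z_dim):
        self.W_mu = rng.normal(scale=0.1, size=(in_dim, z_dim))
        self.W_lv = rng.normal(scale=0.1, size=(in_dim, z_dim))

    def __call__(self, x):
        return x @ self.W_mu, x @ self.W_lv

def fuse(mus, logvars):
    """Fuse per-modality posteriors by averaging (a stand-in for the
    paper's multimodal fusion; the real model learns this)."""
    return np.mean(mus, axis=0), np.mean(logvars, axis=0)

n_items, n_keyphrases, z_dim = 100, 20, 8
enc_pref = Encoder(n_items, z_dim)       # interaction-history modality
enc_crit = Encoder(n_keyphrases, z_dim)  # keyphrase/critique modality

user_hist = rng.integers(0, 2, size=n_items).astype(float)
critique = np.zeros(n_keyphrases)
critique[3] = -1.0  # user disagrees with explanation keyphrase 3

# Embed the preference and the critique separately, then fuse in latent space.
mu_p, lv_p = enc_pref(user_hist)
mu_c, lv_c = enc_crit(critique)
mu, lv = fuse([mu_p, mu_c], [lv_p, lv_c])

# Decode the fused latent into item scores (tied toy decoder).
scores = mu @ enc_pref.W_mu.T
print(scores.shape)  # (100,)
```

Because each critique only requires one extra encoder pass and a fusion, rather than re-solving an optimization problem per conversation turn, this style of inference is what gives the reported speedup over latent linear critiquing baselines.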
Related papers
- LLaVA-Critic: Learning to Evaluate Multimodal Models [110.06665155812162]
We introduce LLaVA-Critic, the first open-source large multimodal model (LMM) designed as a generalist evaluator.
LLaVA-Critic is trained using a high-quality critic instruction-following dataset that incorporates diverse evaluation criteria and scenarios.
arXiv Detail & Related papers (2024-10-03T17:36:33Z)
- Direct Judgement Preference Optimization [66.83088028268318]
We train large language models (LLMs) as generative judges to evaluate and critique other models' outputs.
We employ three approaches to collect the preference pairs for different use cases, each aimed at improving our generative judge from a different perspective.
Our model robustly counters inherent biases such as position and length bias, flexibly adapts to any evaluation protocol specified by practitioners, and provides helpful language feedback for improving downstream generator models.
arXiv Detail & Related papers (2024-09-23T02:08:20Z)
- Self-Taught Evaluators [77.92610887220594]
We present an approach that aims to improve evaluators without human annotations, using synthetic training data only.
Our Self-Taught Evaluator can improve a strong LLM from 75.4 to 88.3 on RewardBench.
arXiv Detail & Related papers (2024-08-05T17:57:02Z)
- Efficient Model-agnostic Alignment via Bayesian Persuasion [13.42367964190663]
We introduce a model-agnostic and lightweight Bayesian Persuasion Alignment framework.
In the persuasion process, the small model (Advisor) observes the information item (i.e., state) and persuades large models (Receiver) to elicit improved responses.
We show that GPT-2 can significantly improve the performance of various models, achieving an average enhancement of 16.1% in mathematical reasoning ability and 13.7% in code generation.
arXiv Detail & Related papers (2024-05-29T02:57:07Z)
- A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation [77.42486522565295]
We propose a novel recommendation approach called LSVCR to jointly conduct personalized video and comment recommendation.
Our approach consists of two key components, namely sequential recommendation (SR) model and supplemental large language model (LLM) recommender.
In particular, we achieve a significant overall gain of 4.13% in comment watch time.
arXiv Detail & Related papers (2024-03-20T13:14:29Z)
- Prototypical Self-Explainable Models Without Re-training [5.837536154627278]
Self-explainable models (SEMs) are trained directly to provide explanations alongside their predictions.
Current SEMs require complex architectures and heavily regularized loss functions, thus necessitating specific and costly training.
We propose a simple yet efficient universal method called KMEx, which can convert any existing pre-trained model into a prototypical SEM.
arXiv Detail & Related papers (2023-12-13T01:15:00Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Positive and Negative Critiquing for VAE-based Recommenders [39.38032088973816]
We propose M&Ms-VAE, which achieves state-of-the-art performance in terms of recommendation, explanation, and critiquing.
M&Ms-VAE and similar models allow users to negatively critique (i.e., explicitly disagree with) an explanation, but not to positively critique it.
We address this deficiency with M&Ms-VAE+, an extension of M&Ms-VAE that enables positive and negative critiquing.
arXiv Detail & Related papers (2022-04-05T12:40:53Z)
- Recommendation Fairness: From Static to Dynamic [12.080824433982993]
We discuss how fairness could be baked into reinforcement learning techniques for recommendation.
We argue that in order to make further progress in recommendation fairness, we may want to consider multi-agent (game-theoretic) optimization and multi-objective (Pareto) optimization.
arXiv Detail & Related papers (2021-09-05T21:38:05Z)
- Top-N Recommendation with Counterfactual User Preference Simulation [26.597102553608348]
Top-N recommendation, which aims to learn users' ranking-based preferences, has long been a fundamental problem in a wide range of applications.
In this paper, we propose to reformulate the recommendation task within the causal inference framework to handle the data-scarcity problem.
arXiv Detail & Related papers (2021-09-02T14:28:46Z)
- Self-Supervised Reinforcement Learning for Recommender Systems [77.38665506495553]
We propose self-supervised reinforcement learning for sequential recommendation tasks.
Our approach augments standard recommendation models with two output layers: one for self-supervised learning and the other for RL.
Based on this approach, we propose two frameworks, namely Self-Supervised Q-learning (SQN) and Self-Supervised Actor-Critic (SAC).
arXiv Detail & Related papers (2020-06-10T11:18:57Z)
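The two-output-layer idea in the SQN/SAC entry can be sketched roughly as follows. This is a toy numpy stand-in: the shared state encoder, both linear heads, and the example dimensions are illustrative assumptions; the actual frameworks use trained sequence models and combine a supervised loss with a double Q-learning loss.

```python
import numpy as np

rng = np.random.default_rng(1)

n_items, hid = 50, 16

# Shared sequence encoder (toy: mean of item embeddings).
item_emb = rng.normal(scale=0.1, size=(n_items, hid))
W_sl = rng.normal(scale=0.1, size=(hid, n_items))  # self-supervised (next-item) head
W_q = rng.normal(scale=0.1, size=(hid, n_items))   # RL (Q-value) head

def forward(item_seq):
    """Encode an interaction sequence once, then branch into two heads."""
    state = item_emb[item_seq].mean(axis=0)  # shared state s_t
    logits = state @ W_sl                    # supervised next-item prediction
    q_values = state @ W_q                   # Q-values for the RL objective
    return logits, q_values

logits, q = forward([3, 7, 12])
print(logits.shape, q.shape)  # (50,) (50,)
```

Training would sum a cross-entropy loss on `logits` with a temporal-difference loss on `q`, so the RL head regularizes the recommendation model rather than replacing it.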
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.