Evaluating the Robustness of Conversational Recommender Systems by
Adversarial Examples
- URL: http://arxiv.org/abs/2303.05575v1
- Date: Thu, 9 Mar 2023 20:51:18 GMT
- Title: Evaluating the Robustness of Conversational Recommender Systems by
Adversarial Examples
- Authors: Ali Montazeralghaem and James Allan
- Abstract summary: We propose an adversarial evaluation scheme including four scenarios in two categories.
We generate adversarial examples to evaluate the robustness of these systems in the face of different input data.
Our results show that none of these systems are robust and reliable to the adversarial examples.
- Score: 16.49836195831763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational recommender systems (CRSs) are improving rapidly, according to
the standard recommendation accuracy metrics. However, it is essential to make
sure that these systems are robust in interacting with users including regular
and malicious users who want to attack the system by feeding the system
modified input data. In this paper, we propose an adversarial evaluation scheme
including four scenarios in two categories and automatically generate
adversarial examples to evaluate the robustness of these systems in the face of
different input data. By executing these adversarial examples we can compare
the ability of different conversational recommender systems to satisfy the
user's preferences. We evaluate three CRSs by the proposed adversarial examples
on two datasets. Our results show that none of these systems are robust and
reliable to the adversarial examples.
Related papers
- Search-Based Interaction For Conversation Recommendation via Generative Reward Model Based Simulated User [117.82681846559909]
Conversational recommendation systems (CRSs) use multi-turn interaction to capture user preferences and provide personalized recommendations.
We propose a generative reward model based simulated user, named GRSU, for automatic interaction with CRSs.
arXiv Detail & Related papers (2025-04-29T06:37:30Z) - Towards Robust Offline Evaluation: A Causal and Information Theoretic Framework for Debiasing Ranking Systems [6.540293515339111]
offline evaluation of retrieval-ranking systems is crucial for developing high-performing models.
We propose a novel framework for robust offline evaluation of retrieval-ranking systems.
Our contributions include (1) a causal formulation for addressing offline evaluation biases, (2) a system-agnostic debiasing framework, and (3) empirical validation of its effectiveness.
arXiv Detail & Related papers (2025-04-04T23:52:57Z) - An Efficient Multi-threaded Collaborative Filtering Approach in Recommendation System [0.0]
This research focuses on building a scalable recommendation system capable of handling numerous users efficiently.
A multithreaded similarity approach is employed to achieve this, where users are divided into independent threads that run in parallel.
This parallelization significantly reduces computation time compared to traditional methods, resulting in a faster, more efficient, and scalable recommendation system.
arXiv Detail & Related papers (2024-09-28T06:33:18Z) - A Unified Causal Framework for Auditing Recommender Systems for Ethical Concerns [40.793466500324904]
We view recommender system auditing from a causal lens and provide a general recipe for defining auditing metrics.
Under this general causal auditing framework, we categorize existing auditing metrics and identify gaps in them.
We propose two classes of such metrics:future- and past-reacheability and stability, that measure the ability of a user to influence their own and other users' recommendations.
arXiv Detail & Related papers (2024-09-20T04:37:36Z) - Revisiting Reciprocal Recommender Systems: Metrics, Formulation, and Method [60.364834418531366]
We propose five new evaluation metrics that comprehensively and accurately assess the performance of RRS.
We formulate the RRS from a causal perspective, formulating recommendations as bilateral interventions.
We introduce a reranking strategy to maximize matching outcomes, as measured by the proposed metrics.
arXiv Detail & Related papers (2024-08-19T07:21:02Z) - System-2 Recommenders: Disentangling Utility and Engagement in Recommendation Systems via Temporal Point-Processes [80.97898201876592]
We propose a generative model in which past content interactions impact the arrival rates of users based on a self-exciting Hawkes process.
We show analytically that given samples it is possible to disentangle System-1 and System-2 and allow content optimization based on user utility.
arXiv Detail & Related papers (2024-05-29T18:19:37Z) - User-Controllable Recommendation via Counterfactual Retrospective and
Prospective Explanations [96.45414741693119]
We present a user-controllable recommender system that seamlessly integrates explainability and controllability.
By providing both retrospective and prospective explanations through counterfactual reasoning, users can customize their control over the system.
arXiv Detail & Related papers (2023-08-02T01:13:36Z) - Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation [17.41434948048325]
We conduct an interactive user study to unveil how vulnerable TOD systems are against realistic scenarios.
Our study reveals that conversations in open-goal settings lead to catastrophic failures of the system.
We discover a novel "pretending" behavior, in which the system pretends to handle the user requests even though they are beyond the system's capabilities.
arXiv Detail & Related papers (2023-05-23T09:24:53Z) - Re-Examining System-Level Correlations of Automatic Summarization
Evaluation Metrics [64.81682222169113]
How reliably an automatic summarization evaluation metric replicates human judgments of summary quality is quantified by system-level correlations.
We identify two ways in which the definition of the system-level correlation is inconsistent with how metrics are used to evaluate systems in practice.
arXiv Detail & Related papers (2022-04-21T15:52:14Z) - Membership Inference Attacks Against Recommender Systems [33.66394989281801]
We make the first attempt on quantifying the privacy leakage of recommender systems through the lens of membership inference.
Our attack is on the user-level but not on the data sample-level.
A shadow recommender is established to derive the labeled training data for training the attack model.
arXiv Detail & Related papers (2021-09-16T15:19:19Z) - Improving Conversational Question Answering Systems after Deployment
using Feedback-Weighted Learning [69.42679922160684]
We propose feedback-weighted learning based on importance sampling to improve upon an initial supervised system using binary user feedback.
Our work opens the prospect to exploit interactions with real users and improve conversational systems after deployment.
arXiv Detail & Related papers (2020-11-01T19:50:34Z) - PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative
Dialogue Systems [48.99561874529323]
There are three kinds of automatic methods to evaluate the open-domain generative dialogue systems.
Due to the lack of systematic comparison, it is not clear which kind of metrics are more effective.
We propose a novel and feasible learning-based metric that can significantly improve the correlation with human judgments.
arXiv Detail & Related papers (2020-04-06T04:36:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.