Related papers: Towards a Signal Detection Based Measure for Assessing Information Quality of Explainable Recommender Systems

URL: http://arxiv.org/abs/2507.01168v1
Date: Tue, 01 Jul 2025 20:11:17 GMT
Title: Towards a Signal Detection Based Measure for Assessing Information Quality of Explainable Recommender Systems
Authors: Yeonbin Son, Matthew L. Bolton
Abstract summary: We develop an objective metric to evaluate Veracity: the information quality of explanations. To assess the effectiveness of our proposed metric, we set up four cases with varying levels of information quality.
Score: 0.5371337604556311
License: http://creativecommons.org/licenses/by/4.0/
Abstract: There is growing interest in explainable recommender systems that provide recommendations along with explanations for the reasoning behind them. When evaluating recommender systems, most studies focus on overall recommendation performance. Only a few assess the quality of the explanations. Explanation quality is often evaluated through user studies that subjectively gather users' opinions on representative explanatory factors that shape end-users' perspectives on the results, rather than on the content of the explanations themselves. We aim to fill this gap by developing an objective metric to evaluate Veracity: the information quality of explanations. Specifically, we decompose Veracity into two dimensions: Fidelity and Attunement. Fidelity refers to whether the explanation includes accurate information about the recommended item. Attunement evaluates whether the explanation reflects the target user's preferences. By applying signal detection theory, we first determine decision outcomes for each dimension and then combine them to calculate a sensitivity, which serves as the final Veracity value. To assess the effectiveness of the proposed metric, we set up four cases with varying levels of information quality to validate whether our metric can accurately capture differences in quality. The results provided meaningful insights into the effectiveness of our proposed metric.
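As a rough illustration of the signal detection step described in the abstract, the sketch below classifies each explanation statement into a decision outcome (hit, miss, false alarm, correct rejection) per dimension and computes the standard sensitivity index d' = z(hit rate) - z(false-alarm rate). The abstract does not say how the two dimensions are combined, so pooling the outcome counts, the log-linear correction, the trial encoding, and all function names here are assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from statistics import NormalDist


def outcome(signal_present: bool, asserted: bool) -> str:
    """Classify one trial into a signal detection outcome."""
    if signal_present:
        return "hit" if asserted else "miss"
    return "false_alarm" if asserted else "correct_rejection"


def d_prime(counts: Counter) -> float:
    """Standard SDT sensitivity: z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each count) keeps both rates
    strictly between 0 and 1 so the inverse normal CDF stays finite.
    """
    hit_rate = (counts["hit"] + 0.5) / (counts["hit"] + counts["miss"] + 1.0)
    fa_rate = (counts["false_alarm"] + 0.5) / (
        counts["false_alarm"] + counts["correct_rejection"] + 1.0
    )
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def veracity(fidelity_trials, attunement_trials) -> float:
    """Hypothetical combination: pool outcomes from both dimensions.

    Each trial is (signal_present, asserted):
      - Fidelity:   is the statement true of the item / does the explanation state it?
      - Attunement: does the feature match the user's preferences / does the explanation state it?
    Pooling is an assumption; the abstract only says the outcomes are combined.
    """
    counts = Counter(outcome(s, a) for s, a in fidelity_trials)
    counts.update(outcome(s, a) for s, a in attunement_trials)
    return d_prime(counts)


# Toy data (invented for illustration): a mostly accurate explanation
# versus one that asserts features regardless of ground truth.
good = [(True, True)] * 8 + [(True, False)] * 2 + [(False, True)] * 1 + [(False, False)] * 9
bad = [(True, True)] * 5 + [(True, False)] * 5 + [(False, True)] * 5 + [(False, False)] * 5

print("good:", veracity(good, good))
print("bad: ", veracity(bad, bad))
```

With equal hit and false-alarm rates the sensitivity is zero, meaning the explanation's statements carry no information about what is actually true of the item or the user; higher values indicate better discrimination.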

Related papers

Whom do Explanations Serve? A Systematic Literature Survey of User Characteristics in Explainable Recommender Systems Evaluation [7.021274080378664]
We surveyed 124 papers in which recommender systems explanations were evaluated in user studies. Our findings suggest that the results from the surveyed studies predominantly cover specific users. We recommend actions to move toward a more inclusive and reproducible evaluation.
arXiv Detail & Related papers (2024-12-12T13:01:30Z)
Learning Recommender Systems with Soft Target: A Decoupled Perspective [49.83787742587449]
We propose a novel decoupled soft label optimization framework to consider the objectives as two aspects by leveraging soft labels. We present a sensible soft-label generation algorithm that models a label propagation algorithm to explore users' latent interests in unobserved feedback via neighbors.
arXiv Detail & Related papers (2024-10-09T04:20:15Z)
Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems. We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z)
Creating Healthy Friction: Determining Stakeholder Requirements of Job Recommendation Explanations [2.373992571236766]
We evaluate an explainable job recommender system using a realistic, task-based, mixed-design user study. We find that providing stakeholders with real explanations does not significantly improve decision-making speed and accuracy.
arXiv Detail & Related papers (2024-09-24T11:03:17Z)
Explaining Length Bias in LLM-Based Preference Evaluations [51.07275977870145]
We decompose the preference evaluation metric, specifically the win rate, into two key components: desirability and information mass. We show that response length impacts evaluations by influencing information mass. We propose AdapAlpaca, a simple yet effective adjustment to win rate measurement.
arXiv Detail & Related papers (2024-07-01T08:37:41Z)
Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback. The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied. We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
Can Offline Metrics Measure Explanation Goals? A Comparative Survey Analysis of Offline Explanation Metrics in Recommender Systems [5.634769877793363]
Explanations in a Recommender System (RS) provide reasons for recommendations to users and can enhance transparency, persuasiveness, engagement, and trust, known as explanation goals. We investigated whether, in explanations connecting interacted and recommended items based on shared content, the selection of item attributes and interacted items affects explanation goals. Metrics measuring the diversity and popularity of attributes and the recency of item interactions were used to evaluate explanations from three state-of-the-art algorithms across six recommendation systems.
arXiv Detail & Related papers (2023-10-22T18:22:35Z)
Explainable Recommender with Geometric Information Bottleneck [25.703872435370585]
We propose to incorporate a geometric prior learnt from user-item interactions into a variational network. Latent factors from an individual user-item pair can be used for both recommendation and explanation generation. Experimental results on three e-commerce datasets show that our model significantly improves the interpretability of a variational recommender.
arXiv Detail & Related papers (2023-05-09T10:38:36Z)
Measuring "Why" in Recommender Systems: a Comprehensive Survey on the Evaluation of Explainable Recommendation [87.82664566721917]
This survey is based on more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, Recsys, UMAP, and IUI.
arXiv Detail & Related papers (2022-02-14T02:58:55Z)
From Intrinsic to Counterfactual: On the Explainability of Contextualized Recommender Systems [43.93801836660617]
We show that by utilizing the contextual features (e.g., item reviews from users), we can design a series of explainable recommender systems. We propose three types of explainable recommendation strategies with gradual change of model transparency: whitebox, graybox, and blackbox. Our model achieves highly competitive ranking performance, and generates accurate and effective explanations in terms of numerous quantitative metrics and qualitative visualizations.
arXiv Detail & Related papers (2021-10-28T01:54:04Z)
Fairness-Aware Explainable Recommendation over Knowledge Graphs [73.81994676695346]
We analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance between different groups. We show that inactive users may be more susceptible to receiving unsatisfactory recommendations, due to insufficient training data for the inactive users. We propose a fairness constrained approach via re-ranking to mitigate this problem in the context of explainable recommendation over knowledge graphs.
arXiv Detail & Related papers (2020-06-03T05:04:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.