A Comprehensive Survey of Evaluation Techniques for Recommendation
Systems
- URL: http://arxiv.org/abs/2312.16015v2
- Date: Fri, 12 Jan 2024 09:19:51 GMT
- Title: A Comprehensive Survey of Evaluation Techniques for Recommendation
Systems
- Authors: Aryan Jadon and Avinash Patil
- Abstract summary: This paper introduces a comprehensive suite of metrics, each tailored to capture a distinct aspect of system performance.
We identify the strengths and limitations of current evaluation practices and highlight the nuanced trade-offs that emerge when optimizing recommendation systems across different metrics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The effectiveness of recommendation systems is pivotal to user engagement and
satisfaction in online platforms. As these recommendation systems increasingly
influence user choices, their evaluation transcends mere technical performance
and becomes central to business success. This paper addresses the multifaceted
nature of recommendation system evaluation by introducing a comprehensive
suite of metrics, each tailored to capture a distinct aspect of system
performance. We discuss:
* Similarity Metrics: to quantify the precision of content-based filtering
mechanisms and assess the accuracy of collaborative filtering techniques.
* Candidate Generation Metrics: to evaluate how effectively the system
identifies a broad yet relevant range of items.
* Predictive Metrics: to assess the accuracy of forecasted user preferences.
* Ranking Metrics: to evaluate the effectiveness of the order in which
recommendations are presented.
* Business Metrics: to align the performance of the recommendation system
with economic objectives.
Our approach emphasizes the contextual application of these metrics and their
interdependencies. In this paper, we identify the strengths and limitations of
current evaluation practices and highlight the nuanced trade-offs that emerge
when optimizing recommendation systems across different metrics. The paper
concludes by proposing a framework for selecting and interpreting these metrics
to not only improve system performance but also to advance business goals. This
work aims to aid researchers and practitioners in critically assessing
recommendation systems and to foster the development of more nuanced, effective,
and economically viable personalization strategies. Our code is available at
GitHub -
https://github.com/aryan-jadon/Evaluation-Metrics-for-Recommendation-Systems.
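As a rough illustration of the five metric families named in the abstract, the sketch below gives one common representative for each: cosine similarity (similarity), recall@k (candidate generation), RMSE (predictive), NDCG@k (ranking), and click-through rate (business). The function names and signatures are hypothetical and are not taken from the paper or its repository. The NDCG variant shown assumes linear gain with a log2 discount, which is only one of several conventions.

```python
import math


def cosine_similarity(a, b):
    """Similarity metric: cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def recall_at_k(recommended, relevant, k):
    """Candidate generation metric: share of relevant items found in the top k."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def rmse(predicted, actual):
    """Predictive metric: root mean squared error between predicted and true ratings."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def ndcg_at_k(recommended, relevance, k):
    """Ranking metric: NDCG@k with linear gain (an assumed convention).

    `relevance` maps item -> graded relevance; unlisted items count as 0.
    """
    dcg = sum(
        relevance.get(item, 0.0) / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
    )
    ideal_gains = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal_gains))
    return dcg / idcg if idcg > 0 else 0.0


def click_through_rate(clicks, impressions):
    """Business metric: clicks divided by impressions."""
    return clicks / impressions if impressions else 0.0
```

These metrics can disagree with one another: a ranking that scores well on recall@k may still place the most relevant items low in the list and so score poorly on NDCG@k, which is the kind of trade-off the abstract refers to.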
Related papers
- Navigating the Evaluation Funnel to Optimize Iteration Speed for Recommender Systems [0.0]
We present a novel framework that simplifies the reasoning around the evaluation funnel for a recommendation system.
We show that decomposing the definition of success into smaller necessary criteria for success enables early identification of non-successful ideas.
We go through so-called offline and online evaluation methods such as counterfactual logging, validation, verification, A/B testing, and interleaving.
arXiv Detail & Related papers (2024-04-03T17:15:45Z)
- RecRec: Algorithmic Recourse for Recommender Systems [41.97186998947909]
It is crucial for all stakeholders to understand the model's rationale behind making certain predictions and recommendations.
This is especially true for the content providers whose livelihoods depend on the recommender system.
We propose a recourse framework for recommender systems, targeted towards the content providers.
arXiv Detail & Related papers (2023-08-28T22:26:50Z)
- Impression-Aware Recommender Systems [57.38537491535016]
Novel data sources bring new opportunities to improve the quality of recommender systems.
Researchers may use impressions to refine user preferences and overcome the current limitations in recommender systems research.
We present a systematic literature review on recommender systems using impressions.
arXiv Detail & Related papers (2023-08-15T16:16:02Z)
- Bridging Offline-Online Evaluation with a Time-dependent and Popularity Bias-free Offline Metric for Recommenders [3.130722489512822]
We show that penalizing popular items and considering the time of transactions significantly improves our ability to choose the best recommendation model for a live recommender system.
Our results aim to help the academic community better understand offline evaluation and optimization criteria that are more relevant to real applications of recommender systems.
arXiv Detail & Related papers (2023-08-14T01:37:02Z)
- User-Controllable Recommendation via Counterfactual Retrospective and Prospective Explanations [96.45414741693119]
We present a user-controllable recommender system that seamlessly integrates explainability and controllability.
By providing both retrospective and prospective explanations through counterfactual reasoning, users can customize their control over the system.
arXiv Detail & Related papers (2023-08-02T01:13:36Z)
- A Survey on Fairness-aware Recommender Systems [59.23208133653637]
We present concepts of fairness in different recommendation scenarios, comprehensively categorize current advances, and introduce typical methods to promote fairness in different stages of recommender systems.
Next, we delve into the significant influence that fairness-aware recommender systems exert on real-world industrial applications.
arXiv Detail & Related papers (2023-06-01T07:08:22Z)
- A Review on Pushing the Limits of Baseline Recommendation Systems with the integration of Opinion Mining & Information Retrieval Techniques [0.0]
Recommendation Systems allow users to identify trending items among a community while being timely and relevant to the user's expectations.
Deep Learning methods have been brought forward to achieve better quality recommendations.
Researchers have tried to expand on the capabilities of standard recommendation systems to provide the most effective recommendations.
arXiv Detail & Related papers (2022-05-03T22:13:33Z)
- Measuring "Why" in Recommender Systems: a Comprehensive Survey on the Evaluation of Explainable Recommendation [87.82664566721917]
This survey is based on more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, RecSys, UMAP, and IUI.
arXiv Detail & Related papers (2022-02-14T02:58:55Z)
- FEBR: Expert-Based Recommendation Framework for beneficial and personalized content [77.86290991564829]
We propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the recommended content.
The framework exploits the demonstrated trajectories of an expert (assumed to be reliable) in a recommendation evaluation environment, to recover an unknown utility function.
We evaluate the performance of our solution through a user interest simulation environment (using RecSim).
arXiv Detail & Related papers (2021-07-17T18:21:31Z)
- MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces [51.123916699062384]
MARS-Gym is an open-source framework to build and evaluate Reinforcement Learning agents for recommendations in marketplaces.
We provide the implementation of a diverse set of baseline agents, with a metrics-driven analysis of them in the Trivago marketplace dataset.
We expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.
arXiv Detail & Related papers (2020-09-30T16:39:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.