A Comprehensive Survey of Evaluation Techniques for Recommendation
Systems
- URL: http://arxiv.org/abs/2312.16015v2
- Date: Fri, 12 Jan 2024 09:19:51 GMT
- Title: A Comprehensive Survey of Evaluation Techniques for Recommendation
Systems
- Authors: Aryan Jadon and Avinash Patil
- Abstract summary: This paper introduces a comprehensive suite of metrics, each tailored to capture a distinct aspect of system performance.
We identify the strengths and limitations of current evaluation practices and highlight the nuanced trade-offs that emerge when optimizing recommendation systems across different metrics.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The effectiveness of recommendation systems is pivotal to user engagement and
satisfaction in online platforms. As these recommendation systems increasingly
influence user choices, their evaluation transcends mere technical performance
and becomes central to business success. This paper addresses the multifaceted
nature of recommendation system evaluation by introducing a comprehensive
suite of metrics, each tailored to capture a distinct aspect of system
performance. We discuss
* Similarity Metrics: to quantify the precision of content-based filtering
mechanisms and assess the accuracy of collaborative filtering techniques.
* Candidate Generation Metrics: to evaluate how effectively the system
identifies a broad yet relevant range of items.
* Predictive Metrics: to assess the accuracy of forecasted user preferences.
* Ranking Metrics: to evaluate the effectiveness of the order in which
recommendations are presented.
* Business Metrics: to align the performance of the recommendation system
with economic objectives.
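The first four metric families can be illustrated with a minimal sketch. The functions below are hypothetical examples, not code from the paper's repository: cosine similarity (a similarity metric), RMSE (a predictive metric), and precision@k (a candidate-generation/ranking metric).

```python
import math

def cosine_similarity(a, b):
    # Similarity metric: cosine of the angle between two item/user vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rmse(predicted, actual):
    # Predictive metric: root-mean-square error of forecasted ratings.
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def precision_at_k(recommended, relevant, k):
    # Candidate-generation / ranking metric: fraction of the top-k that is relevant.
    return sum(1 for item in recommended[:k] if item in relevant) / k

print(cosine_similarity([1, 0, 1], [1, 1, 1]))        # ~0.816
print(rmse([3.5, 4.0], [4.0, 5.0]))                   # ~0.791
print(precision_at_k(["a", "b", "c"], {"a", "c"}, 3)) # ~0.667
```

Business metrics (e.g., revenue lift) are omitted here because they depend on platform-specific logging rather than model outputs alone.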
Our approach emphasizes the contextual application of these metrics and their
interdependencies. In this paper, we identify the strengths and limitations of
current evaluation practices and highlight the nuanced trade-offs that emerge
when optimizing recommendation systems across different metrics. The paper
concludes by proposing a framework for selecting and interpreting these metrics
to not only improve system performance but also to advance business goals. This
work aims to aid researchers and practitioners in critically assessing
recommendation systems and to foster the development of more nuanced, effective,
and economically viable personalization strategies. Our code is available at
GitHub -
https://github.com/aryan-jadon/Evaluation-Metrics-for-Recommendation-Systems.
Related papers
- Online and Offline Evaluations of Collaborative Filtering and Content Based Recommender Systems [0.0]
This study provides a comparative analysis of a large-scale recommender system operating in Iran.
The system employs user-based and item-based recommendations using content-based, collaborative filtering, trend-based methods, and hybrid approaches.
Our evaluation methods include manual evaluation, offline tests covering accuracy and ranking metrics such as hit-rate@k and nDCG, and online tests measuring click-through rate (CTR).
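The two offline ranking metrics named above, hit-rate@k and nDCG, can be sketched as follows. This is a generic illustration under binary relevance, not an implementation taken from the cited study:

```python
import math

def hit_rate_at_k(ranked, relevant, k):
    # 1 if any relevant item appears in the top-k, else 0 (averaged over users in practice).
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0

def ndcg_at_k(ranked, relevance, k):
    # nDCG: discounted cumulative gain of the ranking, normalized by the ideal ranking.
    dcg = sum(relevance.get(item, 0.0) / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]))
    ideal = sorted(relevance.values(), reverse=True)[:k]
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal))
    return dcg / idcg if idcg else 0.0

ranked = ["a", "b", "c", "d"]
relevance = {"b": 1.0, "d": 1.0}
print(hit_rate_at_k(ranked, relevance, 2))  # 1.0 ("b" is in the top 2)
print(ndcg_at_k(ranked, relevance, 4))      # < 1.0: relevant items ranked low
```

CTR, being an online metric, requires live traffic and A/B infrastructure, which is why the study pairs these offline tests with online ones.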
arXiv Detail & Related papers (2024-11-02T20:05:31Z) - Pessimistic Evaluation [58.736490198613154]
We argue that evaluating information access systems assumes utilitarian values not aligned with traditions of information access based on equal access.
We advocate for pessimistic evaluation of information access systems focusing on worst case utility.
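The shift from average to worst-case utility can be made concrete with a hypothetical example (the utility numbers below are invented for illustration, not drawn from the paper):

```python
# Hypothetical per-user utilities (e.g., per-user nDCG) for two systems.
system_a = [0.9, 0.9, 0.9, 0.1]  # strong on average, but fails one user badly
system_b = [0.6, 0.6, 0.6, 0.6]  # uniform quality for everyone

def mean_utility(utilities):
    # Standard utilitarian evaluation: average over users.
    return sum(utilities) / len(utilities)

def worst_case_utility(utilities):
    # Pessimistic evaluation: utility of the worst-off user.
    return min(utilities)

# Mean utility prefers system A; worst-case utility prefers system B.
print(mean_utility(system_a), worst_case_utility(system_a))
print(mean_utility(system_b), worst_case_utility(system_b))
```

The two criteria can rank systems in opposite orders, which is the core of the paper's argument for equal-access-oriented evaluation.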
arXiv Detail & Related papers (2024-10-17T15:40:09Z) - Quantifying User Coherence: A Unified Framework for Cross-Domain Recommendation Analysis [69.37718774071793]
This paper introduces novel information-theoretic measures for understanding recommender systems.
We evaluate 7 recommendation algorithms across 9 datasets, revealing the relationships between our measures and standard performance metrics.
arXiv Detail & Related papers (2024-10-03T13:02:07Z) - A Unified Causal Framework for Auditing Recommender Systems for Ethical Concerns [40.793466500324904]
We view recommender system auditing from a causal lens and provide a general recipe for defining auditing metrics.
Under this general causal auditing framework, we categorize existing auditing metrics and identify gaps in them.
We propose two classes of such metrics, future- and past-reachability and stability, which measure the ability of a user to influence their own and other users' recommendations.
arXiv Detail & Related papers (2024-09-20T04:37:36Z) - Revisiting Reciprocal Recommender Systems: Metrics, Formulation, and Method [60.364834418531366]
We propose five new evaluation metrics that comprehensively and accurately assess the performance of RRS.
We formulate the RRS from a causal perspective, modeling recommendations as bilateral interventions.
We introduce a reranking strategy to maximize matching outcomes, as measured by the proposed metrics.
arXiv Detail & Related papers (2024-08-19T07:21:02Z) - Review-based Recommender Systems: A Survey of Approaches, Challenges and Future Perspectives [11.835903510784735]
Review-based recommender systems have emerged as a significant sub-field in this domain.
We present a categorization of these systems and summarize the state-of-the-art methods, analyzing their unique features, effectiveness, and limitations.
We propose potential directions for future research, including the integration of multimodal data, multi-criteria rating information, and ethical considerations.
arXiv Detail & Related papers (2024-05-09T05:45:18Z) - Bridging Offline-Online Evaluation with a Time-dependent and Popularity
Bias-free Offline Metric for Recommenders [3.130722489512822]
We show that penalizing popular items and considering the time of transactions significantly improves our ability to choose the best recommendation model for a live recommender system.
Our results aim to help the academic community to understand better offline evaluation and optimization criteria that are more relevant for real applications of recommender systems.
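One simple way to penalize popular items in an offline metric, in the spirit of the approach described above, is to down-weight hits on frequently interacted items. This is a hypothetical sketch, not the metric defined in the paper:

```python
import math

def popularity_weighted_hit_rate(ranked, relevant, popularity, k):
    # Weight each top-k hit by the inverse log-popularity of the item,
    # so recommending only blockbusters scores lower than finding niche hits.
    score = 0.0
    for item in ranked[:k]:
        if item in relevant:
            score += 1.0 / math.log2(2 + popularity[item])
    return score / k

# Hypothetical interaction counts for two items.
popularity = {"blockbuster": 1000, "niche": 3}
print(popularity_weighted_hit_rate(["blockbuster"], {"blockbuster"}, popularity, 1))
print(popularity_weighted_hit_rate(["niche"], {"niche"}, popularity, 1))  # higher reward
```

A time-dependent variant would additionally restrict `relevant` to transactions after the training cut-off, mirroring how a live system only gets credit for future interactions.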
arXiv Detail & Related papers (2023-08-14T01:37:02Z) - A Survey on Fairness-aware Recommender Systems [59.23208133653637]
We present concepts of fairness in different recommendation scenarios, comprehensively categorize current advances, and introduce typical methods to promote fairness in different stages of recommender systems.
Next, we delve into the significant influence that fairness-aware recommender systems exert on real-world industrial applications.
arXiv Detail & Related papers (2023-06-01T07:08:22Z) - Measuring "Why" in Recommender Systems: a Comprehensive Survey on the
Evaluation of Explainable Recommendation [87.82664566721917]
This survey is based on more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, Recsys, UMAP, and IUI.
arXiv Detail & Related papers (2022-02-14T02:58:55Z) - FEBR: Expert-Based Recommendation Framework for beneficial and
personalized content [77.86290991564829]
We propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the recommended content.
The framework exploits the demonstrated trajectories of an expert (assumed to be reliable) in a recommendation evaluation environment, to recover an unknown utility function.
We evaluate the performance of our solution through a user interest simulation environment (using RecSim).
arXiv Detail & Related papers (2021-07-17T18:21:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.