Review Helpfulness Scores vs. Review Unhelpfulness Scores: Two Sides of the Same Coin or Different Coins?
- URL: http://arxiv.org/abs/2407.05207v1
- Date: Wed, 24 Apr 2024 10:35:17 GMT
- Title: Review Helpfulness Scores vs. Review Unhelpfulness Scores: Two Sides of the Same Coin or Different Coins?
- Authors: Yinan Yu, Dominik Gutt, Warut Khern-am-nuai
- Abstract summary: We find that review unhelpfulness scores are not driven by intrinsic review characteristics.
Users who receive review unhelpfulness votes are more likely to cast unhelpfulness votes for other reviews.
- Score: 1.0738561302102214
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Evaluating the helpfulness of online reviews supports consumers who must sift through large volumes of online reviews. Online review platforms have increasingly adopted review evaluation systems, which let users evaluate whether reviews are helpful or not; in turn, these evaluations assist review readers and encourage review contributors. Although review helpfulness scores have been studied extensively in the literature, our knowledge regarding their counterpart, review unhelpfulness scores, is lacking. Addressing this gap in the literature is important because researchers and practitioners have assumed that unhelpfulness scores are driven by intrinsic review characteristics and that such scores are associated with low-quality reviews. This study tests this conventional wisdom by examining factors that influence unhelpfulness scores. We find that, unlike review helpfulness scores, unhelpfulness scores are generally not driven by intrinsic review characteristics, as almost none of them are statistically significant predictors of an unhelpfulness score. We also find that users who receive review unhelpfulness votes are more likely to cast unhelpfulness votes for other reviews. Finally, unhelpfulness voters engage much less with the platform than helpfulness voters do. In summary, our findings suggest that review unhelpfulness scores are not driven by intrinsic review characteristics. Therefore, helpfulness and unhelpfulness scores should not be considered two sides of the same coin.
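As a rough illustration of the kind of analysis described above (not the authors' actual specification), the sketch below regresses unhelpfulness vote counts on a few intrinsic review characteristics. The feature names (review_length, readability, rating_extremity, reviewer_tenure), the simple OLS model, and the synthetic data are all assumptions for demonstration.

```python
# Illustrative only: a toy regression of unhelpfulness votes on intrinsic
# review characteristics. Feature names are hypothetical and data is synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5_000
reviews = pd.DataFrame({
    "review_length":    rng.integers(20, 800, n),            # word count
    "readability":      rng.normal(60, 10, n),               # e.g. a Flesch-style score
    "rating_extremity": np.abs(rng.integers(1, 6, n) - 3),   # distance from a 3-star rating
    "reviewer_tenure":  rng.exponential(2.0, n),             # years on the platform
})
# Synthetic outcome: unhelpfulness votes generated independently of the
# review characteristics, mirroring the paper's "no intrinsic drivers" finding.
reviews["unhelpful_votes"] = rng.poisson(1.5, n)

X = sm.add_constant(reviews.drop(columns="unhelpful_votes"))
model = sm.OLS(reviews["unhelpful_votes"], X).fit()
print(model.summary())  # coefficients on the review features should be insignificant
```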
Related papers
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversation has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence [58.6354685593418]
This paper proposes several article-level, field-normalized, and large language model-empowered bibliometric indicators to evaluate reviews.
The newly emerging AI-generated literature reviews are also appraised.
This work offers insights into the current challenges of literature reviews and envisions future directions for their development.
arXiv Detail & Related papers (2024-02-20T11:28:50Z)
- Exploiting Correlated Auxiliary Feedback in Parameterized Bandits [56.84649080789685]
We study a novel variant of the parameterized bandits problem in which the learner can observe additional auxiliary feedback that is correlated with the observed reward.
The auxiliary feedback is readily available in many real-life applications; for example, an online platform that wants to recommend the best-rated services to its users can observe the user's rating of a service (the reward) and collect additional information such as service delivery time (the auxiliary feedback).
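As a rough illustration of how correlated side information of this kind can be exploited, the sketch below applies a standard control-variate correction to a mean-rating estimate using delivery-time observations. This is a generic variance-reduction device with invented numbers, not necessarily the estimator proposed in that paper.

```python
# Illustrative only: a control-variate correction that uses correlated
# auxiliary feedback (delivery time) to sharpen a mean-rating estimate.
import numpy as np

rng = np.random.default_rng(1)
n = 200
mean_delivery = 3.0                                           # assumed known mean of the auxiliary signal
delivery_time = rng.normal(mean_delivery, 1.0, n)             # auxiliary feedback
rating = 4.5 - 0.5 * delivery_time + rng.normal(0, 0.3, n)    # observed rewards, correlated with it

naive = rating.mean()
cov = np.cov(rating, delivery_time)                           # 2x2 sample covariance matrix
beta = cov[0, 1] / cov[1, 1]                                  # optimal control-variate coefficient
adjusted = naive - beta * (delivery_time.mean() - mean_delivery)

print(f"naive estimate   : {naive:.3f}")
print(f"adjusted estimate: {adjusted:.3f}  # lower variance across repeated runs")
```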
arXiv Detail & Related papers (2023-11-05T17:27:06Z)
- On the Role of Reviewer Expertise in Temporal Review Helpfulness Prediction [5.381004207943597]
Existing methods for identifying helpful reviews primarily focus on review text and ignore two key factors: (1) who posts the reviews and (2) when the reviews are posted.
We introduce a dataset and develop a model that integrates the reviewer's expertise, derived from the past review history, and the temporal dynamics of the reviews to automatically assess review helpfulness.
arXiv Detail & Related papers (2023-02-22T23:41:22Z)
- Integrating Rankings into Quantized Scores in Peer Review [61.27794774537103]
In peer review, reviewers are usually asked to provide scores for the papers, but such scores can be noisy and hard to calibrate across reviewers.
To mitigate this issue, conferences have started to ask reviewers to additionally provide a ranking of the papers they have reviewed.
There is no standard procedure for using this ranking information, and Area Chairs may use it in different ways.
We take a principled approach to integrate the ranking information into the scores.
arXiv Detail & Related papers (2022-04-05T19:39:13Z)
- Measuring "Why" in Recommender Systems: a Comprehensive Survey on the Evaluation of Explainable Recommendation [87.82664566721917]
This survey is based on more than 100 papers from top-tier conferences like IJCAI, AAAI, TheWebConf, Recsys, UMAP, and IUI.
arXiv Detail & Related papers (2022-02-14T02:58:55Z)
- User and Item-aware Estimation of Review Helpfulness [4.640835690336653]
We investigate the role of deviations in the properties of reviews as helpfulness determinants.
We propose a novel helpfulness estimation model that extends previous ones.
Our model is thus an effective tool to select relevant user feedback for decision-making.
arXiv Detail & Related papers (2020-11-20T15:35:56Z)
- ReviewRobot: Explainable Paper Review Generation based on Knowledge Synthesis [62.76038841302741]
We build a novel ReviewRobot to automatically assign a review score and write comments for multiple categories such as novelty and meaningful comparison.
Experimental results show that our review score predictor reaches 71.4%-100% accuracy.
Human assessment by domain experts shows that 41.7%-70.5% of the comments generated by ReviewRobot are valid and constructive, and that they are better than human-written ones 20% of the time.
arXiv Detail & Related papers (2020-10-13T02:17:58Z)
- Understanding Peer Review of Software Engineering Papers [5.744593856232663]
We aim to understand how reviewers, including those who have won awards for reviewing, perform their reviews of software engineering papers.
The most important features of papers that result in positive reviews are clear and supported validation, an interesting problem, and novelty.
Authors should make the contribution of the work very clear in their paper.
arXiv Detail & Related papers (2020-09-02T17:31:45Z)
- How Useful are Reviews for Recommendation? A Critical Review and Potential Improvements [8.471274313213092]
We investigate a growing body of work that seeks to improve recommender systems through the use of review text.
Our initial findings reveal several discrepancies in reported results, partly due to copying results across papers despite changes in experimental settings or data pre-processing.
Further investigation raises a much larger question about the "importance" of user reviews for recommendation.
arXiv Detail & Related papers (2020-05-25T16:30:05Z)
- Context-aware Helpfulness Prediction for Online Product Reviews [34.47368084659301]
We propose a deep learning model that predicts the helpfulness score of a review.
The model is based on a convolutional neural network (CNN) and a context-aware encoding mechanism.
We validated the model on a human-annotated dataset, and the results show that it significantly outperforms existing models for helpfulness prediction.
arXiv Detail & Related papers (2020-04-27T18:19:26Z)
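As a rough illustration of the kind of model described in the last entry above (not the authors' implementation), here is a minimal text-CNN helpfulness regressor in PyTorch; the vocabulary size, layer widths, and flat context vector are placeholder assumptions.

```python
# Illustrative only: a minimal text-CNN helpfulness regressor in PyTorch.
# Vocabulary size, layer widths, and the flat context vector are placeholders.
import torch
import torch.nn as nn

class HelpfulnessCNN(nn.Module):
    def __init__(self, vocab_size=10_000, emb_dim=100, n_filters=64, kernel_size=3, ctx_dim=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size, padding=1)
        self.head = nn.Linear(n_filters + ctx_dim, 1)        # fuse text and context features

    def forward(self, token_ids, context):
        x = self.embed(token_ids).transpose(1, 2)             # (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x)).max(dim=2).values        # max-pool over the sequence
        x = torch.cat([x, context], dim=1)                    # simple context-aware fusion
        return self.head(x).squeeze(1)                        # predicted helpfulness score

# Smoke test on random inputs.
model = HelpfulnessCNN()
tokens = torch.randint(1, 10_000, (4, 120))                   # a batch of 4 reviews, 120 tokens each
context = torch.randn(4, 8)                                   # e.g. product/reviewer metadata
print(model(tokens, context).shape)                           # torch.Size([4])
```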
This list is automatically generated from the titles and abstracts of the papers on this site.