Analytical and Empirical Study of Herding Effects in Recommendation Systems
- URL: http://arxiv.org/abs/2408.10895v1
- Date: Tue, 20 Aug 2024 14:29:23 GMT
- Title: Analytical and Empirical Study of Herding Effects in Recommendation Systems
- Authors: Hong Xie, Mingze Zhong, Defu Lian, Zhen Wang, Enhong Chen
- Abstract summary: We study how to manage product ratings via rating aggregation rules and shortlisted representative reviews.
We show that proper recency-aware rating aggregation rules can improve the speed of convergence on Amazon and TripAdvisor by 41% and 62%, respectively.
- Score: 72.6693986712978
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online rating systems are often used in numerous web or mobile applications, e.g., Amazon and TripAdvisor, to assess the ground-truth quality of products. Due to herding effects, the aggregation of historical ratings (or historical collective opinion) can significantly influence subsequent ratings, leading to misleading and erroneous assessments. We study how to manage product ratings via rating aggregation rules and shortlisted representative reviews, for the purpose of correcting the assessment error. We first develop a mathematical model to characterize important factors of herding effects in product ratings. We then identify sufficient conditions (via the stochastic approximation theory), under which the historical collective opinion converges to the ground-truth collective opinion of the whole user population. These conditions identify a class of rating aggregation rules and review selection mechanisms that can reveal the ground-truth product quality. We also quantify the speed of convergence (via the martingale theory), which reflects the efficiency of rating aggregation rules and review selection mechanisms. We prove that the herding effects slow down the speed of convergence while an accurate review selection mechanism can speed it up. We also study the speed of convergence numerically and reveal trade-offs in selecting rating aggregation rules and review selection mechanisms. To show the utility of our framework, we design a maximum likelihood algorithm to infer model parameters from ratings, and conduct experiments on rating datasets from Amazon and TripAdvisor. We show that proper recency-aware rating aggregation rules can improve the speed of convergence in Amazon and TripAdvisor by 41% and 62%, respectively.
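To make the idea of a recency-aware aggregation rule concrete, the following is a minimal sketch and not the paper's model. It assumes a toy herding process in which each new rating is a convex mix of the user's intrinsic opinion and the currently displayed aggregate. The names `herding`, `decay`, `true_mean`, and `initial`, as well as the exponential weighting itself, are illustrative assumptions that do not come from the abstract.

```python
import random


def simple_average(ratings):
    """Unweighted mean of all historical ratings."""
    return sum(ratings) / len(ratings)


def recency_weighted_average(ratings, decay=0.9):
    """Exponentially recency-weighted mean (an assumed rule, not the paper's).

    The most recent rating has weight 1, the one before it `decay`,
    then `decay**2`, and so on. With decay=1.0 this is the simple average.
    """
    total = 0.0
    weight_sum = 0.0
    for age, rating in enumerate(reversed(ratings)):
        weight = decay ** age
        total += weight * rating
        weight_sum += weight
    return total / weight_sum


def simulate(aggregate, true_mean=4.0, herding=0.5, n=200, seed=0,
             initial=(1.0, 1.0, 1.0)):
    """Toy herding process (hypothetical): returns the final displayed aggregate.

    Each new rating = (1 - herding) * intrinsic opinion + herding * displayed
    aggregate. `initial` holds a few biased early ratings.
    """
    rng = random.Random(seed)
    ratings = list(initial)
    for _ in range(n):
        displayed = aggregate(ratings)
        # Intrinsic opinion: noisy draw around the true quality, clipped to [1, 5].
        intrinsic = min(5.0, max(1.0, rng.gauss(true_mean, 1.0)))
        ratings.append((1.0 - herding) * intrinsic + herding * displayed)
    return aggregate(ratings)


if __name__ == "__main__":
    for name, rule in [("simple", simple_average),
                       ("recency-weighted", recency_weighted_average)]:
        print(name, round(simulate(rule), 3))
```

Running the script prints the final aggregate under each rule, so one can compare how far each remains from `true_mean` after the biased early ratings. Any speed-up seen here is a property of this toy setup only and says nothing about the 41% and 62% figures reported for the real datasets.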
Related papers
- Gaming the Judge: Unfaithful Chain-of-Thought Can Undermine Agent Evaluation [76.5533899503582]
Large language models (LLMs) are increasingly used as judges to evaluate agent performance. We show this paradigm implicitly assumes that the agent's chain-of-thought (CoT) reasoning faithfully reflects both its internal reasoning and the underlying environment state. We demonstrate that manipulated reasoning alone can inflate false positive rates of state-of-the-art VLM judges by up to 90% across 800 trajectories spanning diverse web tasks.
arXiv Detail & Related papers (2026-01-21T06:07:43Z) - Front-Loaded or Balanced? The Mechanism through Which Review Order Affects Overall Ratings in Premium Service Settings [10.304137762077897]
This research reveals the psychological mechanisms through which evaluation order affects consumer ratings via cognitive and affective pathways. Three experiments demonstrate that in high-quality service contexts, a rating-first (vs. review-first) interface significantly elevates consumers' overall ratings.
arXiv Detail & Related papers (2025-11-25T03:12:30Z) - Beyond Means: A Dynamic Framework for Predicting Customer Satisfaction [29.75950401212671]
We demonstrate the value of using the Gaussian process (GP) framework for rating aggregation. Based on 121,123 Yelp ratings, we compare the predictive power of different rating aggregation methods in predicting future ratings. Our findings have important implications for marketing practitioners and customers.
arXiv Detail & Related papers (2025-11-18T18:43:29Z) - Automatic Reviewers Fail to Detect Faulty Reasoning in Research Papers: A New Counterfactual Evaluation Framework [55.078301794183496]
We focus on a core reviewing skill that underpins high-quality peer review: detecting faulty research logic. This involves evaluating the internal consistency between a paper's results, interpretations, and claims. We present a fully automated counterfactual evaluation framework that isolates and tests this skill under controlled conditions.
arXiv Detail & Related papers (2025-08-29T08:48:00Z) - Pairwise or Pointwise? Evaluating Feedback Protocols for Bias in LLM-Based Evaluation [57.380464382910375]
We show that the choice of feedback protocol can significantly affect evaluation reliability and induce systematic biases.
In particular, we show that pairwise evaluation protocols are more vulnerable to distracted evaluation.
arXiv Detail & Related papers (2025-04-20T19:05:59Z) - Mitigating the Participation Bias by Balancing Extreme Ratings [3.5785450878667597]
We consider a robust rating aggregation task under the participation bias.
Our goal is to minimize the expected squared loss between the aggregated ratings and the average of all underlying ratings.
arXiv Detail & Related papers (2025-02-06T02:58:46Z) - Paper Quality Assessment based on Individual Wisdom Metrics from Open Peer Review [4.35783648216893]
Traditional closed peer review systems are slow, costly, non-transparent, and possibly subject to biases. We propose and examine the efficacy and accuracy of an alternative form of scientific peer review: through an open, bottom-up process.
arXiv Detail & Related papers (2025-01-22T17:00:27Z) - Fighting Sampling Bias: A Framework for Training and Evaluating Credit Scoring Models [2.918530881730374]
This paper addresses the adverse effect of sampling bias on model training and evaluation.
We propose bias-aware self-learning and a reject inference framework for scorecard evaluation.
Our results suggest a profit improvement of about eight percent, when using Bayesian evaluation to decide on acceptance rates.
arXiv Detail & Related papers (2024-07-17T20:59:54Z) - Estimating Treatment Effects under Recommender Interference: A Structured Neural Networks Approach [13.208141830901845]
We show that the standard difference-in-means estimator can lead to biased estimates due to recommender interference.
We propose a "recommender choice model" that describes which item gets exposed from a pool containing both treated and control items.
We show that the proposed estimator yields results comparable to the benchmark, whereas the standard difference-in-means estimator can exhibit significant bias and even produce reversed signs.
arXiv Detail & Related papers (2024-06-20T14:53:26Z) - On Faithfulness and Coherence of Language Explanations for Recommendation Systems [8.143715142450876]
This work probes state-of-the-art models and their review generation component.
We show that the generated explanations are brittle and need further evaluation before being taken as literal rationales for the estimated ratings.
arXiv Detail & Related papers (2022-09-12T17:00:31Z) - Tensor-based Collaborative Filtering With Smooth Ratings Scale [0.0]
We introduce the ratings' similarity matrix which represents the dependency between different values of ratings on the population level.
It is possible to improve the quality of proposed recommendations by offsetting the effect of users' ratings being shifted either down or up.
arXiv Detail & Related papers (2022-05-10T17:55:25Z) - Cross Pairwise Ranking for Unbiased Item Recommendation [57.71258289870123]
We develop a new learning paradigm named Cross Pairwise Ranking (CPR).
CPR achieves unbiased recommendation without knowing the exposure mechanism.
We prove theoretically that this approach offsets the influence of user/item propensity on the learning.
arXiv Detail & Related papers (2022-04-26T09:20:27Z) - Spatio-Temporal Graph Representation Learning for Fraudster Group Detection [50.779498955162644]
Companies may hire fraudster groups to write fake reviews to either demote competitors or promote their own businesses.
To detect such groups, a common model is to represent fraudster groups' static networks.
We propose to first capitalize on the effectiveness of the HIN-RNN in both reviewers' representation learning.
arXiv Detail & Related papers (2022-01-07T08:01:38Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation [50.779498955162644]
We propose ScoreGAN for fraud review detection that makes use of both review text and review rating scores in the generation and detection process.
Results show that the proposed framework outperformed the existing state-of-the-art framework, namely FakeGAN, in terms of AP by 7% and 5% on the Yelp and TripAdvisor datasets, respectively.
arXiv Detail & Related papers (2020-06-11T16:15:06Z) - A Unified Dual-view Model for Review Summarization and Sentiment Classification with Inconsistency Loss [51.448615489097236]
Acquiring accurate summarization and sentiment from user reviews is an essential component of modern e-commerce platforms.
We propose a novel dual-view model that jointly improves the performance of these two tasks.
Experiment results on four real-world datasets from different domains demonstrate the effectiveness of our model.
arXiv Detail & Related papers (2020-06-02T13:34:11Z)