Evaluating the Effectiveness of Pre-trained Language Models in
Predicting the Helpfulness of Online Product Reviews
- URL: http://arxiv.org/abs/2302.10199v1
- Date: Sun, 19 Feb 2023 18:22:59 GMT
- Title: Evaluating the Effectiveness of Pre-trained Language Models in
Predicting the Helpfulness of Online Product Reviews
- Authors: Ali Boluki, Javad Pourmostafa Roshan Sharami, Dimitar Shterionov
- Abstract summary: We compare the use of RoBERTa and XLM-R language models to predict the helpfulness of online product reviews.
We employ the Amazon review dataset for our experiments.
- Score: 0.21485350418225244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Businesses and customers can gain valuable information from product reviews.
The sheer number of reviews often necessitates ranking them based on their
potential helpfulness. However, only a few reviews ever receive any helpfulness
votes on online marketplaces. Sorting all reviews based on the few existing
votes can cause helpful reviews to go unnoticed because of the limited
attention span of readers. The problem of review helpfulness prediction is even
more pressing for high review volumes and for newly written reviews or newly
launched products. In this work, we compare the use of RoBERTa and XLM-R language models
to predict the helpfulness of online product reviews. The contributions of our
work in relation to literature include extensively investigating the efficacy
of state-of-the-art language models -- both monolingual and multilingual --
against a robust baseline, taking ranking metrics into account when assessing
these approaches, and assessing multilingual models for the first time. We
employ the Amazon review dataset for our experiments. According to our study on
several product categories, multilingual and monolingual pre-trained language
models outperform the baseline, a random forest with handcrafted features, by
as much as 23% in RMSE. Pre-trained language models reduce the need
for complex text feature engineering. However, our results suggest that
pre-trained multilingual models may not be the best choice when fine-tuning on
only one language. We assess the performance of language models with and without
additional features. Our results show that additional features, such as the
product rating given by the reviewer, can further improve the predictive methods.
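To make the setup concrete, here is a minimal sketch (not the authors' code) of
fine-tuning a pre-trained language model with a one-unit regression head on
review text. The checkpoint names are real Hugging Face models, but the toy
reviews, the helpfulness target in [0, 1], and all hyperparameters are
illustrative assumptions rather than details taken from the paper.

```python
# Hypothetical sketch: fine-tune RoBERTa to regress review helpfulness.
# Swapping "roberta-base" for "xlm-roberta-base" gives the multilingual
# XLM-R variant discussed in the abstract.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

class ReviewDataset(Dataset):
    """Tokenized reviews paired with helpfulness scores (assumed in [0, 1])."""
    def __init__(self, texts, scores, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, max_length=max_len,
                             padding="max_length", return_tensors="pt")
        self.scores = torch.tensor(scores, dtype=torch.float)

    def __len__(self):
        return len(self.scores)

    def __getitem__(self, i):
        return {"input_ids": self.enc["input_ids"][i],
                "attention_mask": self.enc["attention_mask"][i],
                "labels": self.scores[i]}

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
# num_labels=1 turns the classification head into a regressor; the model
# then computes an MSE loss internally when float labels are supplied.
model = AutoModelForSequenceClassification.from_pretrained("roberta-base",
                                                           num_labels=1)

# Toy data standing in for (review text, helpfulness score) pairs.
train = ReviewDataset(["Great battery life, works as advertised.",
                       "Arrived broken."], [0.9, 0.4], tokenizer)
loader = DataLoader(train, batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for batch in loader:
    out = model(**batch)   # out.loss is the MSE against the labels
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Additional features such as the product rating given by the reviewer could, for
example, be appended to the input text or concatenated with the pooled
representation before the regression head; this is one plausible design choice,
not necessarily the mechanism used in the paper.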
Related papers
- Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs [36.30321941154582]
Hercule is a cross-lingual evaluation model that learns to assign scores to responses based on easily available reference answers in English.
This study is the first comprehensive examination of cross-lingual evaluation using LLMs, presenting a scalable and effective approach for multilingual assessment.
arXiv Detail & Related papers (2024-10-17T09:45:32Z) - Take the Hint: Improving Arabic Diacritization with
Partially-Diacritized Text [4.863310073296471]
We propose 2SDiac, a multi-source model that can effectively support optional diacritics in the input to inform all predictions.
We also introduce Guided Learning, a training scheme that leverages given diacritics in the input with different levels of random masking.
arXiv Detail & Related papers (2023-06-06T10:18:17Z) - ReadMe++: Benchmarking Multilingual Language Models for Multi-Domain Readability Assessment [12.704628912075218]
This paper introduces ReadMe++, a multilingual multi-domain dataset with human annotations of 9757 sentences in Arabic, English, French, Hindi, and Russian.
Using ReadMe++, we benchmark multilingual and monolingual language models in the supervised, unsupervised, and few-shot prompting settings.
Our experiments reveal exciting results of superior domain generalization and enhanced cross-lingual transfer capabilities by models trained on ReadMe++.
arXiv Detail & Related papers (2023-05-23T18:37:30Z) - Training Language Models with Language Feedback at Scale [50.70091340506957]
We introduce Imitation learning from Language Feedback (ILF), a new approach that utilizes more informative language feedback.
ILF consists of three steps that are applied iteratively: first, conditioning the language model on the input, an initial LM output, and feedback to generate refinements; then selecting the refinement that best incorporates the feedback; and finally fine-tuning the language model on the chosen refinements.
We show theoretically that ILF can be viewed as Bayesian inference, similar to Reinforcement Learning from Human Feedback (RLHF).
arXiv Detail & Related papers (2023-03-28T17:04:15Z) - mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z) - Training Language Models with Natural Language Feedback [51.36137482891037]
We learn from language feedback on model outputs using a three-step learning algorithm.
In synthetic experiments, we first evaluate whether language models accurately incorporate feedback to produce refinements.
Using only 100 samples of human-written feedback, our learning algorithm fine-tunes a GPT-3 model to roughly human-level summarization.
arXiv Detail & Related papers (2022-04-29T15:06:58Z) - Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
arXiv Detail & Related papers (2021-09-16T03:08:22Z) - Transfer Learning for Mining Feature Requests and Bug Reports from
Tweets and App Store Reviews [4.446419663487345]
Existing approaches fail to detect feature requests and bug reports with high recall and acceptable precision.
We train both monolingual and multilingual BERT models and compare the performance with state-of-the-art methods.
arXiv Detail & Related papers (2021-08-02T06:51:13Z) - Improving Cross-Lingual Reading Comprehension with Self-Training [62.73937175625953]
Current state-of-the-art models even surpass human performance on several benchmarks.
Previous works have revealed the abilities of pre-trained multilingual models for zero-shot cross-lingual reading comprehension.
This paper further utilizes unlabeled data to improve performance.
arXiv Detail & Related papers (2021-05-08T08:04:30Z) - Comparison of Interactive Knowledge Base Spelling Correction Models for
Low-Resource Languages [81.90356787324481]
Spelling normalization for low resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target language data.
Our usage scenario is interactive correction with nearly zero initial training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)