Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains
- URL: http://arxiv.org/abs/2504.06917v1
- Date: Wed, 09 Apr 2025 14:23:54 GMT
- Title: Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains
- Authors: Ming Liu, Massimo Poesio
- Abstract summary: We use large language models to generate datasets to train fake review detectors. Our approach was used to generate fake reviews in different domains (book reviews, restaurant reviews, and hotel reviews) and different languages (English and Chinese). The accuracy of our fake review detection model can be improved by 0.3 percentage points on DeRev TEST, 10.9 percentage points on Amazon TEST, 8.3 percentage points on Yelp TEST, and 7.2 percentage points on DianPing TEST.
- Score: 10.064399146272228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growth of the Internet, buying habits have changed, and customers have become more dependent on the online opinions of other customers to guide their purchases. Identifying fake reviews thus became an important area for Natural Language Processing (NLP) research. However, developing high-performance NLP models depends on the availability of large amounts of training data, which are often not available for low-resource languages or domains. In this research, we used large language models to generate datasets to train fake review detectors. Our approach was used to generate fake reviews in different domains (book reviews, restaurant reviews, and hotel reviews) and different languages (English and Chinese). Our results demonstrate that our data augmentation techniques result in improved performance at fake review detection for all domains and languages. The accuracy of our fake review detection model can be improved by 0.3 percentage points on DeRev TEST, 10.9 percentage points on Amazon TEST, 8.3 percentage points on Yelp TEST and 7.2 percentage points on DianPing TEST using the augmented datasets.
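The augmentation strategy the abstract describes can be illustrated with a minimal, self-contained sketch. The `generate_fake_review` stub below is a hypothetical stand-in for the paper's LLM generator (the actual prompts and models are not specified here); it shows only the overall shape of the pipeline: synthesize fake reviews per domain, label them, and mix them into the genuine training data.

```python
import random

def generate_fake_review(domain, seed_phrases):
    """Stand-in for an LLM generator: compose a synthetic review from
    template fragments. In the paper's setup this call would instead
    prompt a large language model per domain and language."""
    random.shuffle(seed_phrases)
    return f"This {domain} was {seed_phrases[0]} and {seed_phrases[1]}."

def build_augmented_dataset(real_reviews, n_synthetic, domain):
    """Combine genuine labeled reviews with synthetic fakes.
    Labels: 0 = genuine, 1 = fake."""
    data = [(text, 0) for text in real_reviews]
    phrases = ["absolutely perfect", "truly amazing",
               "life-changing", "the best ever"]
    for _ in range(n_synthetic):
        data.append((generate_fake_review(domain, phrases[:]), 1))
    random.shuffle(data)
    return data

dataset = build_augmented_dataset(
    ["Decent plot but slow pacing.", "Well researched and readable."],
    n_synthetic=4, domain="book")
print(len(dataset))  # 6 examples: 2 genuine + 4 synthetic
```

A detector trained on `dataset` would then see both genuine and synthetic fakes, which is the mechanism behind the reported accuracy gains.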
Related papers
- LazyReview A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews [74.87393214734114]
This work introduces LazyReview, a dataset of peer-review sentences annotated with fine-grained lazy thinking categories.
Large Language Models (LLMs) struggle to detect these instances in a zero-shot setting.
Instruction-based fine-tuning on our dataset significantly boosts performance by 10-20 points.
arXiv Detail & Related papers (2025-04-15T10:07:33Z) - Improving Model Factuality with Fine-grained Critique-based Evaluator [47.36934130646514]
We train a factuality evaluator, FenCE, that provides LM generators with claim-level factuality feedback. We present a framework that leverages FenCE to improve the factuality of LM generators by constructing training data. Experiments show that our data augmentation methods improve the evaluator's accuracy by 2.9% on LLM-AggreFact.
arXiv Detail & Related papers (2024-10-24T01:41:02Z) - Enhanced Review Detection and Recognition: A Platform-Agnostic Approach with Application to Online Commerce [0.46040036610482665]
We present a machine learning methodology for review detection and extraction.
We demonstrate that it generalises for use across websites that were not contained in the training data.
This method promises to drive applications for automatic detection and evaluation of reviews, regardless of their source.
arXiv Detail & Related papers (2024-05-09T00:32:22Z) - MAiDE-up: Multilingual Deception Detection of GPT-generated Hotel Reviews [29.174548645439756]
We make publicly available the MAiDE-up dataset, consisting of 10,000 real and 10,000 AI-generated fake hotel reviews.
We conduct extensive linguistic analyses to compare the AI fake hotel reviews to real hotel reviews.
We find that these dimensions influence how well we can detect AI-generated fake reviews.
arXiv Detail & Related papers (2024-04-19T15:08:06Z) - AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media [57.70351255180495]
AiGen-FoodReview is a dataset of 20,144 restaurant review-image pairs divided into authentic and machine-generated.
We explore unimodal and multimodal detection models, achieving 99.80% multimodal accuracy with FLAVA.
The paper open-sources the dataset, releases fake review detectors, recommends their use in unimodal and multimodal fake review detection tasks, and evaluates linguistic and visual features in synthetic versus authentic data.
arXiv Detail & Related papers (2024-01-16T20:57:36Z) - Multilingual and Multi-topical Benchmark of Fine-tuned Language models and Large Language Models for Check-Worthy Claim Detection [1.4779899760345434]
This study compares the performance of (1) fine-tuned language models and (2) large language models on the task of check-worthy claim detection.
We composed a multilingual and multi-topical dataset comprising texts of various sources and styles.
arXiv Detail & Related papers (2023-11-10T15:36:35Z) - Generating Benchmarks for Factuality Evaluation of Language Models [61.69950787311278]
We propose FACTOR: Factual Assessment via Corpus TransfORmation, a scalable approach for evaluating LM factuality.
FACTOR automatically transforms a factual corpus of interest into a benchmark evaluating an LM's propensity to generate true facts from the corpus vs. similar but incorrect statements.
We show that: (i) our benchmark scores increase with model size and improve when the LM is augmented with retrieval; (ii) benchmark score and perplexity do not always agree on model ranking; (iii) when perplexity and benchmark score disagree, the latter better reflects factuality in open-ended generation.
arXiv Detail & Related papers (2023-07-13T17:14:38Z) - Bengali Fake Review Detection using Semi-supervised Generative Adversarial Networks [0.0]
This paper investigates the potential of semi-supervised Generative Adversarial Networks (GANs) to fine-tune pretrained language models.
We have demonstrated that the proposed semi-supervised GAN-LM architecture is a viable solution in classifying Bengali fake reviews.
arXiv Detail & Related papers (2023-04-05T20:40:09Z) - Evaluating the Effectiveness of Pre-trained Language Models in Predicting the Helpfulness of Online Product Reviews [0.21485350418225244]
We compare the use of RoBERTa and XLM-R language models to predict the helpfulness of online product reviews.
We employ the Amazon review dataset for our experiments.
arXiv Detail & Related papers (2023-02-19T18:22:59Z) - Online Fake Review Detection Using Supervised Machine Learning And BERT Model [0.0]
We propose using the BERT (Bidirectional Encoder Representations from Transformers) model to extract word embeddings from texts (i.e., reviews).
The results indicate that the SVM classifiers outperform the others in terms of accuracy and f1-score with an accuracy of 87.81%.
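The embeddings-plus-classifier pipeline this entry describes can be sketched in miniature. The code below substitutes a bag-of-words vectorizer and a perceptron for the paper's BERT embeddings and SVM (neither is reproduced here; the vocabulary and reviews are invented for illustration), but the structure is the same: vectorize reviews, fit a linear classifier, predict.

```python
def vectorize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary; a toy stand-in
    for the BERT embeddings the paper actually uses."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Simple perceptron as a stand-in for the paper's SVM:
    both learn a linear decision boundary w.x + b."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred
            if err:
                w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
                b += lr * err
    return w, b

def predict(w, b, x):
    """Label 1 = fake, 0 = genuine."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

vocab = ["amazing", "best", "perfect", "slow", "boring", "average"]
reviews = [("amazing best perfect", 1), ("best perfect amazing", 1),
           ("slow boring average", 0), ("average slow boring", 0)]
X = [vectorize(t, vocab) for t, _ in reviews]
y = [lbl for _, lbl in reviews]
w, b = train_perceptron(X, y)
print(predict(w, b, vectorize("perfect and amazing", vocab)))  # 1
```

In the paper's setting, `vectorize` would return contextual BERT embeddings and `train_perceptron` would be replaced by an SVM fit, but the train/predict interface is unchanged.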
arXiv Detail & Related papers (2023-01-09T09:40:56Z) - Retrieval-based Disentangled Representation Learning with Natural Language Supervision [61.75109410513864]
We present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.
Our approach employs a bi-encoder model to represent both data and natural language in a vocabulary space, enabling the model to distinguish intrinsic dimensions that capture characteristics within the data through their natural-language counterparts, thus achieving disentanglement.
arXiv Detail & Related papers (2022-12-15T10:20:42Z) - TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose TextFlint, a multilingual robustness evaluation platform for NLP tasks.
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z) - ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation [50.779498955162644]
We propose ScoreGAN for fraud review detection that makes use of both review text and review rating scores in the generation and detection process.
Results show that the proposed framework outperformed the existing state-of-the-art framework, FakeGAN, in terms of AP by 7% on the Yelp dataset and 5% on the TripAdvisor dataset.
arXiv Detail & Related papers (2020-06-11T16:15:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.