MAiDE-up: Multilingual Deception Detection of GPT-generated Hotel Reviews
- URL: http://arxiv.org/abs/2404.12938v2
- Date: Wed, 19 Jun 2024 03:34:42 GMT
- Title: MAiDE-up: Multilingual Deception Detection of GPT-generated Hotel Reviews
- Authors: Oana Ignat, Xiaomeng Xu, Rada Mihalcea
- Abstract summary: We make publicly available the MAiDE-up dataset, consisting of 10,000 real and 10,000 AI-generated fake hotel reviews.
We conduct extensive linguistic analyses to compare the AI fake hotel reviews to real hotel reviews.
We find that these dimensions influence how well we can detect AI-generated fake reviews.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deceptive reviews are becoming increasingly common, especially given the rising performance and prevalence of LLMs. While work to date has addressed the development of models to differentiate between truthful and deceptive human reviews, much less is known about the distinction between real reviews and AI-authored fake reviews. Moreover, most of the research so far has focused primarily on English, with very little work dedicated to other languages. In this paper, we compile and make publicly available the MAiDE-up dataset, consisting of 10,000 real and 10,000 AI-generated fake hotel reviews, balanced across ten languages. Using this dataset, we conduct extensive linguistic analyses to (1) compare the AI fake hotel reviews to real hotel reviews, and (2) identify the factors that influence deception detection model performance. We explore the effectiveness of several models for deception detection in hotel reviews across three main dimensions: sentiment, location, and language. We find that these dimensions influence how well we can detect AI-generated fake reviews.
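As a rough illustration of the deception detection task described above, the sketch below fine-tunes a multilingual encoder to separate real from AI-generated reviews. This is a minimal sketch, not the paper's method: the file name maide_up.csv, the text/label column layout, and the choice of xlm-roberta-base are all assumptions, since the abstract does not specify the dataset's release format or the detection models used.

```python
# Minimal sketch: fine-tune a multilingual encoder as a real-vs-fake review
# classifier. Assumes the data is a CSV with "text" and "label" columns
# (hypothetical layout; the actual MAiDE-up release format may differ).
import pandas as pd
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "xlm-roberta-base"  # multilingual encoder covering many languages

class ReviewDataset(Dataset):
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=256)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

df = pd.read_csv("maide_up.csv")  # columns: text, label (0=real, 1=AI-fake)
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

train = ReviewDataset(df["text"].tolist(), df["label"].tolist(), tokenizer)
args = TrainingArguments(output_dir="detector", num_train_epochs=3,
                         per_device_train_batch_size=16)
Trainer(model=model, args=args, train_dataset=train).train()
```

A per-dimension evaluation (splitting the test set by sentiment, location, or language, as the paper does) would then reveal where detection is easier or harder.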
Related papers
- LazyReview A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews [74.87393214734114]
This work introduces LazyReview, a dataset of peer-review sentences annotated with fine-grained lazy thinking categories.
Large Language Models (LLMs) struggle to detect these instances in a zero-shot setting.
Instruction-based fine-tuning on our dataset significantly boosts performance by 10-20 points.
arXiv Detail & Related papers (2025-04-15T10:07:33Z) - Data Augmentation for Fake Reviews Detection in Multiple Languages and Multiple Domains [10.064399146272228]
We use large language models to generate datasets to train fake review detectors.
Our approach generates fake reviews in different domains (book, restaurant, and hotel reviews) and different languages (English and Chinese).
The accuracy of our fake review detection model improves by 0.3 percentage points on DeRev TEST, 10.9 on Amazon TEST, 8.3 on Yelp TEST, and 7.2 on DianPing TEST.
arXiv Detail & Related papers (2025-04-09T14:23:54Z) - Who Writes the Review, Human or AI? [0.36498648388765503]
This study proposes a methodology to accurately distinguish AI-generated and human-written book reviews.
Our approach utilizes transfer learning, enabling the model to identify generated text across different topics.
The experimental results demonstrate that it is feasible to detect the original source of text, achieving an accuracy rate of 96.86%.
arXiv Detail & Related papers (2024-05-30T17:38:44Z) - ChatGPT vs Gemini vs LLaMA on Multilingual Sentiment Analysis [0.0]
We construct nuanced and ambiguous scenarios, translate them into 10 languages, and predict their associated sentiment using popular LLMs.
The results are validated against post-hoc human responses.
This work provides a standardised methodology for automated sentiment analysis evaluation.
arXiv Detail & Related papers (2024-01-25T23:15:45Z) - AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media [57.70351255180495]
AiGen-FoodReview is a dataset of 20,144 restaurant review-image pairs divided into authentic and machine-generated.
We explore unimodal and multimodal detection models, achieving 99.80% multimodal accuracy with FLAVA.
The paper open-sources the dataset, releases fake review detectors, recommends its use for unimodal and multimodal fake review detection tasks, and evaluates linguistic and visual features in synthetic versus authentic data.
arXiv Detail & Related papers (2024-01-16T20:57:36Z) - UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z) - Cross-Lingual Knowledge Editing in Large Language Models [73.12622532088564]
Knowledge editing has been shown to adapt large language models to new knowledge without retraining from scratch.
The effect of editing in a source language on a different target language remains unknown.
We first collect a large-scale cross-lingual synthetic dataset by translating ZsRE from English to Chinese.
arXiv Detail & Related papers (2023-09-16T11:07:52Z) - Evaluating the Effectiveness of Pre-trained Language Models in Predicting the Helpfulness of Online Product Reviews [0.21485350418225244]
We compare the use of RoBERTa and XLM-R language models to predict the helpfulness of online product reviews.
We employ the Amazon review dataset for our experiments.
arXiv Detail & Related papers (2023-02-19T18:22:59Z) - Combat AI With AI: Counteract Machine-Generated Fake Restaurant Reviews on Social Media [77.34726150561087]
We propose to leverage high-quality elite Yelp reviews to generate fake reviews with an OpenAI GPT review creator.
We apply the detection model to non-elite reviews and identify patterns across several dimensions.
We show that social media platforms are continuously challenged by machine-generated fake reviews.
arXiv Detail & Related papers (2023-02-10T19:40:10Z) - Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z) - Fake or Genuine? Contextualised Text Representation for Fake Review Detection [0.4724825031148411]
This paper proposes a new ensemble model that employs transformer architecture to discover the hidden patterns in a sequence of fake reviews and detect them precisely.
The experimental results using semi-real benchmark datasets showed the superiority of the proposed model over state-of-the-art models.
arXiv Detail & Related papers (2021-12-29T00:54:47Z) - SIFN: A Sentiment-aware Interactive Fusion Network for Review-based Item Recommendation [48.1799451277808]
We propose a Sentiment-aware Interactive Fusion Network (SIFN) for review-based item recommendation.
We first encode user/item reviews via BERT and propose a lightweight sentiment learner to extract semantic features of each review.
Then, we propose a sentiment prediction task that guides the sentiment learner to extract sentiment-aware features via explicit sentiment labels.
arXiv Detail & Related papers (2021-08-18T08:04:38Z) - Fake Reviews Detection through Analysis of Linguistic Features [1.609940380983903]
This paper explores a natural language processing approach to identify fake reviews.
We study 15 linguistic features for distinguishing fake and trustworthy online reviews; a toy sketch of such features appears after this list.
We were able to discriminate fake from real reviews with high accuracy using these linguistic features.
arXiv Detail & Related papers (2020-10-08T21:16:30Z)
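For a flavor of the linguistic-feature approach in the Fake Reviews Detection paper above, the toy sketch below computes three hand-crafted features and fits a logistic-regression detector. The paper studies 15 features, which this summary does not list, so the three below (average sentence length, type-token ratio, first-person pronoun rate) are hypothetical stand-ins of the same kind.

```python
# Illustrative sketch of a linguistic-feature baseline for fake-review
# detection. The three features are hypothetical stand-ins; the cited paper's
# actual 15 features are not given in this summary.
import re
from sklearn.linear_model import LogisticRegression

FIRST_PERSON = {"i", "me", "my", "we", "our", "us"}

def features(review: str) -> list[float]:
    words = re.findall(r"[a-zA-Z']+", review.lower())
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    n = max(len(words), 1)
    return [
        len(words) / max(len(sentences), 1),        # average sentence length
        len(set(words)) / n,                        # type-token ratio
        sum(w in FIRST_PERSON for w in words) / n,  # first-person pronoun rate
    ]

# Toy usage: X holds feature vectors, y holds labels (0=real, 1=fake).
X = [features(r) for r in ["We loved our stay!", "Best hotel ever. Amazing."]]
y = [0, 1]
clf = LogisticRegression().fit(X, y)
print(clf.predict([features("The room was clean and the staff friendly.")]))
```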
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.