Bengali Fake Review Detection using Semi-supervised Generative
Adversarial Networks
- URL: http://arxiv.org/abs/2304.02739v1
- Date: Wed, 5 Apr 2023 20:40:09 GMT
- Title: Bengali Fake Review Detection using Semi-supervised Generative
Adversarial Networks
- Authors: Md. Tanvir Rouf Shawon, G. M. Shahariar, Faisal Muhammad Shah,
Mohammad Shafiul Alam and Md. Shahriar Mahbub
- Abstract summary: This paper investigates the potential of semi-supervised Generative Adversarial Networks (GANs) to fine-tune pretrained language models.
We have demonstrated that the proposed semi-supervised GAN-LM architecture is a viable solution in classifying Bengali fake reviews.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper investigates the potential of semi-supervised Generative
Adversarial Networks (GANs) to fine-tune pretrained language models in order to
classify Bengali reviews as fake or real using only a few annotated samples. With
the rise of social media and e-commerce, the ability to detect fake or
deceptive reviews is becoming increasingly important in order to protect
consumers from being misled by false information. Any machine learning model
will have trouble identifying a fake review, especially for a low-resource
language like Bengali. We have demonstrated that the proposed semi-supervised
GAN-LM architecture (generative adversarial network on top of a pretrained
language model) is a viable solution in classifying Bengali fake reviews as the
experimental results suggest that even with only 1024 annotated samples,
BanglaBERT with semi-supervised GAN (SSGAN) achieved an accuracy of 83.59% and
an F1-score of 84.89%, outperforming other pretrained language models
(BanglaBERT generator, Bangla BERT Base and Bangla-Electra) by almost 3%, 4% and
10%, respectively, in terms of accuracy. The experiments were conducted on a
manually labeled food review dataset consisting of a total of 6014 real and fake
reviews collected from various social media groups. Researchers who struggle
with fake review detection or other classification problems owing to a lack of
labeled data may find a solution in our proposed methodology.
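The abstract does not spell out the training objective, so the following is only a minimal sketch of the discriminator loss commonly used in semi-supervised GANs, assuming a discriminator that outputs K + 1 logits: K = 2 review classes plus one extra class for generator output. The names `K`, `GENERATED`, `supervised_loss`, `unsupervised_loss` and `discriminator_loss` are hypothetical, and the logits stand in for what a classifier head on top of a pretrained language model might produce. It is not the authors' implementation.

```python
import math

K = 2          # assumed review classes: 0 = real review, 1 = fake review
GENERATED = K  # index of the extra class reserved for generator output
EPS = 1e-12    # guards against log(0)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def supervised_loss(labeled):
    """Cross-entropy over the K review classes for (logits, label) pairs."""
    losses = []
    for logits, label in labeled:
        # Condition on "not generated" by normalising over the first K logits.
        probs = softmax(logits[:K])
        losses.append(-math.log(max(probs[label], EPS)))
    return sum(losses) / len(losses)


def unsupervised_loss(real_logits, generated_logits):
    """Penalise calling real text generated, and generated text real."""
    real_terms = [
        -math.log(max(1.0 - softmax(logits)[GENERATED], EPS))
        for logits in real_logits
    ]
    generated_terms = [
        -math.log(max(softmax(logits)[GENERATED], EPS))
        for logits in generated_logits
    ]
    return (sum(real_terms) / len(real_terms)
            + sum(generated_terms) / len(generated_terms))


def discriminator_loss(labeled, unlabeled_logits, generated_logits):
    """Total loss: supervised term plus unsupervised real-vs-generated term."""
    real_logits = [logits for logits, _ in labeled] + unlabeled_logits
    return (supervised_loss(labeled)
            + unsupervised_loss(real_logits, generated_logits))
```

In this formulation the unlabeled reviews contribute only to the real-versus-generated term, which is how a small labeled set (such as the 1024 annotated samples mentioned above) can be supplemented by unlabeled text.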
Related papers
- Breaking the Fake News Barrier: Deep Learning Approaches in Bangla Language [0.0]
This study presents a method that uses deep learning, specifically the Gated Recurrent Unit (GRU), to detect fake news in the Bangla language.
The proposed method includes extensive data preprocessing, covering lemmatization, tokenization, and addressing class imbalance by oversampling.
The performance of the model is evaluated with reliable metrics such as precision, recall, F1 score, and accuracy.
arXiv Detail & Related papers (2025-01-30T21:41:26Z)
- What Matters in Explanations: Towards Explainable Fake Review Detection Focusing on Transformers [45.55363754551388]
Customers' reviews and feedback play a crucial role on e-commerce platforms like Amazon, Zalando, and eBay.
There is a prevailing concern that sellers often post fake or spam reviews to deceive potential customers and manipulate their opinions about a product.
We propose an explainable framework for detecting fake reviews with high precision in identifying fraudulent content with explanations.
arXiv Detail & Related papers (2024-07-24T13:26:02Z)
- Bengali Fake Reviews: A Benchmark Dataset and Detection System [0.0]
This paper introduces the Bengali Fake Review Detection (BFRD) dataset, the first publicly available dataset for identifying fake reviews in Bengali.
The dataset consists of 7710 non-fake and 1339 fake food-related reviews collected from social media posts.
To convert non-Bengali words in a review, a unique pipeline has been proposed that translates English words to their corresponding Bengali meaning and also back transliterates Romanized Bengali to Bengali.
arXiv Detail & Related papers (2023-08-03T18:49:45Z)
- Tackling Fake News in Bengali: Unraveling the Impact of Summarization vs. Augmentation on Pre-trained Language Models [0.0]
We propose a methodology consisting of four distinct approaches to classify fake news articles in Bengali.
Our approach includes translating English news articles and using augmentation techniques to curb the deficit of fake news articles.
We show the effectiveness of summarization and augmentation in the case of Bengali fake news detection.
arXiv Detail & Related papers (2023-07-13T14:50:55Z)
- BanglaBook: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews [1.869097450593631]
We present a large-scale dataset of Bangla book reviews consisting of 158,065 samples classified into three broad categories: positive, negative, and neutral.
We employ a range of machine learning models to establish baselines including SVM, LSTM, and Bangla-BERT.
Our findings demonstrate a substantial performance advantage of pre-trained models over models that rely on manually crafted features.
arXiv Detail & Related papers (2023-05-11T06:27:38Z)
- Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- Combat AI With AI: Counteract Machine-Generated Fake Restaurant Reviews on Social Media [77.34726150561087]
We propose to leverage the high-quality elite Yelp reviews to generate fake reviews from the OpenAI GPT review creator.
We apply the model to predict non-elite reviews and identify the patterns across several dimensions.
We show that social media platforms are continuously challenged by machine-generated fake reviews.
arXiv Detail & Related papers (2023-02-10T19:40:10Z)
- Online Fake Review Detection Using Supervised Machine Learning And BERT Model [0.0]
We propose to use the BERT (Bidirectional Encoder Representations from Transformers) model to extract word embeddings from texts (i.e., reviews).
The results indicate that the SVM classifiers outperform the others in terms of accuracy and f1-score with an accuracy of 87.81%.
arXiv Detail & Related papers (2023-01-09T09:40:56Z)
- Fake or Genuine? Contextualised Text Representation for Fake Review Detection [0.4724825031148411]
This paper proposes a new ensemble model that employs transformer architecture to discover the hidden patterns in a sequence of fake reviews and detect them precisely.
The experimental results using semi-real benchmark datasets showed the superiority of the proposed model over state-of-the-art models.
arXiv Detail & Related papers (2021-12-29T00:54:47Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation [50.779498955162644]
We propose ScoreGAN for fraud review detection that makes use of both review text and review rating scores in the generation and detection process.
Results show that the proposed framework outperformed the existing state-of-the-art framework, namely FakeGAN, in terms of AP by 7% and 5% on the Yelp and TripAdvisor datasets, respectively.
arXiv Detail & Related papers (2020-06-11T16:15:06Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.