Detection of news written by the ChatGPT through authorship attribution
performed by a Bidirectional LSTM model
- URL: http://arxiv.org/abs/2310.16685v1
- Date: Wed, 25 Oct 2023 14:48:58 GMT
- Title: Detection of news written by the ChatGPT through authorship attribution
performed by a Bidirectional LSTM model
- Authors: Amanda Ferrari Iaquinta, Gustavo Voltani von Atzingen
- Abstract summary: This research centers on a particular situation: when ChatGPT is used to produce news that will be consumed by the population.
It aims to build an artificial intelligence model capable of performing authorship attribution on news articles, identifying the ones written by ChatGPT.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The large language model based chatbot ChatGPT has gained a lot of
popularity since its launch and has been used in a wide range of situations.
This research centers on a particular situation: when ChatGPT is used to
produce news that will be consumed by the population, facilitating the
production of fake news, the spread of misinformation and a loss of trust in
news sources. Aware of these problems, this research aims to build an
artificial intelligence model capable of performing authorship attribution on
news articles, identifying the ones written by ChatGPT. To achieve this goal,
a dataset containing equal amounts of human-written and ChatGPT-written news
was assembled, and different natural language processing techniques were used
to extract features from it that were used to train, validate and test three
models built with different techniques. The best performance was produced by
the Bidirectional Long Short Term Memory (LSTM) Neural Network model,
achieving 91.57% accuracy when tested against the data from the testing set.
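The abstract does not include the authors' implementation, but the pipeline it describes (a balanced human/ChatGPT corpus, feature extraction, and a Bidirectional LSTM classifier) can be sketched as follows. This is a minimal illustration under assumed hyperparameters, column names, and file path (news_dataset.csv, text, label), not the paper's actual code.

```python
# Hypothetical sketch of the pipeline described in the abstract:
# tokenize news articles, pad the sequences, and train a Bidirectional
# LSTM binary classifier (human-written vs. ChatGPT-written).
import tensorflow as tf
import pandas as pd
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split

# Assumed dataset layout: one text column and a binary label
# (0 = human-written, 1 = ChatGPT-written). File name is illustrative.
df = pd.read_csv("news_dataset.csv")
texts, labels = df["text"].astype(str), df["label"].values

VOCAB_SIZE, MAX_LEN = 20_000, 500  # assumed vocabulary and sequence limits

# Feature extraction: integer-encode each article and pad to a fixed length.
tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=VOCAB_SIZE,
                                                  oov_token="<unk>")
tokenizer.fit_on_texts(texts)
X = tf.keras.preprocessing.sequence.pad_sequences(
    tokenizer.texts_to_sequences(texts), maxlen=MAX_LEN)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, stratify=labels, random_state=42)

# Bidirectional LSTM classifier with a sigmoid output for the binary task.
model = models.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(X_train, y_train, validation_split=0.1, epochs=5, batch_size=32)

# Accuracy on the held-out test split, analogous to the 91.57% the paper
# reports for its own test set.
loss, acc = model.evaluate(X_test, y_test)
print(f"test accuracy: {acc:.4f}")
```

The sigmoid output treats authorship attribution as binary classification; the 91.57% figure in the abstract corresponds to evaluation on the held-out test split in the authors' own setup.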
Related papers
- FakeGPT: Fake News Generation, Explanation and Detection of Large Language Models [18.543917359268345]
ChatGPT has gained significant attention due to its exceptional natural language processing capabilities.
We employ four prompt methods to generate fake news samples and prove the high quality of these samples through both self-assessment and human evaluation.
We examine ChatGPT's capacity to identify fake news and propose a reason-aware prompt method to improve its performance.
arXiv Detail & Related papers (2023-10-08T07:01:07Z) - Tackling Fake News in Bengali: Unraveling the Impact of Summarization vs. Augmentation on Pre-trained Language Models [0.0]
We propose a methodology consisting of four distinct approaches to classify fake news articles in Bengali.
Our approach includes translating English news articles and using augmentation techniques to curb the deficit of fake news articles.
We show the effectiveness of summarization and augmentation in the case of Bengali fake news detection.
arXiv Detail & Related papers (2023-07-13T14:50:55Z) - Implementing BERT and fine-tuned RobertA to detect AI generated news by
ChatGPT [0.7130985926640657]
This study shows that neural networks can be used to identify bogus AI-generated news created by ChatGPT.
The excellent performance of the RoBERTa and BERT models indicates that these models can play a critical role in the fight against misinformation.
arXiv Detail & Related papers (2023-06-09T17:53:19Z) - Enhancing Chat Language Models by Scaling High-quality Instructional
Conversations [91.98516412612739]
We first provide a systematically designed, diverse, informative, large-scale dataset of instructional conversations, UltraChat.
Our objective is to capture the breadth of interactions that a human might have with an AI assistant.
We fine-tune a LLaMA model to create a powerful conversational model, UltraLLaMA.
arXiv Detail & Related papers (2023-05-23T16:49:14Z) - GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content [27.901155229342375]
We present a novel approach for detecting ChatGPT-generated vs. human-written text using language models.
Our models achieved remarkable results, with an accuracy of over 97% on the test dataset, as evaluated through various metrics.
arXiv Detail & Related papers (2023-05-13T17:12:11Z) - ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large
Language Models in Multilingual Learning [70.57126720079971]
Large language models (LLMs) have emerged as the most important breakthroughs in natural language processing (NLP).
This paper evaluates ChatGPT on 7 different tasks, covering 37 diverse languages with high, medium, low, and extremely low resources.
Our extensive experimental results demonstrate that ChatGPT performs worse than previous models on different NLP tasks and languages.
arXiv Detail & Related papers (2023-04-12T05:08:52Z) - To ChatGPT, or not to ChatGPT: That is the question! [78.407861566006]
This study provides a comprehensive and contemporary assessment of the most recent techniques in ChatGPT detection.
We have curated a benchmark dataset consisting of prompts from ChatGPT and humans, including diverse questions from medical, open Q&A, and finance domains.
Our evaluation results demonstrate that none of the existing methods can effectively detect ChatGPT-generated content.
arXiv Detail & Related papers (2023-04-04T03:04:28Z) - How would Stance Detection Techniques Evolve after the Launch of ChatGPT? [5.756359016880821]
A new pre-trained language model, ChatGPT, was launched on Nov 30, 2022.
ChatGPT can achieve SOTA or similar performance for commonly used datasets including SemEval-2016 and P-Stance.
ChatGPT has the potential to be the best AI model for stance detection tasks in NLP.
arXiv Detail & Related papers (2022-12-30T05:03:15Z) - Faking Fake News for Real Fake News Detection: Propaganda-loaded
Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z) - Artificial Text Detection via Examining the Topology of Attention Maps [58.46367297712477]
We propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA).
We empirically show that the features derived from the BERT model outperform count- and neural-based baselines up to 10% on three common datasets.
The probing analysis of the features reveals their sensitivity to the surface and syntactic properties.
arXiv Detail & Related papers (2021-09-10T12:13:45Z) - InfoBERT: Improving Robustness of Language Models from An Information
Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)