Stylometric Detection of AI-Generated Text in Twitter Timelines
- URL: http://arxiv.org/abs/2303.03697v1
- Date: Tue, 7 Mar 2023 07:26:09 GMT
- Title: Stylometric Detection of AI-Generated Text in Twitter Timelines
- Authors: Tharindu Kumarage, Joshua Garland, Amrita Bhattacharjee, Kirill
Trapeznikov, Scott Ruston, Huan Liu
- Abstract summary: Social media platforms like Twitter are highly susceptible to AI-generated misinformation.
A potential threat scenario is an adversary hijacking a credible user account and using a natural language generator to produce misinformation.
We present a novel algorithm that uses stylometric signals to aid in detecting AI-generated tweets.
- Score: 17.62006063931326
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in pre-trained language models have enabled convenient
methods for generating human-like text at a large scale. Though these
generation capabilities hold great potential for breakthrough applications, they
can also serve as a tool for an adversary to generate misinformation. In particular,
social media platforms like Twitter are highly susceptible to AI-generated
misinformation. A potential threat scenario is when an adversary hijacks a
credible user account and uses a natural language generator to produce
misinformation. Such threats necessitate automated detectors for AI-generated
tweets in a given user's Twitter timeline. However, tweets are inherently
short, thus making it difficult for current state-of-the-art pre-trained
language model-based detectors to accurately detect at what point the AI starts
to generate tweets in a given Twitter timeline. In this paper, we present a
novel algorithm that uses stylometric signals to aid in detecting AI-generated tweets.
We propose models that quantify stylistic changes between human and AI tweets
for two related tasks: Task 1 - discriminate between human and
AI-generated tweets, and Task 2 - detect if and when an AI starts to generate
tweets in a given Twitter timeline. Our extensive experiments demonstrate that
stylometric features are effective in augmenting state-of-the-art
AI-generated text detectors.
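To make the approach concrete, below is a minimal sketch, assuming a handful of common hand-crafted stylometric features, an AI-probability from any off-the-shelf detector, and a brute-force change-point scan over a timeline; the feature set, fusion weights, and change-point criterion are illustrative placeholders rather than the paper's actual method.
```python
# Illustrative sketch (not the paper's method): simple per-tweet stylometric
# features, late fusion with any detector's AI-probability, and a brute-force
# change-point scan over a user's timeline.
import math
import re
import statistics
from typing import List

def stylometric_features(tweet: str) -> List[float]:
    """A few hand-crafted stylistic signals; the feature set is an assumption."""
    words = tweet.split()
    n_words = max(len(words), 1)
    n_chars = max(len(tweet), 1)
    return [
        len(tweet) / 280.0,                                  # normalized tweet length
        sum(len(w) for w in words) / n_words,                # average word length
        len({w.lower() for w in words}) / n_words,           # type-token ratio
        sum(c in ".,!?;:" for c in tweet) / n_chars,         # punctuation density
        sum(c.isupper() for c in tweet) / n_chars,           # uppercase density
        float(len(re.findall(r"[#@]\w+", tweet))),           # hashtags + mentions
    ]

def fused_score(detector_prob: float, feats: List[float],
                weights: List[float], bias: float = 0.0) -> float:
    """Blend a detector's AI-probability with a linear stylometric score.
    The weights would normally be learned; here they are placeholders."""
    stylo = 1.0 / (1.0 + math.exp(-(bias + sum(w * f for w, f in zip(weights, feats)))))
    return 0.5 * detector_prob + 0.5 * stylo

def change_point(scores: List[float]) -> int:
    """Index where the mean per-tweet score shifts the most, i.e. a guess at
    when AI-generated tweets begin (-1 if the timeline is too short)."""
    best_k, best_gap = -1, 0.0
    for k in range(1, len(scores)):
        gap = abs(statistics.mean(scores[k:]) - statistics.mean(scores[:k]))
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k
```
In practice the fusion weights would be learned from labeled timelines, and the simple mean-shift scan could be replaced by a proper statistical change-point test.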
Related papers
- AI incidents and 'networked trouble': The case for a research agenda [0.0]
I argue for a research agenda focused on AI incidents and how they are constructed in online environments.
I take up the example of an AI incident from September 2020, when a Twitter user created a 'horrible experiment' to demonstrate the racist bias of Twitter's algorithm for cropping images.
I argue that AI incidents like this are a significant means of participating in AI systems and require further research.
arXiv Detail & Related papers (2024-01-07T11:23:13Z)
- AI Content Self-Detection for Transformer-based Large Language Models [0.0]
This paper introduces the idea of direct origin detection and evaluates whether generative AI systems can recognize their output and distinguish it from human-written texts.
Google's Bard model exhibits the strongest self-detection capability, with an accuracy of 94%, followed by OpenAI's ChatGPT at 83%.
arXiv Detail & Related papers (2023-12-28T10:08:57Z)
- Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey [97.33926242130732]
Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses.
Despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs.
To address these concerns, the consensus in the research community is to develop algorithmic solutions for detecting AI-generated text.
arXiv Detail & Related papers (2023-10-23T18:11:32Z)
- BotArtist: Generic approach for bot detection in Twitter via semi-automatic machine learning pipeline [47.61306219245444]
Twitter has become a target for bots and fake accounts, resulting in the spread of false information and manipulation.
This paper introduces a semi-automatic machine learning pipeline (SAMLP) designed to address the challenges associated with machine learning model development.
We develop a comprehensive bot detection model named BotArtist, based on user profile features.
arXiv Detail & Related papers (2023-05-31T09:12:35Z)
- Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense based on retrieving semantically similar generations, which must be maintained by the language model's API provider (a toy sketch of the idea follows this entry).
arXiv Detail & Related papers (2023-03-23T16:29:27Z)
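A toy sketch of that retrieval defense, under simplifying assumptions: the provider stores every generation, embeds it with a crude bag-of-words vector standing in for a real semantic encoder, and flags a candidate text whose nearest stored generation exceeds a similarity threshold (the 0.8 cutoff is an arbitrary choice).
```python
# Toy sketch of a retrieval-based defense: the API provider keeps a corpus of
# its own generations and checks whether a candidate text closely matches one.
# The bag-of-words embedding and the 0.8 threshold are illustrative choices.
import math
from collections import Counter
from typing import List

def embed(text: str) -> Counter:
    """Lowercased bag-of-words vector (stand-in for a semantic embedding)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class GenerationIndex:
    """Maintained by the provider: stores embeddings of all emitted generations."""
    def __init__(self) -> None:
        self.vectors: List[Counter] = []

    def add(self, generation: str) -> None:
        self.vectors.append(embed(generation))

    def looks_generated(self, candidate: str, threshold: float = 0.8) -> bool:
        cand = embed(candidate)
        return any(cosine(cand, v) >= threshold for v in self.vectors)

# Usage: a paraphrased copy of a stored generation should still score high.
index = GenerationIndex()
index.add("The city council approved the new transit plan on Tuesday evening.")
print(index.looks_generated("The new transit plan was approved by the city council Tuesday evening."))
```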
- Can AI-Generated Text be Reliably Detected? [54.670136179857344]
Unregulated use of LLMs can lead to malicious consequences such as plagiarism, fake news generation, and spamming.
Recent works attempt to tackle this problem either by using model signatures present in the generated text or by applying watermarking techniques.
In this paper, we show that these detectors are not reliable in practical scenarios.
arXiv Detail & Related papers (2023-03-17T17:53:19Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted, using the Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) to explain the ML model's predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- BotSpot: Deep Learning Classification of Bot Accounts within Twitter [2.099922236065961]
Twitter's openness allows programs to create and control Twitter accounts automatically via the Twitter API.
These accounts, which are known as bots, can automatically perform actions such as tweeting, re-tweeting, following, unfollowing, or direct messaging other accounts.
We introduce a novel bot detection approach using deep learning, based on a Multi-layer Perceptron neural network and nine features of a bot account.
arXiv Detail & Related papers (2021-09-08T15:17:10Z)
- The Threat of Offensive AI to Organizations [52.011307264694665]
This survey explores the threat of offensive AI on organizations.
First, we discuss how AI changes the adversary's methods, strategies, goals, and overall attack model.
Then, through a literature review, we identify 33 offensive AI capabilities which adversaries can use to enhance their attacks.
arXiv Detail & Related papers (2021-06-30T01:03:28Z)
- TweepFake: about Detecting Deepfake Tweets [3.3482093430607254]
Deep neural models can generate coherent, non-trivial and human-like text samples.
Social bots can write plausible deepfake messages, hoping to contaminate public debate.
We collect the first dataset of real deepfake tweets, TweepFake.
arXiv Detail & Related papers (2020-07-31T19:01:13Z)
- Twitter Bot Detection Using Bidirectional Long Short-term Memory Neural Networks and Word Embeddings [6.09170287691728]
This paper develops a recurrent neural model with word embeddings to distinguish Twitter bots from human accounts; a minimal architecture sketch follows this entry.
Experiments show that our approach can achieve competitive performance compared with existing state-of-the-art bot detection systems.
arXiv Detail & Related papers (2020-02-03T17:07:03Z)
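For illustration, here is a minimal PyTorch sketch of the kind of BiLSTM-over-word-embeddings classifier the last entry describes; the vocabulary size, embedding and hidden dimensions, and final-state pooling are assumptions rather than the paper's reported configuration.
```python
# Minimal sketch of a BiLSTM-over-word-embeddings tweet classifier, in the
# spirit of the entry above; vocabulary handling, dimensions, and pooling are
# illustrative assumptions rather than the paper's configuration.
import torch
import torch.nn as nn

class BiLSTMBotClassifier(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 100, hidden_dim: int = 64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, 1)  # bot-vs-human logit

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embedding(token_ids)                  # (batch, seq, embed)
        _, (hidden, _) = self.lstm(embedded)                  # hidden: (2, batch, hidden)
        pooled = torch.cat([hidden[0], hidden[1]], dim=-1)    # concat both directions
        return self.classifier(pooled).squeeze(-1)            # (batch,) logits

# Usage with dummy token ids: a batch of 8 tweets, 30 tokens each.
model = BiLSTMBotClassifier(vocab_size=20000)
logits = model(torch.randint(1, 20000, (8, 30)))
```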