BotSSCL: Social Bot Detection with Self-Supervised Contrastive Learning
- URL: http://arxiv.org/abs/2402.03740v1
- Date: Tue, 6 Feb 2024 06:13:13 GMT
- Title: BotSSCL: Social Bot Detection with Self-Supervised Contrastive Learning
- Authors: Mohammad Majid Akhtar, Navid Shadman Bhuiyan, Rahat Masood, Muhammad
Ikram, Salil S. Kanhere
- Abstract summary: We propose a novel framework for social Bot detection with Self-Supervised Contrastive Learning (BotSSCL)
BotSSCL uses contrastive learning to distinguish between social bots and humans in the embedding space to improve linear separability.
We demonstrate BotSSCL's robustness against adversarial attempts to manipulate bot accounts to evade detection.
- Score: 6.317191658158437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The detection of automated accounts, also known as "social bots", has been an
increasingly important concern for online social networks (OSNs). While several
methods have been proposed for detecting social bots, significant research gaps
remain. First, current models exhibit limitations in detecting sophisticated
bots that aim to mimic genuine OSN users. Second, these methods often rely on
simplistic profile features, which are susceptible to manipulation. In addition
to their vulnerability to adversarial manipulations, these models lack
generalizability, resulting in subpar performance when trained on one dataset
and tested on another.
To address these challenges, we propose a novel framework for social Bot
detection with Self-Supervised Contrastive Learning (BotSSCL). Our framework
leverages contrastive learning to distinguish between social bots and humans in
the embedding space to improve linear separability. The high-level
representations derived by BotSSCL enhance its resilience to variations in data
distribution and ensure generalizability. We evaluate BotSSCL's robustness
against adversarial attempts to manipulate bot accounts to evade detection.
Experiments on two datasets featuring sophisticated bots demonstrate that
BotSSCL outperforms other supervised, unsupervised, and self-supervised
baseline methods: it achieves approximately 6% and 8% higher F1 than the state
of the art (SOTA) on the two datasets. In addition, BotSSCL achieves 67% F1
when trained on one dataset and tested on another, demonstrating its
generalizability. Lastly, BotSSCL increases adversarial complexity, allowing
the adversary only a 4% success rate in evading detection.
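Below is a minimal, illustrative sketch of the kind of self-supervised contrastive pre-training the abstract describes, assuming a SimCLR-style setup: an MLP encoder over tabular account features, stochastic feature masking and noise to create two augmented views, and an NT-Xent loss. The encoder, augmentations, and hyperparameters are assumptions for illustration, not the paper's exact BotSSCL design.

```python
# A SimCLR-style contrastive pre-training sketch for tabular account features.
# All architectural and augmentation choices are illustrative assumptions,
# not the exact BotSSCL design from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    """Small MLP that maps an account-feature vector to a unit-norm embedding."""

    def __init__(self, in_dim: int, emb_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, emb_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)


def augment(x, mask_prob=0.2, noise_std=0.05):
    """Stochastic 'view' of a feature vector: random feature masking plus
    Gaussian noise (an assumed augmentation for tabular data)."""
    mask = (torch.rand_like(x) > mask_prob).float()
    return x * mask + noise_std * torch.randn_like(x)


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent contrastive loss: each view's positive is the other view of the
    same account; every other embedding in the batch acts as a negative."""
    z = torch.cat([z1, z2], dim=0)                        # (2N, d)
    sim = z @ z.t() / temperature                         # cosine similarities
    sim = sim.masked_fill(torch.eye(z.size(0), dtype=torch.bool), float("-inf"))
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)


# Toy pre-training loop on random placeholder "account features".
features = torch.randn(256, 32)           # 256 accounts, 32 engineered features
encoder = Encoder(in_dim=32)
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
for step in range(10):
    batch = features[torch.randint(0, features.size(0), (64,))]
    z1, z2 = encoder(augment(batch)), encoder(augment(batch))
    loss = nt_xent_loss(z1, z2)
    opt.zero_grad()
    loss.backward()
    opt.step()
# A simple linear classifier fit on the frozen embeddings would then separate
# bots from humans, which is the "linear separability" the abstract targets.
```

In this kind of setup, no labels are used during pre-training; the bot/human labels are only needed afterwards for the lightweight linear probe.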
Related papers
- Incremental Self-training for Semi-supervised Learning [56.57057576885672]
IST is simple yet effective and fits existing self-training-based semi-supervised learning methods.
We verify the proposed IST on five datasets and two types of backbone, effectively improving the recognition accuracy and learning speed.
arXiv Detail & Related papers (2024-04-14T05:02:00Z) - What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection [48.572932773403274]
We investigate the opportunities and risks of large language models in social bot detection.
We propose a mixture-of-heterogeneous-experts framework to divide and conquer diverse user information modalities.
Experiments show that instruction tuning on 1,000 annotated examples produces specialized LLMs that outperform state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-01T06:21:19Z) - My Brother Helps Me: Node Injection Based Adversarial Attack on Social Bot Detection [69.99192868521564]
Social platforms such as Twitter are under siege from a multitude of fraudulent users.
Due to the structure of social networks, the majority of methods are based on graph neural networks (GNNs), which are susceptible to attacks.
We propose a node injection-based adversarial attack method designed to deceive bot detection models.
arXiv Detail & Related papers (2023-10-11T03:09:48Z) - BotTriNet: A Unified and Efficient Embedding for Social Bots Detection
via Metric Learning [3.9026461169566673]
We propose BOTTRINET, a unified embedding framework that leverages the textual content posted by accounts to detect bots.
The BOTTRINET framework produces word, sentence, and account embeddings, which we evaluate on a real-world dataset.
Our approach achieves state-of-the-art performance on two content-intensive bot datasets, with an average accuracy of 98.34% and an F1-score of 97.99%.
arXiv Detail & Related papers (2023-04-06T15:28:58Z) - BotShape: A Novel Social Bots Detection Approach via Behavioral Patterns [4.386183132284449]
Based on a real-world data set, we construct behavioral sequences from raw event logs.
We observe differences between bots and genuine users and similar patterns among bot accounts.
We present BotShape, a novel social bot detection system that automatically captures behavioral sequences and characteristics.
arXiv Detail & Related papers (2023-03-17T19:03:06Z) - Simplistic Collection and Labeling Practices Limit the Utility of
Benchmark Datasets for Twitter Bot Detection [3.8428576920007083]
We show that high performance is attributable to limitations in dataset collection and labeling rather than sophistication of the tools.
Our findings have important implications for both transparency in sampling and labeling procedures and potential biases in research.
arXiv Detail & Related papers (2023-01-17T17:05:55Z) - BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot
Detection [63.447493500066045]
This work proposes a data-driven learning model for the synthesis of keystroke biometric data.
The proposed method is compared with two statistical approaches based on Universal and User-dependent models.
Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects.
arXiv Detail & Related papers (2022-07-27T09:26:15Z) - Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z) - Detection of Novel Social Bots by Ensembles of Specialized Classifiers [60.63582690037839]
Malicious actors create inauthentic social media accounts controlled in part by algorithms, known as social bots, to disseminate misinformation and agitate online discussion.
We show that different types of bots are characterized by different behavioral features.
We propose a new supervised learning method that trains classifiers specialized for each class of bots and combines their decisions through the maximum rule.
arXiv Detail & Related papers (2020-06-11T22:59:59Z)