FP-Inconsistent: Detecting Evasive Bots using Browser Fingerprint Inconsistencies
- URL: http://arxiv.org/abs/2406.07647v2
- Date: Fri, 31 Jan 2025 04:12:57 GMT
- Title: FP-Inconsistent: Detecting Evasive Bots using Browser Fingerprint Inconsistencies
- Authors: Hari Venugopalan, Shaoor Munir, Shuaib Ahmed, Tangbaihe Wang, Samuel T. King, Zubair Shafiq
- Abstract summary: We conduct the first large-scale evaluation of evasive bots to investigate whether and how altering fingerprints helps bots evade detection.
We find average evasion rates of 52.93% against DataDome and 44.56% against BotD.
Given that evasive bots seem to have difficulty ensuring consistency in their fingerprint attributes, we propose a data-driven approach to discover rules that detect such inconsistencies.
- Score: 13.105329613926623
- License:
- Abstract: As browser fingerprinting is increasingly being used for bot detection, bots have started altering their fingerprints for evasion. We conduct the first large-scale evaluation of evasive bots to investigate whether and how altering fingerprints helps bots evade detection. To systematically investigate evasive bots, we deploy a honey site incorporating two anti-bot services (DataDome and BotD) and solicit bot traffic from 20 different bot services that purport to sell "realistic and undetectable traffic". Across half a million requests from 20 different bot services on our honey site, we find average evasion rates of 52.93% against DataDome and 44.56% against BotD. Our comparison of fingerprint attributes from bot services that evade each anti-bot service individually, as well as bot services that evade both, shows that bot services indeed alter different browser fingerprint attributes for evasion. Further, our analysis reveals the presence of inconsistent fingerprint attributes in evasive bots. Given that evasive bots seem to have difficulty ensuring consistency in their fingerprint attributes, we propose a data-driven approach to discover rules that detect such inconsistencies across space (two attributes in a given browser fingerprint) and time (a single attribute at two different points in time). These rules, which can be readily deployed by anti-bot services, reduce the evasion rate of evasive bots against DataDome and BotD by 48.11% and 44.95% respectively.
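The paper's spatial and temporal consistency rules are discovered from data; as a minimal hand-written sketch of what such checks might look like (the attribute names and specific rules below are illustrative assumptions, not the paper's learned rules):

```python
# Illustrative sketch, not the paper's discovered rules: a spatial check
# compares two attributes within one fingerprint; a temporal check compares
# one attribute across two fingerprints from the same session.

def spatial_inconsistent(fp: dict) -> bool:
    """Flag a fingerprint whose attributes contradict each other."""
    ua = fp.get("userAgent", "")
    platform = fp.get("platform", "")
    # A UA that claims Windows should not report a Linux navigator.platform.
    if "Windows" in ua and platform.startswith("Linux"):
        return True
    # Headless browsers often report zero plugins while the UA claims a
    # full desktop Chrome build (hypothetical example rule).
    if "Chrome" in ua and "Mobile" not in ua and fp.get("pluginCount", 0) == 0:
        return True
    return False

def temporal_inconsistent(fp_t0: dict, fp_t1: dict) -> bool:
    """Flag a session whose stable attributes change between requests."""
    # Attributes such as the WebGL renderer should not change mid-session.
    stable = ("platform", "webglRenderer", "hardwareConcurrency")
    return any(fp_t0.get(k) != fp_t1.get(k) for k in stable)
```

In this sketch a request is flagged as evasive if either check fires; the paper's approach instead mines such rules automatically from labeled bot traffic.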
Related papers
- Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards [93.16294577018482]
Arena, the most popular benchmark of this type, ranks models by asking users to select the better response between two randomly selected models.
We show that an attacker can alter the leaderboard (to promote their favorite model or demote competitors) at the cost of roughly a thousand votes.
Our attack consists of two steps: first, we show how an attacker can determine which model was used to generate a given reply with more than 95% accuracy; and then, the attacker can use this information to consistently vote against a target model.
arXiv Detail & Related papers (2025-01-13T17:12:38Z)
- What is a Social Media Bot? A Global Comparison of Bot and Human Characteristics [5.494111035517598]
Bots tend to use linguistic cues that can be easily automated while humans use cues that require dialogue understanding.
These conclusions are based on a large-scale analysis of tweets from 200 million users across 7 events.
arXiv Detail & Related papers (2025-01-01T14:45:43Z)
- BOTracle: A framework for Discriminating Bots and Humans [5.3248028128815434]
Bots constitute a significant portion of Internet traffic and are a source of various issues across multiple domains.
We address the challenge of bot detection in high-traffic scenarios by evaluating three distinct detection methods.
Our performance metrics, including precision, recall, and AUC, reach 98 percent or higher, surpassing Botcha.
arXiv Detail & Related papers (2024-12-03T08:38:30Z)
- Unmasking Social Bots: How Confident Are We? [41.94295877935867]
We propose to address both bot detection and the quantification of uncertainty at the account level.
This dual focus is crucial as it allows us to leverage additional information related to the quantified uncertainty of each prediction.
Specifically, our approach facilitates targeted interventions for bots when predictions are made with high confidence and suggests caution (e.g., gathering more data) when predictions are uncertain.
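The triage logic this summary describes can be sketched as a simple decision policy (the function name, thresholds, and action labels are illustrative assumptions, not taken from the paper):

```python
# Hypothetical sketch of confidence-aware triage: act on high-confidence
# bot predictions, defer (gather more data) when uncertainty is high.

def triage(bot_probability: float, uncertainty: float,
           prob_threshold: float = 0.9, uncert_threshold: float = 0.2) -> str:
    """Map a prediction and its quantified uncertainty to an action."""
    if uncertainty > uncert_threshold:
        return "gather_more_data"   # prediction too uncertain to act on
    if bot_probability >= prob_threshold:
        return "intervene"          # confident bot: challenge or suspend
    if bot_probability <= 1 - prob_threshold:
        return "no_action"          # confident human
    return "monitor"                # certain model, borderline score
```

The point of the dual focus is visible in the first branch: a score alone would trigger an action, but high uncertainty overrides it.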
arXiv Detail & Related papers (2024-07-18T22:33:52Z)
- My Brother Helps Me: Node Injection Based Adversarial Attack on Social Bot Detection [69.99192868521564]
Social platforms such as Twitter are under siege from a multitude of fraudulent users.
Due to the structure of social networks, the majority of methods are based on graph neural networks (GNNs), which are susceptible to attacks.
We propose a node injection-based adversarial attack method designed to deceive bot detection models.
arXiv Detail & Related papers (2023-10-11T03:09:48Z)
- BotArtist: Generic approach for bot detection in Twitter via semi-automatic machine learning pipeline [47.61306219245444]
Twitter has become a target for bots and fake accounts, resulting in the spread of false information and manipulation.
This paper introduces a semi-automatic machine learning pipeline (SAMLP) designed to address the challenges associated with machine learning model development.
We develop a comprehensive bot detection model named BotArtist, based on user profile features.
arXiv Detail & Related papers (2023-05-31T09:12:35Z)
- You are a Bot! -- Studying the Development of Bot Accusations on Twitter [1.7626250599622473]
In the absence of ground truth data, researchers may want to tap into the wisdom of the crowd.
Our research presents the first large-scale study of bot accusations on Twitter.
It shows how the term bot became an instrument of dehumanization in social media conversations.
arXiv Detail & Related papers (2023-02-01T16:09:11Z)
- Should we agree to disagree about Twitter's bot problem? [1.6317061277457]
We argue how assumptions on bot-likely behavior, the detection approach, and the population inspected can affect the estimation of the percentage of bots on Twitter.
We emphasize the responsibility of platforms to be vigilant, transparent, and unbiased in dealing with threats that may affect their users.
arXiv Detail & Related papers (2022-09-20T21:27:25Z)
- Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study [72.61531092316092]
This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data.
A supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm.
Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions.
arXiv Detail & Related papers (2021-12-08T14:12:24Z)
- Detection of Novel Social Bots by Ensembles of Specialized Classifiers [60.63582690037839]
Malicious actors create inauthentic social media accounts controlled in part by algorithms, known as social bots, to disseminate misinformation and agitate online discussion.
We show that different types of bots are characterized by different behavioral features.
We propose a new supervised learning method that trains classifiers specialized for each class of bots and combines their decisions through the maximum rule.
arXiv Detail & Related papers (2020-06-11T22:59:59Z)
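The maximum-rule combination described in the last entry above can be sketched in a few lines (the specialist classifiers here are stand-in callables; the paper trains one classifier per bot class):

```python
# Minimal sketch of a maximum-rule ensemble (assumed structure): each
# specialist outputs P(bot) for the bot class it was trained on, and the
# ensemble's bot score is the maximum across specialists.

def ensemble_bot_score(features, specialists) -> float:
    """specialists: iterable of callables, each mapping features -> P(bot)."""
    return max(clf(features) for clf in specialists)
```

The max rule fires whenever any single specialist is confident, which suits a setting where an account only needs to resemble one known class of bot.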
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.