Catch Me If You Can: Deceiving Stance Detection and Geotagging Models to
Protect Privacy of Individuals on Twitter
- URL: http://arxiv.org/abs/2207.11500v1
- Date: Sat, 23 Jul 2022 11:55:18 GMT
- Title: Catch Me If You Can: Deceiving Stance Detection and Geotagging Models to
Protect Privacy of Individuals on Twitter
- Authors: Dilara Dogan, Bahadir Altun, Muhammed Said Zengin, Mucahid Kutlu and
Tamer Elsayed
- Abstract summary: We ground our investigation in two exposure-risky tasks, stance detection and geotagging.
We explore a variety of simple techniques for modifying text, such as inserting typos in salient words, paraphrasing, and adding dummy social media posts.
We find that typos have minimal impact on state-of-the-art geotagging models due to their increased reliance on social networks.
- Score: 3.928604516640069
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent advances in natural language processing have yielded many exciting
developments in text analysis and language understanding models; however, these
models can also be used to track people, bringing severe privacy concerns. In
this work, we investigate what individuals can do to avoid being detected by
those models while using social media platforms. We ground our investigation in
two exposure-risky tasks, stance detection and geotagging. We explore a variety
of simple techniques for modifying text, such as inserting typos in salient
words, paraphrasing, and adding dummy social media posts. Our experiments show
that the performance of BERT-based models fine-tuned for stance detection
decreases significantly due to typos, but it is not affected by paraphrasing.
Moreover, we find that typos have minimal impact on state-of-the-art geotagging
models due to their increased reliance on social networks; however, we show
that users can deceive those models by interacting with different users,
reducing their performance by almost 50%.
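As a rough illustration of the typo-insertion strategy described in the abstract, the sketch below perturbs the most salient word of a post and checks whether a fine-tuned stance classifier changes its prediction. The checkpoint path, the example post, and the helper functions are illustrative placeholders, not the authors' implementation.

```python
# Minimal sketch: insert a typo into the most salient word of a post and
# check whether a fine-tuned stance classifier changes its prediction.
# The checkpoint name below is a placeholder, not the paper's actual model.
from transformers import pipeline

clf = pipeline("text-classification", model="path/to/bert-stance-checkpoint")

def swap_adjacent(word: str) -> str:
    """Introduce a simple typo by swapping two middle characters."""
    if len(word) < 4:
        return word
    i = len(word) // 2
    return word[:i - 1] + word[i] + word[i - 1] + word[i + 1:]

def most_salient_index(text: str) -> int:
    """Leave-one-out saliency: the word whose removal moves the score most."""
    words = text.split()
    base = clf(text)[0]["score"]
    drops = []
    for i in range(len(words)):
        reduced = " ".join(words[:i] + words[i + 1:])
        drops.append(abs(base - clf(reduced)[0]["score"]))
    return max(range(len(words)), key=drops.__getitem__)

post = "I fully support the new climate policy announced today"
words = post.split()
i = most_salient_index(post)
words[i] = swap_adjacent(words[i])
perturbed = " ".join(words)

print(clf(post), clf(perturbed))  # compare labels before and after the typo
```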
Related papers
- IDT: Dual-Task Adversarial Attacks for Privacy Protection [8.312362092693377]
Methods to protect privacy can involve using model-internal representations that do not allow sensitive attributes to be detected.
We propose IDT, a method that analyses predictions made by auxiliary and interpretable models to identify which tokens are important to change.
We evaluate on different NLP datasets suitable for different tasks.
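IDT's actual procedure is not reproduced here; as a loose sketch of the underlying idea of ranking tokens with an auxiliary, interpretable model, one could train a TF-IDF logistic regression to predict the sensitive attribute and inspect its strongest coefficients. The texts and labels below are toy placeholders.

```python
# Sketch of token-importance analysis with an auxiliary interpretable model
# (not the IDT algorithm itself): train a TF-IDF logistic regression to
# predict a sensitive attribute, then rank tokens by how strongly they push
# the prediction toward that attribute.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["toy example post about the city marathon", "another toy post about cooking"]
labels = [1, 0]  # illustrative sensitive attribute

vec = TfidfVectorizer()
X = vec.fit_transform(texts)
aux = LogisticRegression().fit(X, labels)

# Tokens with the largest positive coefficients are the ones a rewriting
# method would target first when changing the text.
tokens = np.array(vec.get_feature_names_out())
order = np.argsort(aux.coef_[0])[::-1]
print(tokens[order][:10])
```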
arXiv Detail & Related papers (2024-06-28T04:14:35Z)
- Your Large Language Models Are Leaving Fingerprints [1.9561775591923982]
LLMs possess unique fingerprints that manifest as slight differences in the frequency of certain lexical and morphosyntactic features.
We show how to visualize such fingerprints, describe how they can be used to detect machine-generated text and find that they are even robust across textual domains.
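A minimal sketch of what a frequency-based fingerprint could look like, assuming a hand-picked list of function words rather than the paper's actual lexical and morphosyntactic feature set:

```python
# Toy "fingerprint": relative frequency of a few function words per corpus.
# The feature list and the sample texts are illustrative, not the paper's.
from collections import Counter

FEATURES = ["the", "of", "and", "however", "moreover", "therefore"]

def fingerprint(texts):
    counts, total = Counter(), 0
    for t in texts:
        toks = [w.strip(".,;:!?") for w in t.lower().split()]
        total += len(toks)
        counts.update(tok for tok in toks if tok in FEATURES)
    return {f: counts[f] / max(total, 1) for f in FEATURES}

human_texts = ["The committee reviewed the proposal and, however reluctantly, approved it."]
model_texts = ["Moreover, the analysis therefore demonstrates the importance of the method."]
print(fingerprint(human_texts))
print(fingerprint(model_texts))
```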
arXiv Detail & Related papers (2024-05-22T23:02:42Z)
- Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors.
We show that this score can indicate the presence of a backdoor even when the two models have different architectures.
This technique allows for the detection of backdoors on models designed for open-set classification tasks, which is little studied in the literature.
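A very rough sketch of the embedding-translation intuition (not the paper's method): fit a linear map from one model's embedding space to another's on reference data and use the agreement between translated and actual embeddings as a score. The arrays below are random placeholders for real model embeddings.

```python
# Generic embedding-translation sketch: learn a linear map from model A's
# embedding space to model B's on clean reference data, then score inputs by
# how well the translation matches; unusually low agreement can hint at a
# backdoored model. Random arrays stand in for real embeddings.
import numpy as np
from sklearn.linear_model import LinearRegression

emb_a_ref = np.random.randn(200, 128)   # reference inputs embedded by model A
emb_b_ref = np.random.randn(200, 256)   # same inputs embedded by model B

translator = LinearRegression().fit(emb_a_ref, emb_b_ref)

def agreement(emb_a, emb_b):
    """Cosine similarity between translated A-embeddings and true B-embeddings."""
    pred = translator.predict(emb_a)
    num = (pred * emb_b).sum(axis=1)
    den = np.linalg.norm(pred, axis=1) * np.linalg.norm(emb_b, axis=1)
    return num / den

scores = agreement(emb_a_ref, emb_b_ref)
print(scores.mean())   # a drop on suspect inputs is the backdoor indicator
```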
arXiv Detail & Related papers (2024-02-28T21:29:16Z)
- Few-Shot Detection of Machine-Generated Text using Style Representations [4.326503887981912]
Language models that convincingly mimic human writing pose a significant risk of abuse.
We propose to leverage representations of writing style estimated from human-authored text.
We find that features effective at distinguishing among human authors are also effective at distinguishing human from machine authors.
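Sketched below is one simple way to use style representations in a few-shot setting: embed a handful of human- and machine-written examples and label new documents by the nearer centroid. The encoder checkpoint and documents are placeholders; the paper's actual style estimator may differ.

```python
# Few-shot detection sketch with a style encoder (placeholder checkpoint):
# label a new document by whichever centroid (human vs. machine) is closer.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("path/to/style-encoder")   # placeholder checkpoint

human_docs = ["A short human-written example goes here."]
machine_docs = ["A short machine-generated example goes here."]

h_centroid = encoder.encode(human_docs).mean(axis=0)
m_centroid = encoder.encode(machine_docs).mean(axis=0)

def label(doc: str) -> str:
    v = encoder.encode([doc])[0]
    return "human" if np.linalg.norm(v - h_centroid) < np.linalg.norm(v - m_centroid) else "machine"

print(label("Some new document to check."))
```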
arXiv Detail & Related papers (2024-01-12T17:26:51Z)
- Unified Visual Relationship Detection with Vision and Language Models [89.77838890788638]
This work focuses on training a single visual relationship detector predicting over the union of label spaces from multiple datasets.
We propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models.
Empirical results on both human-object interaction detection and scene-graph generation demonstrate the competitive performance of our model.
arXiv Detail & Related papers (2023-03-16T00:06:28Z)
- Verifying the Robustness of Automatic Credibility Assessment [79.08422736721764]
Text classification methods have been widely investigated as a way to detect content of low credibility.
In some cases insignificant changes in input text can mislead the models.
We introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
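The benchmark itself is not reproduced here; the sketch below only shows the victim/attacker interface and flip-rate metric such an evaluation typically revolves around, with a toy keyword-based victim and a word-dropping attack standing in for real models.

```python
# Illustrative victim/attacker interface of the kind a robustness benchmark
# evaluates (not BODEGA's actual API): pair a victim classifier with an
# attack and report how often small edits flip correct predictions.
from typing import Callable, List, Tuple

Victim = Callable[[str], str]            # text -> predicted label
Attack = Callable[[str, Victim], str]    # text + victim -> perturbed text

def evaluate(victim: Victim, attack: Attack, data: List[Tuple[str, str]]) -> float:
    """Fraction of correctly classified examples whose label flips after attack."""
    flips, total = 0, 0
    for text, gold in data:
        if victim(text) != gold:
            continue
        total += 1
        if victim(attack(text, victim)) != gold:
            flips += 1
    return flips / max(total, 1)

# Toy demo: a keyword-based "credibility" victim and a word-dropping attack.
victim = lambda t: "low" if "miracle" in t.lower() else "high"
attack = lambda t, v: " ".join(w for w in t.split() if w.lower() != "miracle")
print(evaluate(victim, attack, [("A miracle cure for everything", "low")]))
```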
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
- How would Stance Detection Techniques Evolve after the Launch of ChatGPT? [5.756359016880821]
A new pre-trained language model, ChatGPT, was launched on November 30, 2022.
ChatGPT can achieve SOTA or similar performance for commonly used datasets including SemEval-2016 and P-Stance.
ChatGPT has the potential to be the best AI model for stance detection tasks in NLP.
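A minimal zero-shot prompt of the kind such a study evaluates might look like the following; the prompt wording and model name are illustrative choices, and the paper's exact setup may differ.

```python
# Zero-shot stance detection with a chat LLM (illustrative prompt and model).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def stance(tweet: str, target: str) -> str:
    prompt = (
        f"What is the stance of the following tweet toward '{target}'?\n"
        f"Tweet: {tweet}\n"
        "Answer with exactly one of: FAVOR, AGAINST, NONE."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model; illustrative choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content.strip()

print(stance("We must act on climate change now.", "climate change action"))
```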
arXiv Detail & Related papers (2022-12-30T05:03:15Z)
- CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning [14.637303913878435]
We present a coherence-based contrastive learning model named CoCo to detect possible machine-generated text (MGT) under low-resource scenarios.
To exploit this linguistic feature, we encode coherence information, in the form of a graph, into the text representation.
Experiment results on two public datasets and two self-constructed datasets show that our approach significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-12-20T15:26:19Z)
- AES Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity- and overstability-causing samples with high accuracy.
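One quick way to probe the bag-of-words claim is to score an essay and a word-shuffled copy of it and compare; the toy scorer below is a stand-in for whatever autoscoring model is being tested.

```python
# Overstability probe: a scorer that truly reads the essay should penalize a
# randomly word-shuffled copy; a bag-of-words-like scorer will not.
import random

def score_essay(text: str) -> float:
    # Stand-in scorer (vocabulary richness); replace with the model under test.
    words = text.split()
    return len(set(words)) / max(len(words), 1) * 10

essay = ("The industrial revolution transformed economies by mechanizing "
         "production and expanding trade networks across continents.")
words = essay.split()
random.shuffle(words)
shuffled = " ".join(words)

print(score_essay(essay), score_essay(shuffled))  # identical for any bag-of-words scorer
```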
arXiv Detail & Related papers (2021-09-24T03:49:38Z)
- Exploiting Multi-Object Relationships for Detecting Adversarial Attacks in Complex Scenes [51.65308857232767]
Vision systems that deploy Deep Neural Networks (DNNs) are known to be vulnerable to adversarial examples.
Recent research has shown that checking the intrinsic consistencies in the input data is a promising way to detect adversarial attacks.
We develop a novel approach to perform context consistency checks using language models.
arXiv Detail & Related papers (2021-08-19T00:52:10Z)
- Adversarial Attack on Community Detection by Hiding Individuals [68.76889102470203]
We focus on black-box attacks and aim to hide targeted individuals from deep graph community detection models.
We propose an iterative learning framework that takes turns to update two modules: one working as the constrained graph generator and the other as the surrogate community detection model.
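The paper's learning framework is not reproduced here; the toy sketch below only illustrates the end goal, rewiring a target node's edges so that a (here, modularity-based) community detector assigns it differently.

```python
# Generic illustration (not the paper's framework): hide a target node from a
# modularity-based community detector by rewiring some of its within-community
# edges to nodes in other communities.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()
target = 0

def community_of(graph, node):
    for c in greedy_modularity_communities(graph):
        if node in c:
            return set(c)

before = community_of(G, target)

# Rewire: drop a few edges inside the target's community, add edges outside it.
inside = [n for n in G.neighbors(target) if n in before][:3]
outside = [n for n in G.nodes if n not in before][:3]
G.remove_edges_from((target, n) for n in inside)
G.add_edges_from((target, n) for n in outside)

after = community_of(G, target)
print(len(before & after))  # a small overlap suggests the target changed communities
```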
arXiv Detail & Related papers (2020-01-22T09:50:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.