Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models
- URL: http://arxiv.org/abs/2404.17985v1
- Date: Sat, 27 Apr 2024 19:17:31 GMT
- Title: Detection of Conspiracy Theories Beyond Keyword Bias in German-Language Telegram Using Large Language Models
- Authors: Milena Pustet, Elisabeth Steffen, Helena Mihaljević
- Abstract summary: This work addresses the task of detecting conspiracy theories in German Telegram messages.
We compare the performance of supervised fine-tuning approaches using BERT-like models with prompt-based approaches.
For supervised fine-tuning, we report an F1 score of $\sim\!\! 0.8$ for the positive class, making our model comparable to recent models trained on keyword-focused English corpora.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The automated detection of conspiracy theories online typically relies on supervised learning. However, creating respective training data requires expertise, time and mental resilience, given the often harmful content. Moreover, available datasets are predominantly in English and often keyword-based, introducing a token-level bias into the models. Our work addresses the task of detecting conspiracy theories in German Telegram messages. We compare the performance of supervised fine-tuning approaches using BERT-like models with prompt-based approaches using Llama2, GPT-3.5, and GPT-4 which require little or no additional training data. We use a dataset of $\sim\!\! 4,000$ messages collected during the COVID-19 pandemic, without the use of keyword filters. Our findings demonstrate that both approaches can be leveraged effectively: For supervised fine-tuning, we report an F1 score of $\sim\!\! 0.8$ for the positive class, making our model comparable to recent models trained on keyword-focused English corpora. We demonstrate our model's adaptability to intra-domain temporal shifts, achieving F1 scores of $\sim\!\! 0.7$. Among prompting variants, the best model is GPT-4, achieving an F1 score of $\sim\!\! 0.8$ for the positive class in a zero-shot setting and equipped with a custom conspiracy theory definition.
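According to the abstract, the strongest prompting variant is GPT-4 in a zero-shot setting equipped with a custom conspiracy theory definition. Below is a minimal sketch of what such a setup could look like, assuming the OpenAI chat-completions Python SDK; the definition text, prompt wording, and example message are illustrative placeholders rather than the authors' actual prompts or data.

```python
# Minimal zero-shot classification sketch in the spirit of the paper's GPT-4 setup.
# All strings below are illustrative assumptions, not the authors' prompts.
from openai import OpenAI  # assumes the openai >= 1.0 Python SDK

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical stand-in for the paper's custom conspiracy theory definition.
CT_DEFINITION = (
    "A conspiracy theory claims that a secret group of powerful actors "
    "deliberately causes harmful events or hides the truth from the public."
)

SYSTEM_PROMPT = (
    f"Definition: {CT_DEFINITION}\n"
    "Decide whether the following German Telegram message promotes a "
    "conspiracy theory according to this definition. "
    "Answer with exactly one word: YES or NO."
)

def classify_message(message_de: str) -> str:
    """Return 'YES' or 'NO' for a single German Telegram message."""
    resp = client.chat.completions.create(
        model="gpt-4",   # the paper's best prompting variant
        temperature=0,   # deterministic output for classification
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": message_de},
        ],
    )
    return resp.choices[0].message.content.strip().upper()

# Hypothetical usage:
# print(classify_message("Die Pandemie wurde geplant, um uns alle zu kontrollieren."))
```

The supervised counterpart reported in the abstract fine-tunes BERT-like models on the roughly 4,000 annotated messages; that is a standard sequence-classification fine-tuning pipeline and is not sketched here.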
Related papers
- ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition [18.884124657093405]
We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules.
ELLEN achieves very strong performance on the CoNLL-2003 dataset.
In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data.
arXiv Detail & Related papers (2024-03-26T05:11:51Z)
- Evaluating Named Entity Recognition: Comparative Analysis of Mono- and Multilingual Transformer Models on Brazilian Corporate Earnings Call Transcriptions [3.809702129519642]
This study focuses on Portuguese-language texts extracted from earnings call transcriptions of Brazilian banks.
By curating a comprehensive dataset comprising 384 transcriptions, we evaluate the performance of monolingual models trained on Portuguese.
Our findings reveal that BERT-based models consistently outperform T5-based models.
arXiv Detail & Related papers (2024-03-18T19:53:56Z)
- Zero-Shot Fact-Checking with Semantic Triples and Knowledge Graphs [13.024338745226462]
Instead of operating directly on the claim and evidence sentences, we decompose them into semantic triples augmented using external knowledge graphs.
This allows the approach to generalize to adversarial datasets and to domains for which supervised models require specific training data.
Our empirical results show that our approach outperforms previous zero-shot approaches on FEVER, FEVER-Symmetric, FEVER 2.0, and Climate-FEVER.
arXiv Detail & Related papers (2023-12-19T01:48:31Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost TL method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- WR-ONE2SET: Towards Well-Calibrated Keyphrase Generation [57.11538133231843]
Keyphrase generation aims to automatically generate short phrases summarizing an input document.
The recently emerged ONE2SET paradigm generates keyphrases as a set and has achieved competitive performance.
We propose WR-ONE2SET which extends ONE2SET with an adaptive instance-level cost Weighting strategy and a target Re-assignment mechanism.
arXiv Detail & Related papers (2022-11-13T09:56:24Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition [98.25592165484737]
We propose a more effective pseudo-labeling scheme, called Cross-Model Pseudo-Labeling (CMPL).
CMPL achieves $17.6\%$ and $25.1\%$ Top-1 accuracy on Kinetics-400 and UCF-101, respectively, using only the RGB modality and $1\%$ labeled data.
arXiv Detail & Related papers (2021-12-17T18:59:41Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Unsupervised Subword Modeling Using Autoregressive Pretraining and Cross-Lingual Phone-Aware Modeling [30.905849959257264]
This study addresses unsupervised subword modeling, i.e., learning feature representations that can distinguish subword units of a language.
The proposed approach adopts a two-stage bottleneck feature (BNF) learning framework, consisting of autoregressive predictive coding (APC) as a front-end and a DNN-BNF model as a back-end.
The results on Libri-light and the ZeroSpeech 2017 databases show that APC is effective in front-end feature pretraining.
arXiv Detail & Related papers (2020-07-25T19:41:41Z)