MemeMind at ArAIEval Shared Task: Spotting Persuasive Spans in Arabic Text with Persuasion Techniques Identification
- URL: http://arxiv.org/abs/2408.04540v1
- Date: Thu, 8 Aug 2024 15:49:01 GMT
- Title: MemeMind at ArAIEval Shared Task: Spotting Persuasive Spans in Arabic Text with Persuasion Techniques Identification
- Authors: Md Rafiul Biswas, Zubair Shah, Wajdi Zaghouani
- Abstract summary: This paper focuses on detecting propagandistic spans and persuasion techniques in Arabic text from tweets and news paragraphs.
Our approach achieved an F1 score of 0.2774, securing 3rd position on the Task 1 leaderboard.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper focuses on detecting propagandistic spans and persuasion techniques in Arabic text from tweets and news paragraphs. Each entry in the dataset contains a text sample and corresponding labels that indicate the start and end positions of propaganda techniques within the text. Tokens falling within a labeled span were assigned a "B" (Begin) or "I" (Inside) tag for the specific propaganda technique, while tokens outside any span were assigned "O" (Outside). Using attention masks, we padded every sequence to a uniform length and assigned BIO tags to each token based on the provided labels. We then used the AraBERT-base pre-trained model for Arabic text tokenization and embeddings, with a token classification layer on top to identify propaganda techniques. Our training process follows a two-phase fine-tuning approach: first we train only the classification layer for a few epochs, then we fine-tune the full model, updating all parameters. This methodology allows the model to adapt to the specific characteristics of the propaganda detection task while leveraging the knowledge captured by the pre-trained AraBERT model. Our approach achieved an F1 score of 0.2774, securing 3rd position on the Task 1 leaderboard.
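The pipeline described in the abstract (character-level spans projected onto BIO tags, AraBERT-base with a token classification head, and two-phase fine-tuning) can be sketched concretely. The following is a minimal illustration, not the authors' code: the checkpoint name, the two-technique label subset, and the epoch counts and learning rates are all assumptions.

```python
# Minimal sketch of the described pipeline (assumed details marked below).
# Requires: torch, transformers (with a fast tokenizer for offset mapping).
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "aubmindlab/bert-base-arabertv2"     # assumed AraBERT-base checkpoint
TECHNIQUES = ["Loaded_Language", "Name_Calling"]  # assumed subset of techniques
LABELS = ["O"] + [f"{p}-{t}" for t in TECHNIQUES for p in ("B", "I")]
label2id = {l: i for i, l in enumerate(LABELS)}

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_NAME, num_labels=len(LABELS),
    id2label={i: l for l, i in label2id.items()}, label2id=label2id)

def encode(text, spans, max_len=128):
    """Tokenize one sample and project character-level (start, end, technique)
    spans onto per-token BIO tags, padding to a uniform length."""
    enc = tokenizer(text, truncation=True, padding="max_length",
                    max_length=max_len, return_offsets_mapping=True)
    tags = []
    for s, e in enc["offset_mapping"]:
        if s == e:                    # special or padding token
            tags.append(-100)         # ignored by the cross-entropy loss
            continue
        tag = "O"
        for cs, ce, tech in spans:
            if s >= cs and e <= ce:   # token lies inside a labeled span
                tag = ("B-" if s == cs else "I-") + tech
                break
        tags.append(label2id[tag])
    enc["labels"] = tags
    enc.pop("offset_mapping")
    return enc

def run_phase(model, loader, params, epochs, lr):
    """One fine-tuning phase: optimize only `params` for `epochs` epochs."""
    opt = torch.optim.AdamW(params, lr=lr)
    model.train()
    for _ in range(epochs):
        for batch in loader:          # batches of tensors built from encode()
            opt.zero_grad()
            loss = model(**batch).loss
            loss.backward()
            opt.step()

# Phase 1: freeze the AraBERT encoder, train only the classification layer.
for p in model.bert.parameters():
    p.requires_grad = False
# run_phase(model, train_loader, model.classifier.parameters(), epochs=3, lr=1e-3)

# Phase 2: unfreeze everything and fine-tune all parameters.
for p in model.parameters():
    p.requires_grad = True
# run_phase(model, train_loader, model.parameters(), epochs=5, lr=2e-5)
```

Freezing the encoder in phase 1 lets the randomly initialized classification head settle before phase 2 updates the pre-trained weights, which is the usual rationale for this kind of staged fine-tuning; the `train_loader` batching is left out here for brevity.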
Related papers
- Exposing propaganda: an analysis of stylistic cues comparing human annotations and machine classification
This paper investigates the language of propaganda and its stylistic features.
It presents the PPN dataset, composed of news articles extracted from websites identified as propaganda sources.
We propose different NLP techniques to identify the cues used by the annotators, and to compare them with machine classification.
arXiv Detail & Related papers (2024-02-06T07:51:54Z)
- HuBERTopic: Enhancing Semantic Representation of HuBERT through Self-supervision Utilizing Topic Model
We propose a new approach to enrich the semantic representation of HuBERT.
An auxiliary topic classification task is added to HuBERT by using topic labels as teachers.
Experimental results demonstrate that our method achieves comparable or better performance than the baseline in most tasks.
arXiv Detail & Related papers (2023-10-06T02:19:09Z)
- Hierarchical Multi-Instance Multi-Label Learning for Detecting Propaganda Techniques
We propose a simple RoBERTa-based model for classifying all spans in an article simultaneously.
We incorporate hierarchical label dependencies by adding an auxiliary classifier for each node in the decision tree.
Our model leads to an absolute improvement of 2.47% micro-F1 over the model from the shared task winning team in a cross-validation setup.
arXiv Detail & Related papers (2023-05-30T21:23:19Z)
- ArabGlossBERT: Fine-Tuning BERT on Context-Gloss Pairs for WSD
This paper presents our work on fine-tuning BERT models for Arabic Word Sense Disambiguation (WSD).
We constructed a dataset of labeled Arabic context-gloss pairs.
Each pair was labeled as True or False and target words in each context were identified and annotated.
arXiv Detail & Related papers (2022-05-19T16:47:18Z)
- Pre-trained Token-replaced Detection Model as Few-shot Learner
We propose a novel approach to few-shot learning with pre-trained token-replaced detection models like ELECTRA.
A systematic evaluation on 16 datasets demonstrates that our approach outperforms few-shot learners with pre-trained masked language models.
arXiv Detail & Related papers (2022-03-07T09:47:53Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Pre-training Language Model Incorporating Domain-specific Heterogeneous Knowledge into A Unified Representation
We propose a unified pre-trained language model (PLM) for all forms of text, including unstructured text, semi-structured text, and well-structured text.
Our approach outperforms plain-text pre-training while using only 1/4 of the data.
arXiv Detail & Related papers (2021-09-02T16:05:24Z)
- LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for Multi-Granular Propaganda Span Identification
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z)
- MatchGAN: A Self-Supervised Semi-Supervised Conditional Generative Adversarial Network
We present a novel self-supervised learning approach for conditional generative adversarial networks (GANs) under a semi-supervised setting.
We perform augmentation by randomly sampling sensible labels from the label space of the few labelled examples available.
Our method surpasses the baseline with only 20% of the labelled examples used to train the baseline.
arXiv Detail & Related papers (2020-06-11T17:14:55Z)
- Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection
We study the detection of propagandistic text fragments in news articles.
We introduce an approach to inject declarative knowledge of fine-grained propaganda techniques.
arXiv Detail & Related papers (2020-04-29T13:46:15Z)