Solomon at SemEval-2020 Task 11: Ensemble Architecture for Fine-Tuned
Propaganda Detection in News Articles
- URL: http://arxiv.org/abs/2009.07473v1
- Date: Wed, 16 Sep 2020 05:00:40 GMT
- Title: Solomon at SemEval-2020 Task 11: Ensemble Architecture for Fine-Tuned
Propaganda Detection in News Articles
- Authors: Mayank Raj, Ajay Jaiswal, Rohit R.R, Ankita Gupta, Sudeep Kumar Sahoo,
Vertika Srivastava, Yeon Hyang Kim
- Abstract summary: This paper describes our system (Solomon) and the results of our participation in SemEval 2020 Task 11, "Detection of Propaganda Techniques in News Articles".
We used a RoBERTa-based transformer architecture for fine-tuning on the propaganda dataset.
Compared to the other participating systems, our submission is ranked 4th on the leaderboard.
- Score: 0.3232625980782302
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes our system (Solomon) and the results of our
participation in SemEval 2020 Task 11, "Detection of Propaganda Techniques
in News Articles"\cite{DaSanMartinoSemeval20task11}. We participated in the
"Technique Classification" (TC) task, which is a multi-class classification
task. To address the TC task, we fine-tuned a RoBERTa-based transformer
architecture on the propaganda dataset. The predictions of RoBERTa were
further refined by class-dependent minority-class classifiers. A special
classifier, which employs a dynamically adapted Least Common Sub-sequence
algorithm, is used to handle the intricacies of the Repetition class.
Compared to the other participating systems, our submission is ranked 4th on
the leaderboard.
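The repetition-specific classifier in the abstract can be illustrated with a minimal sketch. The Repetition technique is typically signalled by a span recurring elsewhere in the article, which a normalized common-subsequence similarity can capture. The code below is an illustrative assumption, not the authors' implementation: it uses the classic longest-common-subsequence dynamic program as a stand-in for the paper's dynamically adapted variant, and the `maybe_repetition` helper, its threshold, and the label strings are all hypothetical.

```python
# Illustrative sketch (not the authors' code): override a base classifier's
# predicted technique with "Repetition" when a candidate span closely matches
# another span in the same article, using a normalized common-subsequence
# similarity as a stand-in for the paper's dynamically adapted variant.

def lcs_length(a: str, b: str) -> int:
    """Classic O(len(a) * len(b)) dynamic-programming LCS length,
    using two rows of the table to keep memory linear in len(b)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def lcs_similarity(a: str, b: str) -> float:
    """Normalize the LCS length by the shorter string's length,
    so a span fully contained in another scores 1.0."""
    if not a or not b:
        return 0.0
    return lcs_length(a.lower(), b.lower()) / min(len(a), len(b))

def maybe_repetition(span: str, other_spans, base_label: str,
                     threshold: float = 0.8) -> str:
    """Hypothetical post-processing step: keep the base classifier's
    label unless the span strongly resembles another span."""
    for other in other_spans:
        if lcs_similarity(span, other) >= threshold:
            return "Repetition"
    return base_label
```

For example, a span that reappears verbatim inside a longer span elsewhere in the article scores 1.0 and is relabelled; unrelated spans fall well below the threshold and keep the base model's prediction.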
Related papers
- HuBERTopic: Enhancing Semantic Representation of HuBERT through
Self-supervision Utilizing Topic Model [62.995175485416]
We propose a new approach to enrich the semantic representation of HuBERT.
An auxiliary topic classification task is added to HuBERT by using topic labels as teachers.
Experimental results demonstrate that our method achieves comparable or better performance than the baseline in most tasks.
arXiv Detail & Related papers (2023-10-06T02:19:09Z) - The USYD-JD Speech Translation System for IWSLT 2021 [85.64797317290349]
This paper describes the University of Sydney & JD's joint submission to the IWSLT 2021 low-resource speech translation task.
We trained our models with the officially provided ASR and MT datasets.
To achieve better translation performance, we explored the most recent effective strategies, including back translation, knowledge distillation, multi-feature reranking and transductive finetuning.
arXiv Detail & Related papers (2021-07-24T09:53:34Z) - MIDAS at SemEval-2020 Task 10: Emphasis Selection using Label
Distribution Learning and Contextual Embeddings [46.973153861604416]
This paper presents our submission to the SemEval 2020 - Task 10 on emphasis selection in written text.
We approach this emphasis selection problem as a sequence labeling task where we represent the underlying text with contextual embedding models.
Our best performing architecture is an ensemble of different models, which achieved an overall matching score of 0.783, placing us 15th out of 31 participating teams.
arXiv Detail & Related papers (2020-09-06T00:15:33Z) - DUTH at SemEval-2020 Task 11: BERT with Entity Mapping for Propaganda
Classification [1.5469452301122173]
This report describes the methods employed by the Democritus University of Thrace (DUTH) team for participating in SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles.
arXiv Detail & Related papers (2020-08-22T18:18:02Z) - CyberWallE at SemEval-2020 Task 11: An Analysis of Feature Engineering
for Ensemble Models for Propaganda Detection [0.0]
We use a bi-LSTM architecture in the Span Identification subtask and train a complex ensemble model for the Technique Classification subtask.
Our systems achieve a rank of 8 out of 35 teams in the SI subtask and 8 out of 31 teams in the TC subtask.
arXiv Detail & Related papers (2020-08-22T15:51:16Z) - LTIatCMU at SemEval-2020 Task 11: Incorporating Multi-Level Features for
Multi-Granular Propaganda Span Identification [70.1903083747775]
This paper describes our submission for the task of Propaganda Span Identification in news articles.
We introduce a BERT-BiLSTM based span-level propaganda classification model that identifies which token spans within the sentence are indicative of propaganda.
arXiv Detail & Related papers (2020-08-11T16:14:47Z) - SemEval-2020 Task 10: Emphasis Selection for Written Text in Visual
Media [50.29389719723529]
We present the main findings and compare the results of SemEval-2020 Task 10, Emphasis Selection for Written Text in Visual Media.
The goal of this shared task is to design automatic methods for emphasis selection.
The analysis of systems submitted to the task indicates that BERT and RoBERTa were the most common choice of pre-trained models used.
arXiv Detail & Related papers (2020-08-07T17:24:53Z) - aschern at SemEval-2020 Task 11: It Takes Three to Tango: RoBERTa, CRF,
and Transfer Learning [22.90521056447551]
We describe our system for SemEval-2020 Task 11 on Detection of Propaganda Techniques in News Articles.
We developed ensemble models using RoBERTa-based neural architectures, additional CRF layers, transfer learning between the two subtasks, and advanced post-processing to handle the multi-label nature of the task.
arXiv Detail & Related papers (2020-08-06T18:45:25Z) - NoPropaganda at SemEval-2020 Task 11: A Borrowed Approach to Sequence
Tagging and Text Classification [0.0]
This paper describes our contribution to SemEval-2020 Task 11: Detection Of Propaganda Techniques In News Articles.
We start with simple LSTM baselines and move to an autoregressive transformer decoder to predict long continuous propaganda spans for the first subtask.
We also adopt an approach from relation extraction by enveloping spans mentioned above with special tokens for the second subtask of propaganda technique classification.
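The span-enveloping trick borrowed from relation extraction amounts to wrapping the candidate propaganda span in boundary tokens before feeding the sentence to a classifier, so the model can attend to the span in context. The sketch below is illustrative: the NoPropaganda paper does not specify which special tokens were used, so the marker strings here are assumptions.

```python
# Illustrative sketch (not the authors' code): mark a candidate propaganda
# span with special boundary tokens, as is common in relation extraction,
# before passing the sentence to a technique classifier.

SPAN_START = "[SPAN]"   # assumed marker tokens; the paper does not
SPAN_END = "[/SPAN]"    # specify which special tokens were used

def envelop_span(text: str, start: int, end: int) -> str:
    """Insert marker tokens around the character span text[start:end]."""
    return (text[:start] + SPAN_START + " " + text[start:end]
            + " " + SPAN_END + text[end:])
```

For instance, `envelop_span("They are destroying our country.", 9, 19)` yields `"They are [SPAN] destroying [/SPAN] our country."`, which the classifier then sees as a single marked-up input.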
arXiv Detail & Related papers (2020-07-25T11:35:57Z) - newsSweeper at SemEval-2020 Task 11: Context-Aware Rich Feature
Representations For Propaganda Classification [2.0491741153610334]
This paper describes our submissions to SemEval 2020 Task 11: Detection of Propaganda Techniques in News Articles.
We make use of a pre-trained BERT language model enhanced with tagging techniques developed for Named Entity Recognition.
For the second subtask, we incorporate contextual features in a pre-trained RoBERTa model for the classification of propaganda techniques.
arXiv Detail & Related papers (2020-07-21T14:06:59Z) - Device-Robust Acoustic Scene Classification Based on Two-Stage
Categorization and Data Augmentation [63.98724740606457]
We present a joint effort of four groups, namely GT, USTC, Tencent, and UKE, to tackle Task 1 - Acoustic Scene Classification (ASC) in the DCASE 2020 Challenge.
Task 1a focuses on ASC of audio signals recorded with multiple (real and simulated) devices into ten different fine-grained classes.
Task 1b concerns the classification of data into three higher-level classes using low-complexity solutions.
arXiv Detail & Related papers (2020-07-16T15:07:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.