nlpBDpatriots at BLP-2023 Task 1: A Two-Step Classification for Violence
Inciting Text Detection in Bangla
- URL: http://arxiv.org/abs/2311.15029v1
- Date: Sat, 25 Nov 2023 13:47:34 GMT
- Authors: Md Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo,
Marcos Zampieri
- Abstract summary: In this paper, we discuss the nlpBDpatriots entry to the shared task on Violence Inciting Text Detection (VITD).
The aim of this task is to identify and classify violent threats that provoke further unlawful violent acts.
Our best-performing approach for the task is two-step classification using back translation and multilinguality, which ranked 6th out of 27 teams with a macro F1 score of 0.74.
- Score: 7.3481279783709805
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we discuss the nlpBDpatriots entry to the shared task on
Violence Inciting Text Detection (VITD) organized as part of the first workshop
on Bangla Language Processing (BLP) co-located with EMNLP. The aim of this task
is to identify and classify violent threats that provoke further unlawful
violent acts. Our best-performing approach for the task is two-step
classification using back translation and multilinguality, which ranked 6th out
of 27 teams with a macro F1 score of 0.74.
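The abstract reports its result as a macro F1 score. As a minimal sketch of how that metric is usually computed (per-class F1, averaged with equal weight per class), the following may help. The function name `macro_f1` and the labels `"a"`, `"b"`, `"c"` are illustrative assumptions, not taken from the paper or the shared task's evaluation script.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Hypothetical helper for illustration only; the shared task's
    official scorer may differ in details such as label handling.
    """
    # Classes are taken from both gold and predicted labels.
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gold] += 1  # gold class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three placeholder classes.
gold = ["a", "a", "b", "c"]
pred = ["a", "b", "b", "c"]
print(round(macro_f1(gold, pred), 4))
```

Because every class contributes equally to the average, a system that does poorly on a rare class is penalized more than it would be under accuracy or micro-averaged F1.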
Related papers
- Mavericks at BLP-2023 Task 1: Ensemble-based Approach Using Language
Models for Violence Inciting Text Detection [0.0]
Social media has accelerated the propagation of hate and violence-inciting speech in society.
The problem of detecting violence-inciting texts is further exacerbated in low-resource settings due to sparse research and less data.
This paper presents our work for the Violence Inciting Text Detection shared task in the First Workshop on Bangla Language Processing.
arXiv Detail & Related papers (2023-11-30T18:23:38Z)
- nlpBDpatriots at BLP-2023 Task 2: A Transfer Learning Approach to Bangla
Sentiment Analysis [7.3481279783709805]
In this paper, we discuss the nlpBDpatriots entry to the shared task on Sentiment Analysis of Bangla Social Media Posts.
The main objective of this task is to identify the polarity of social media content using a Bangla dataset annotated with positive, neutral, and negative labels.
Our best system ranked 12th among 30 teams that participated in the competition.
arXiv Detail & Related papers (2023-11-25T13:58:58Z)
- BanglaNLP at BLP-2023 Task 1: Benchmarking different Transformer Models
for Violence Inciting Text Detection in Bengali [0.46040036610482665]
This paper presents the system that we have developed while solving this shared task on violence inciting text detection in Bangla.
We explain both the traditional and the recent approaches that we used to make our models learn.
Our proposed system helps to classify whether the given text contains any threat.
arXiv Detail & Related papers (2023-10-16T19:35:04Z)
- Bag of Tricks for Effective Language Model Pretraining and Downstream
Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission, Vega v1, to the General Language Understanding Evaluation (GLUE) leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4/9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE
2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, an m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- RuArg-2022: Argument Mining Evaluation [69.87149207721035]
This paper is a report of the organizers on the first competition of argumentation analysis systems dealing with Russian language texts.
A corpus containing 9,550 sentences (comments on social media posts) on three topics related to the COVID-19 pandemic was prepared.
The system that won first place in both tasks used the NLI (Natural Language Inference) variant of the BERT architecture.
arXiv Detail & Related papers (2022-06-18T17:13:37Z)
- Fine-tuning of Pre-trained Transformers for Hate, Offensive, and Profane
Content Detection in English and Marathi [0.0]
This paper describes neural models developed for the Hate Speech and Offensive Content Identification in English and Indo-Aryan languages.
For English subtasks, we investigate the impact of additional corpora for hate speech detection to fine-tune transformer models.
For the Marathi tasks, we propose a system based on the Language-Agnostic BERT Sentence Embedding (LaBSE).
arXiv Detail & Related papers (2021-10-25T07:11:02Z)
- Learning to Selectively Learn for Weakly-supervised Paraphrase
Generation [81.65399115750054]
We propose a novel approach to generate high-quality paraphrases with weak supervision data.
Specifically, we tackle the weakly-supervised paraphrase generation problem by obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion.
We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art methods.
arXiv Detail & Related papers (2021-09-25T23:31:13Z)
- Unsupervised Bitext Mining and Translation via Self-trained Contextual
Embeddings [51.47607125262885]
We describe an unsupervised method to create pseudo-parallel corpora for machine translation (MT) from unaligned text.
We use multilingual BERT to create source and target sentence embeddings for nearest-neighbor search and adapt the model via self-training.
We validate our technique by extracting parallel sentence pairs on the BUCC 2017 bitext mining task and observe up to a 24.5 point increase (absolute) in F1 scores over previous unsupervised methods.
arXiv Detail & Related papers (2020-10-15T14:04:03Z)
- Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for
Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1 score in English Sub-task A, which is comparable to the first-place system.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.