MasonTigers@LT-EDI-2024: An Ensemble Approach Towards Detecting
Homophobia and Transphobia in Social Media Comments
- URL: http://arxiv.org/abs/2401.14681v2
- Date: Thu, 15 Feb 2024 06:05:13 GMT
- Title: MasonTigers@LT-EDI-2024: An Ensemble Approach Towards Detecting
Homophobia and Transphobia in Social Media Comments
- Authors: Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Md Nishat Raihan, Al
Nahian Bin Emran
- Abstract summary: We describe our approaches and results for Task 2 of the LT-EDI 2024 Workshop, aimed at detecting homophobia and/or transphobia across ten languages.
Our methodologies include monolingual transformers and ensemble methods, capitalizing on the strengths of each to enhance the performance of the models.
The ensemble models worked well, placing our team, MasonTigers, in the top five for eight of the ten languages, as measured by the macro F1 score.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we describe our approaches and results for Task 2 of the
LT-EDI 2024 Workshop, aimed at detecting homophobia and/or transphobia across
ten languages. Our methodologies include monolingual transformers and ensemble
methods, capitalizing on the strengths of each to enhance the performance of
the models. The ensemble models worked well, placing our team, MasonTigers, in
the top five for eight of the ten languages, as measured by the macro F1 score.
Our work emphasizes the efficacy of ensemble methods in multilingual scenarios,
addressing the complexities of language-specific tasks.
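The paper does not include code in this listing; purely as an illustration, the kind of soft-voting ensemble it describes, built from fine-tuned monolingual transformer classifiers and scored with macro F1, could be sketched as follows (model names, label set, and data loading are hypothetical, not the authors' artifacts):

```python
# Hypothetical sketch of a soft-voting ensemble: average class probabilities
# from several fine-tuned monolingual transformer classifiers, then score
# predictions with macro F1. Model names and data loading are placeholders.
import numpy as np
import torch
from sklearn.metrics import f1_score
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAMES = ["lang-model-a", "lang-model-b", "lang-model-c"]  # hypothetical

def predict_proba(model_name, texts):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1).numpy()

def ensemble_predict(texts):
    # Soft voting: mean of the per-model class-probability matrices.
    probs = np.mean([predict_proba(m, texts) for m in MODEL_NAMES], axis=0)
    return probs.argmax(axis=-1)

# y_pred = ensemble_predict(comments)   # comments: list[str] of eval texts
# print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```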
Related papers
- MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness [5.91695168183101]
This paper presents the MasonTigers entry to SemEval-2024 Task 1, Semantic Textual Relatedness.
The task encompasses supervised (Track A), unsupervised (Track B), and cross-lingual (Track C) approaches across 14 different languages.
Our approaches achieved rankings ranging from 11th to 21st in Track A, from 1st to 8th in Track B, and from 5th to 12th in Track C.
arXiv Detail & Related papers (2024-03-22T06:47:42Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
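The alignment-prompt method itself is specific to the paper; a generic prompt-tuning sketch, with learnable soft-prompt vectors prepended to a frozen encoder (all names illustrative), looks roughly like this:

```python
# Generic prompt-tuning sketch (not the paper's exact method): learn a small
# matrix of soft-prompt vectors that is prepended to a frozen encoder's
# input embeddings; only the prompt parameters receive gradients.
import torch
import torch.nn as nn

class SoftPromptEncoder(nn.Module):
    def __init__(self, encoder, prompt_len=20):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():   # freeze the backbone
            p.requires_grad = False
        hidden = encoder.config.hidden_size
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)

    def forward(self, input_ids, attention_mask):
        embeds = self.encoder.get_input_embeddings()(input_ids)
        batch = embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        prompt_mask = torch.ones(batch, self.prompt.size(0),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        return self.encoder(
            inputs_embeds=torch.cat([prompt, embeds], dim=1),
            attention_mask=torch.cat([prompt_mask, attention_mask], dim=1),
        )
```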
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Hitachi at SemEval-2023 Task 3: Exploring Cross-lingual Multi-task Strategies for Genre and Framing Detection in Online News [10.435874177179764]
This paper describes team Hitachi's participation in SemEval-2023 Task 3, "Detecting the genre, the framing, and the persuasion techniques in online news in a multi-lingual setup".
We investigated different cross-lingual and multi-task strategies for training pretrained language models.
We constructed ensemble models from the results and achieved the highest macro-averaged F1 scores in Italian and Russian genre categorization subtasks.
arXiv Detail & Related papers (2023-03-03T09:12:55Z)
- Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission, Vega v1, to the General Language Understanding Evaluation (GLUE) leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4 of the 9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation System for the WMT22 Translation Task [49.916963624249355]
This paper describes Tencent AI Lab - Shanghai Jiao Tong University (TAL-SJTU) Low-Resource Translation systems for the WMT22 shared task.
We participate in the general translation task on English↔Livonian.
Our system is based on M2M100 with novel techniques that adapt it to the target language pair.
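The paper's adaptation techniques are not reproduced here, but the M2M100 baseline it builds on is available in Hugging Face transformers; since Livonian is not in M2M100's language set (the gap the paper addresses), the snippet below uses Estonian, a related language, purely to show the API:

```python
# Baseline M2M100 usage via Hugging Face transformers. Livonian is absent
# from M2M100's vocabulary, so Estonian ("et") stands in here only to
# demonstrate the translation API, not the paper's adapted system.
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

tokenizer.src_lang = "en"
encoded = tokenizer("Hello, world!", return_tensors="pt")
generated = model.generate(**encoded,
                           forced_bos_token_id=tokenizer.get_lang_id("et"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```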
arXiv Detail & Related papers (2022-10-17T04:34:09Z)
- Unsupervised Cross-lingual Adaptation for Sequence Tagging and Beyond [58.80417796087894]
Cross-lingual adaptation with multilingual pre-trained language models (mPTLMs) mainly consists of two lines of works: zero-shot approach and translation-based approach.
We propose a novel framework to consolidate the zero-shot approach and the translation-based approach for better adaptation performance.
arXiv Detail & Related papers (2020-10-23T13:47:01Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model improves over state-of-the-art results by 2.82 ROUGE-1 points (English to Chinese) and 1.15 ROUGE-1 points (Chinese to English).
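To make those numbers concrete, ROUGE-1 is a unigram-overlap F-score between a candidate summary and a reference; a minimal implementation:

```python
# ROUGE-1 F1 in miniature: the unigram-overlap F-score between a candidate
# summary and a reference (real evaluations use a full ROUGE package with
# stemming and multiple references).
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    overlap = sum((cand & ref).values())      # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(rouge1_f("the cat sat on the mat", "the cat lay on the mat"))  # ~0.833
```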
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
- Cross-Lingual Transfer Learning for Complex Word Identification [0.3437656066916039]
Complex Word Identification (CWI) is a task centered on detecting hard-to-understand words in texts.
Our approach uses zero-shot, one-shot, and few-shot learning techniques, alongside state-of-the-art solutions for Natural Language Processing (NLP) tasks.
Our aim is to provide evidence that the proposed models can learn the characteristics of complex words in a multilingual environment.
arXiv Detail & Related papers (2020-10-02T17:09:47Z)
- On Learning Universal Representations Across Languages [37.555675157198145]
We extend existing approaches to learn sentence-level representations and show their effectiveness on cross-lingual understanding and generation.
Specifically, we propose a Hierarchical Contrastive Learning (HiCTL) method to learn universal representations for parallel sentences distributed in one or multiple languages.
We conduct evaluations on two challenging cross-lingual tasks, XTREME and machine translation.
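HiCTL's hierarchical formulation is specific to the paper; its sentence-level core resembles a standard InfoNCE-style contrastive loss over parallel pairs, roughly:

```python
# InfoNCE-style contrastive loss over parallel sentence embeddings: each
# source sentence's translation is its positive; all other in-batch
# sentences act as negatives. A generic sketch, not HiCTL's exact loss.
import torch
import torch.nn.functional as F

def contrastive_loss(src_emb, tgt_emb, temperature=0.05):
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature              # cosine similarities
    labels = torch.arange(src.size(0), device=src.device)
    # Symmetric objective: align src->tgt and tgt->src.
    return (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.t(), labels)) / 2
```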
arXiv Detail & Related papers (2020-07-31T10:58:39Z)
- Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation [8.9379057739817]
We view MT models at various training stages as human learners at different levels.
We employ an ensemble of multi-checkpoints from the same model to generate translation sequences with various levels of fluency.
We achieve 37.57 macro F1 with a six-checkpoint model ensemble on the official English-to-Portuguese shared task test data.
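A minimal sketch of the multi-checkpoint idea, with placeholder checkpoint paths and model interface: decode the same source with checkpoints from different training stages to obtain translations of varying fluency.

```python
# Hypothetical sketch: decode one source sentence with checkpoints saved at
# different training stages, so earlier checkpoints act as less-fluent
# "learners". `load_model` and `generate` are placeholder interfaces.
CHECKPOINTS = [f"ckpt_epoch_{e}.pt" for e in (5, 10, 15, 20, 25, 30)]

def multi_checkpoint_translations(load_model, source_sentence):
    hypotheses = []
    for path in CHECKPOINTS:
        model = load_model(path)              # restore one training stage
        hypotheses.append(model.generate(source_sentence))
    return hypotheses                         # six hypotheses, varied fluency
```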
arXiv Detail & Related papers (2020-06-07T05:46:15Z)
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
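Sketched with a placeholder model interface, random online backtranslation works as follows: for a target-side sentence from a pair unseen in training, pick a random source language, back-translate with the current model, and train on the synthetic pair.

```python
# Random online backtranslation, sketched with a placeholder interface:
# back-translate a target-side sentence into a randomly chosen source
# language using the current model, then train on the synthetic pair.
import random

def online_backtranslation_step(model, tgt_sentence, tgt_lang, languages):
    src_lang = random.choice([l for l in languages if l != tgt_lang])
    # Synthetic source produced by the model itself (its quality improves
    # as training proceeds).
    synthetic_src = model.translate(tgt_sentence, src=tgt_lang, tgt=src_lang)
    # One gradient step on the synthetic (src_lang -> tgt_lang) pair.
    return model.train_step(synthetic_src, tgt_sentence,
                            src=src_lang, tgt=tgt_lang)
```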
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.