LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT
- URL: http://arxiv.org/abs/2101.04965v1
- Date: Wed, 13 Jan 2021 09:52:04 GMT
- Title: LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT
- Authors: Mohammed Azhan, Mohammad Ahmad
- Abstract summary: We propose a Layer Differentiated training procedure for training a pre-trained ULMFiT (arXiv:1801.06146) model.
We used special tokens to annotate specific parts of the tweets to improve language understanding and gain insights into the model.
The proposed approach ranked 61st out of 164 in the sub-task "COVID19 Fake News Detection in English".
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In our paper, we present Deep Learning models with a layer differentiated
training method which were used for the Shared Task @ CONSTRAINT 2021 sub-tasks
"COVID19 Fake News Detection in English" and "Hostile Post Detection in Hindi". We
propose a Layer Differentiated training procedure for training a pre-trained
ULMFiT (arXiv:1801.06146) model. We used special tokens to annotate specific
parts of the tweets to improve language understanding and gain insights into the
model, making the tweets more interpretable. The other two submissions were
a modified RoBERTa model and a simple Random Forest Classifier. The proposed
approach scored a precision of 0.96728972 and an F1 score of 0.967324832 on the
sub-task "COVID19 Fake News Detection in English", and a Coarse-Grained Hostility
F1 score of 0.908648 and a Weighted Fine-Grained F1 score of 0.533907 on the
sub-task "Hostile Post Detection in Hindi". The proposed approach ranked 61st out
of 164 in the sub-task "COVID19 Fake News Detection in English" and 18th out of 45
in the sub-task "Hostile Post Detection in Hindi".
Related papers
- Bag of Tricks for Effective Language Model Pretraining and Downstream
Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission Vega v1 on the General Language Understanding Evaluation leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3 billion-parameter model sets a new state of the art on 4 of the 9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z) - Overview of Abusive and Threatening Language Detection in Urdu at FIRE
2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the m-BERT based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z) - UrduFake@FIRE2021: Shared Track on Fake News Identification in Urdu [55.41644538483948]
This study reports the second shared task, UrduFake@FIRE2021, on fake news detection in the Urdu language.
The proposed systems were based on various count-based features and used different classifiers as well as neural network architectures.
The stochastic gradient descent (SGD) based classifier outperformed the other classifiers and achieved an F-score of 0.679.
arXiv Detail & Related papers (2022-07-11T19:15:04Z) - Sequence-level self-learning with multiple hypotheses [53.04725240411895]
We develop new self-learning techniques with an attention-based sequence-to-sequence (seq2seq) model for automatic speech recognition (ASR).
In contrast to conventional unsupervised learning approaches, we adopt the multi-task learning (MTL) framework.
Our experiment results show that our method can reduce the WER on the British speech data from 14.55% to 10.36% compared to the baseline model trained with the US English data only.
arXiv Detail & Related papers (2021-12-10T20:47:58Z) - From Universal Language Model to Downstream Task: Improving
RoBERTa-Based Vietnamese Hate Speech Detection [8.602181445598776]
We propose a pipeline to adapt the general-purpose RoBERTa language model to a specific text classification task: Vietnamese Hate Speech Detection.
Our experiments show that the proposed pipeline boosts performance significantly, achieving a new state of the art on the Vietnamese Hate Speech Detection campaign with an F1 score of 0.7221.
arXiv Detail & Related papers (2021-02-24T09:30:55Z) - An Attention Ensemble Approach for Efficient Text Classification of
Indian Languages [0.0]
This paper focuses on the coarse-grained technical domain identification of short text documents in Marathi, a Devanagari script-based Indian language.
A hybrid CNN-BiLSTM attention ensemble model is proposed that competently combines the intermediate sentence representations generated by the convolutional neural network and the bidirectional long short-term memory, leading to efficient text classification.
Experimental results show that the proposed model outperforms various baseline machine learning and deep learning models on the given task, achieving the best validation accuracy of 89.57% and an F1-score of 0.8875.
arXiv Detail & Related papers (2021-02-20T07:31:38Z) - Coarse and Fine-Grained Hostility Detection in Hindi Posts using Fine
Tuned Multilingual Embeddings [4.3012765978447565]
The hostility detection task has been well explored for resource-rich languages like English, but remains unexplored for resource-constrained languages like Hindi due to the unavailability of large, suitable datasets.
We propose an effective neural network-based technique for hostility detection in Hindi posts.
arXiv Detail & Related papers (2021-01-13T11:00:31Z) - Detecting Hostile Posts using Relational Graph Convolutional Network [1.8734449181723827]
This work is based on our submission to the competition conducted at AAAI 2021 for the detection of hostile posts in Hindi on social media platforms.
Here, a model is presented for the classification of hostile posts using Relational Graph Convolutional Networks.
The proposed model is performing at par with Google's XLM-RoBERTa on the given dataset.
Among all submissions to the challenge, our classification system with XLM-RoBERTa secured 2nd rank on fine-grained classification.
arXiv Detail & Related papers (2021-01-10T06:50:22Z) - Exploring Text-transformers in AAAI 2021 Shared Task: COVID-19 Fake News
Detection in English [30.61407811064534]
We describe our system for the AAAI 2021 shared task of COVID-19 Fake News Detection in English.
We propose an ensemble of different pre-trained language models, including BERT, RoBERTa, ERNIE, etc.
We also conduct an extensive analysis of the samples that are not correctly classified.
arXiv Detail & Related papers (2021-01-07T04:01:13Z) - Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
arXiv Detail & Related papers (2020-10-15T18:34:13Z) - Kungfupanda at SemEval-2020 Task 12: BERT-Based Multi-Task Learning for
Offensive Language Detection [55.445023584632175]
We build an offensive language detection system, which combines multi-task learning with BERT-based models.
Our model achieves a 91.51% F1 score on English Sub-task A, which is comparable to the first-place system.
arXiv Detail & Related papers (2020-04-28T11:27:24Z)