BERT or FastText? A Comparative Analysis of Contextual as well as Non-Contextual Embeddings
- URL: http://arxiv.org/abs/2411.17661v1
- Date: Tue, 26 Nov 2024 18:25:57 GMT
- Title: BERT or FastText? A Comparative Analysis of Contextual as well as Non-Contextual Embeddings
- Authors: Abhay Shanbhag, Suramya Jadhav, Amogh Thakurdesai, Ridhima Sinare, Raviraj Joshi
- Abstract summary: The choice of embeddings plays a critical role in enhancing the performance of NLP tasks.
In this study, we investigate the impact of various embedding techniques (Contextual BERT-based, Non-Contextual BERT-based, and FastText-based) on NLP classification tasks specific to the Marathi language.
- Score: 0.4194295877935868
- License:
- Abstract: Natural Language Processing (NLP) for low-resource languages presents significant challenges, particularly due to the scarcity of high-quality annotated data and linguistic resources. The choice of embeddings plays a critical role in enhancing the performance of NLP tasks, such as news classification, sentiment analysis, and hate speech detection, especially for low-resource languages like Marathi. In this study, we investigate the impact of various embedding techniques (Contextual BERT-based, Non-Contextual BERT-based, and FastText-based) on NLP classification tasks specific to the Marathi language. Our research includes a thorough evaluation of both compressed and uncompressed embeddings, providing a comprehensive overview of how these embeddings perform across different scenarios. Specifically, we compare two BERT model embeddings, MuRIL and MahaBERT, as well as two FastText model embeddings, IndicFT and MahaFT. Our evaluation includes applying embeddings to a Multiple Logistic Regression (MLR) classifier for task performance assessment, as well as t-SNE visualizations to observe the spatial distribution of these embeddings. The results demonstrate that contextual embeddings outperform non-contextual embeddings. Furthermore, BERT-based non-contextual embeddings extracted from the first BERT embedding layer yield better results than FastText-based embeddings, suggesting a potential alternative to FastText embeddings.
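The pipeline described in the abstract can be sketched in a few lines: build a sentence vector three ways (mean-pooled contextual BERT output, static token vectors from BERT's first embedding layer, and averaged FastText word vectors), train a multinomial logistic regression classifier on each, and project the vectors with t-SNE. This is a minimal sketch, not the authors' released code; the checkpoint names, the FastText model path, and the mean-pooling choices are illustrative assumptions since the abstract does not specify them.

```python
# Minimal sketch, assuming mean pooling for sentence vectors and illustrative model names/paths.
import numpy as np
import torch
import fasttext
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import TSNE

BERT_NAME = "l3cube-pune/marathi-bert-v2"  # assumed MahaBERT checkpoint; MuRIL would be "google/muril-base-cased"
tokenizer = AutoTokenizer.from_pretrained(BERT_NAME)
bert = AutoModel.from_pretrained(BERT_NAME).eval()

def contextual_embedding(text: str) -> np.ndarray:
    """Contextual embedding: mean-pool the last hidden layer of BERT."""
    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state        # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

def non_contextual_embedding(text: str) -> np.ndarray:
    """Non-contextual embedding: average token vectors from BERT's first
    (word-embedding) layer, i.e. a pure lookup with no attention applied."""
    ids = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)["input_ids"]
    with torch.no_grad():
        vecs = bert.get_input_embeddings()(ids)        # (1, seq_len, dim)
    return vecs.mean(dim=1).squeeze(0).numpy()

ft = fasttext.load_model("marathi_fasttext.bin")       # placeholder path for MahaFT or IndicFT
def fasttext_embedding(text: str) -> np.ndarray:
    """FastText embedding: average the word vectors of the sentence."""
    return np.mean([ft.get_word_vector(w) for w in text.split()], axis=0)

def evaluate(texts, labels, embed_fn):
    """Fit an MLR classifier on the chosen embedding and return a 2-D t-SNE projection."""
    X = np.stack([embed_fn(t) for t in texts])
    clf = LogisticRegression(max_iter=1000).fit(X, labels)
    print("train accuracy:", clf.score(X, labels))
    xy = TSNE(n_components=2, random_state=0).fit_transform(X)  # for scatter plots of class clusters
    return clf, xy
```

Running `evaluate` over the same texts with each of the three `embed_fn` choices reproduces the kind of comparison the abstract describes; a real experiment would add a held-out test split rather than reporting training accuracy.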
Related papers
- Chain-of-Translation Prompting (CoTR): A Novel Prompting Technique for Low Resource Languages [0.4499833362998489]
Chain of Translation Prompting (CoTR) is a novel strategy designed to enhance the performance of language models in low-resource languages.
CoTR restructures prompts to first translate the input context from a low-resource language into a higher-resource language, such as English, before the task is performed; a minimal sketch of this prompt structure appears after the related-papers list.
We demonstrate the effectiveness of this method through a case study on the low-resource Indic language Marathi.
arXiv Detail & Related papers (2024-09-06T17:15:17Z)
- Enhancing Depressive Post Detection in Bangla: A Comparative Study of TF-IDF, BERT and FastText Embeddings [0.0]
This study introduces a well-grounded approach to identify depressive social media posts in Bangla.
The dataset used in this work, annotated by domain experts, includes both depressive and non-depressive posts.
To address the issue of class imbalance, we utilised random oversampling for the minority class.
arXiv Detail & Related papers (2024-07-12T11:40:17Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source language PLM and the static word embeddings of a target language.
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- BERT for Sentiment Analysis: Pre-trained and Fine-Tuned Alternatives [0.0]
BERT has revolutionized the NLP field by enabling transfer learning with large language models.
This article studies how to better cope with the different embeddings provided by the BERT output layer and the use of language-specific rather than multilingual models.
arXiv Detail & Related papers (2022-01-10T15:05:05Z)
- Knowledge-Rich BERT Embeddings for Readability Assessment [0.0]
We propose an alternative way of utilizing the information-rich embeddings of BERT models through a joint-learning method.
Results show that the proposed method outperforms classical approaches in readability assessment using English and Filipino datasets.
arXiv Detail & Related papers (2021-06-15T07:37:48Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in the pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- Intrinsic Probing through Dimension Selection [69.52439198455438]
Most modern NLP systems make use of pre-trained contextual representations that attain astonishingly high performance on a variety of tasks.
Such high performance should not be possible unless some form of linguistic structure inheres in these representations, and a wealth of research has sprung up on probing for it.
In this paper, we draw a distinction between intrinsic probing, which examines how linguistic information is structured within a representation, and the extrinsic probing popular in prior work, which only argues for the presence of such information by showing that it can be successfully extracted.
arXiv Detail & Related papers (2020-10-06T15:21:08Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Exploring Cross-sentence Contexts for Named Entity Recognition with BERT [1.4998865865537996]
We present a study exploring the use of cross-sentence information for NER using BERT models in five languages.
We find that adding context in the form of additional sentences to BERT input increases NER performance on all of the tested languages and models.
We propose a straightforward method, Contextual Majority Voting (CMV), to combine different predictions for sentences and demonstrate this to further increase NER performance with BERT.
arXiv Detail & Related papers (2020-06-02T12:34:52Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
- Coreferential Reasoning Learning for Language Representation [88.14248323659267]
We present CorefBERT, a novel language representation model that can capture the coreferential relations in context.
The experimental results show that, compared with existing baseline models, CorefBERT can achieve significant improvements consistently on various downstream NLP tasks.
arXiv Detail & Related papers (2020-04-15T03:57:45Z)
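Following up the forward reference in the CoTR entry above, the sketch below illustrates the translate-then-classify prompt structure that CoTR describes: translate the low-resource input into English first, then perform the task on the translation. The task wording, label set, and the `call_llm` helper are illustrative assumptions, not the authors' exact prompt.

```python
def build_cotr_prompt(marathi_text: str, labels: list[str]) -> str:
    """Single prompt that asks the model to translate first, then classify."""
    return (
        "Step 1: Translate the following Marathi text into English.\n"
        f"Marathi text: {marathi_text}\n\n"
        "Step 2: Using only your English translation, classify the text into one of "
        f"these labels: {', '.join(labels)}.\n"
        "Answer with the English translation followed by the final label."
    )

# Usage, where call_llm is a hypothetical wrapper around whichever LLM API is available:
# response = call_llm(build_cotr_prompt(post, ["positive", "negative", "neutral"]))
```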