Pruning for Performance: Efficient Idiom and Metaphor Classification in Low-Resource Konkani Using mBERT
- URL: http://arxiv.org/abs/2506.02005v2
- Date: Mon, 28 Jul 2025 03:40:35 GMT
- Title: Pruning for Performance: Efficient Idiom and Metaphor Classification in Low-Resource Konkani Using mBERT
- Authors: Timothy Do, Pranav Saran, Harshita Poojary, Pranav Prabhu, Sean O'Brien, Vasu Sharma, Kevin Zhu
- Abstract summary: Figurative language expressions pose persistent challenges for natural language processing systems. We present a hybrid model that integrates a pre-trained Multilingual BERT with a bidirectional LSTM and a linear classifier.
- Score: 3.790013563494571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we address the persistent challenges that figurative language expressions pose for natural language processing (NLP) systems, particularly in low-resource languages such as Konkani. We present a hybrid model that integrates a pre-trained Multilingual BERT (mBERT) with a bidirectional LSTM and a linear classifier. This architecture is fine-tuned on a newly introduced annotated dataset for metaphor classification, developed as part of this work. To improve the model's efficiency, we implement a gradient-based attention head pruning strategy. For metaphor classification, the pruned model achieves an accuracy of 78%. We also applied our pruning approach to expand on an existing idiom classification task, achieving 83% accuracy. These results demonstrate the effectiveness of attention head pruning for building efficient NLP tools in underrepresented languages.
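The abstract describes the architecture and pruning strategy but no implementation is given here. The following is a minimal sketch, assuming a standard PyTorch/Transformers setup, of how an mBERT encoder feeding a bidirectional LSTM and a linear classifier could be wired up, together with a gradient-based head-importance score of the kind commonly used for attention head pruning. All names (MBertBiLstmClassifier, head_importance) and hyperparameters are illustrative assumptions, not the authors' code.

```python
# Minimal sketch, not the authors' implementation: mBERT -> BiLSTM -> linear head,
# plus a gradient-based attention-head importance score (heads with small scores
# are candidates for pruning). Hyperparameters are placeholders.
import torch
import torch.nn as nn
from transformers import AutoModel


class MBertBiLstmClassifier(nn.Module):
    def __init__(self, model_name="bert-base-multilingual-cased",
                 lstm_hidden=256, num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.lstm = nn.LSTM(self.encoder.config.hidden_size, lstm_hidden,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        lstm_out, _ = self.lstm(hidden)          # (batch, seq_len, 2 * lstm_hidden)
        return self.classifier(lstm_out[:, 0])   # classify from the [CLS] position


def head_importance(model, batch, loss_fn):
    """L1 norm of the loss gradient w.r.t. a per-head mask, one score per
    (layer, head); a common proxy for how much each attention head matters."""
    cfg = model.encoder.config
    head_mask = torch.ones(cfg.num_hidden_layers, cfg.num_attention_heads,
                           requires_grad=True)
    hidden = model.encoder(input_ids=batch["input_ids"],
                           attention_mask=batch["attention_mask"],
                           head_mask=head_mask).last_hidden_state
    lstm_out, _ = model.lstm(hidden)
    loss = loss_fn(model.classifier(lstm_out[:, 0]), batch["labels"])
    loss.backward()
    return head_mask.grad.abs()                  # shape: (num_layers, num_heads)
```

Low-scoring heads could then be removed, for example with the Transformers library's prune_heads utility; whether this matches the paper's exact pruning criterion is an assumption.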
Related papers
- MAGE: Multi-Head Attention Guided Embeddings for Low Resource Sentiment Classification [0.19381162067627603]
We introduce an advanced model combining Language-Independent Data Augmentation (LiDA) with Multi-Head Attention based weighted embeddings. This approach not only addresses the data scarcity issue but also sets a foundation for future research in low-resource language processing and classification tasks.
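As a rough illustration of attention-weighted embeddings (the LiDA augmentation step is not shown), a learned multi-head attention pooling over token embeddings might look like the sketch below; the module name, dimensions, and pooling query are assumptions rather than details from the MAGE paper.

```python
# Illustrative only: multi-head attention pooling over token embeddings to obtain
# a weighted sentence embedding; not the MAGE implementation.
import torch
import torch.nn as nn


class AttentionPooling(nn.Module):
    def __init__(self, dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 1, dim))  # learned pooling query

    def forward(self, token_embeddings, padding_mask=None):
        # token_embeddings: (batch, seq, dim); padding_mask: (batch, seq), True = pad
        q = self.query.expand(token_embeddings.size(0), -1, -1)
        pooled, _ = self.attn(q, token_embeddings, token_embeddings,
                              key_padding_mask=padding_mask)
        return pooled.squeeze(1)                 # (batch, dim) sentence embedding
```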
arXiv Detail & Related papers (2025-02-25T08:53:27Z)
- How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario [72.02391485962127]
Speech Self-Supervised Learning (SSL) models achieve impressive performance on Automatic Speech Recognition (ASR). In low-resource language ASR, however, they encounter a domain mismatch between the pre-trained and low-resource languages. We extend a conventional adapter-based efficient fine-tuning scheme to handle these issues.
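Adapter-based efficient fine-tuning usually means inserting small bottleneck layers into an otherwise frozen encoder; the generic sketch below shows the idea, but the paper's actual adapter placement and sizes are not specified here.

```python
# Generic bottleneck adapter (down-project, non-linearity, up-project, residual);
# a common efficient fine-tuning building block, not the paper's exact module.
import torch.nn as nn


class Adapter(nn.Module):
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, hidden_states):
        # Only the adapter parameters are trained; the backbone stays frozen.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```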
arXiv Detail & Related papers (2024-11-27T10:51:00Z)
- Evaluating Large Language Models Using Contrast Sets: An Experimental Approach [0.0]
We introduce an innovative technique for generating a contrast set for the Stanford Natural Language Inference dataset.
Our strategy involves the automated substitution of verbs, adverbs, and adjectives with their synonyms to preserve the original meaning of sentences.
This method aims to assess whether a model's performance is based on genuine language comprehension or simply on pattern recognition.
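A rough sketch of the synonym-substitution idea, using WordNet via NLTK: words that WordNet lists as verbs, adjectives, or adverbs are swapped for a synonym. The function below is illustrative only; the paper's actual POS tagging, sense filtering, and quality checks are not reproduced.

```python
# Rough sketch of contrast-set generation by swapping words that WordNet lists as
# verbs, adjectives, or adverbs for one of their synonyms. Purely illustrative.
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)

CONTENT_POS = (wn.VERB, wn.ADJ, wn.ADV)


def synonym_substitute(sentence):
    out = []
    for word in sentence.split():
        candidates = {lemma.name().replace("_", " ")
                      for pos in CONTENT_POS
                      for synset in wn.synsets(word.lower(), pos=pos)
                      for lemma in synset.lemmas()}
        candidates.discard(word.lower())
        # Deterministic pick just for the example; a real pipeline would filter
        # candidates for word sense and fluency.
        out.append(sorted(candidates)[0] if candidates else word)
    return " ".join(out)


print(synonym_substitute("The tired dog slowly chased the red ball"))
```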
arXiv Detail & Related papers (2024-04-02T02:03:28Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source language PLM and the static word embeddings of a target language.
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
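Relative representations generally re-describe each embedding by its similarity to a shared set of anchor points, which is one way source-language PLM embeddings and target-language static embeddings can be brought into a comparable space. The sketch below is a generic illustration with made-up dimensions and random data, not MoSECroT's actual pipeline.

```python
# Sketch of relative representations: describe every embedding by its cosine
# similarity to a shared set of anchors, so embeddings of different dimensionality
# land in a comparable space. Dimensions and data below are placeholders.
import numpy as np


def relative_representation(embeddings, anchors):
    """embeddings: (n, d); anchors: (k, d) -> (n, k) cosine-similarity matrix."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return e @ a.T


rng = np.random.default_rng(0)
src = rng.normal(size=(100, 768))   # e.g. source-language PLM embeddings
tgt = rng.normal(size=(100, 300))   # e.g. target-language static embeddings
# Anchors are parallel word pairs: row i of each anchor set refers to the same concept.
src_rel = relative_representation(src, src[:10])
tgt_rel = relative_representation(tgt, tgt[:10])
print(src_rel.shape, tgt_rel.shape)  # both (100, 10): a common space
```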
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- Strategies for improving low resource speech to text translation relying on pre-trained ASR models [59.90106959717875]
This paper presents techniques and findings for improving the performance of low-resource speech-to-text translation (ST).
We conducted experiments on both simulated and real low-resource setups, on the language pairs English-Portuguese and Tamasheq-French, respectively.
arXiv Detail & Related papers (2023-05-31T21:58:07Z)
- Attention is Not Always What You Need: Towards Efficient Classification of Domain-Specific Text [1.1508304497344637]
For large-scale IT corpora with hundreds of classes organized in a hierarchy, accurately classifying documents at the higher levels of the hierarchy is crucial.
In the business world, an efficient and explainable ML model is preferred over an expensive black-box model, especially if the performance increase is marginal.
Despite the widespread use of PLMs, there is a lack of clear and well-justified reasoning as to why these models are being employed for domain-specific text classification.
arXiv Detail & Related papers (2023-03-31T03:17:23Z)
- A Unified Neural Network Model for Readability Assessment with Feature Projection and Length-Balanced Loss [17.213602354715956]
We propose a BERT-based model with feature projection and length-balanced loss for readability assessment.
Our model achieves state-of-the-art performance on two English benchmark datasets and one dataset of Chinese textbooks.
arXiv Detail & Related papers (2022-10-19T05:33:27Z)
- KinyaBERT: a Morphology-aware Kinyarwanda Language Model [1.2183405753834562]
Unsupervised sub-word tokenization methods are sub-optimal at handling morphologically rich languages.
We propose a simple yet effective two-tier BERT architecture that leverages a morphological analyzer and explicitly represents morphological compositionality.
We evaluate our proposed method on the low-resource morphologically rich Kinyarwanda language, naming the proposed model architecture KinyaBERT.
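The two-tier idea can be pictured as a small morpheme-level encoder whose pooled outputs per word feed a sentence-level encoder. The sketch below is a generic illustration with arbitrary sizes and mean pooling; it is not KinyaBERT's actual architecture.

```python
# Very rough two-tier encoder sketch: a morpheme-level encoder is pooled into one
# vector per word, and those word vectors feed a sentence-level encoder.
# Generic illustration only; sizes and pooling are placeholders.
import torch
import torch.nn as nn


class TwoTierEncoder(nn.Module):
    def __init__(self, vocab_size=10000, dim=256, heads=4):
        super().__init__()
        self.morpheme_emb = nn.Embedding(vocab_size, dim)
        self.morpheme_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), num_layers=2)
        self.word_enc = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, heads, batch_first=True), num_layers=4)

    def forward(self, morpheme_ids):
        # morpheme_ids: (batch, words, morphemes_per_word), from a morphological analyzer
        b, w, m = morpheme_ids.shape
        x = self.morpheme_emb(morpheme_ids).view(b * w, m, -1)
        word_vecs = self.morpheme_enc(x).mean(dim=1).view(b, w, -1)  # pool per word
        return self.word_enc(word_vecs)          # contextual word representations


enc = TwoTierEncoder()
ids = torch.randint(0, 10000, (2, 5, 4))         # 2 sentences, 5 words, 4 morphemes each
print(enc(ids).shape)                            # torch.Size([2, 5, 256])
```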
arXiv Detail & Related papers (2022-03-16T08:36:14Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
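Self-training generally alternates supervised training with pseudo-labeling of unlabeled data above a confidence threshold. The loop below sketches that generic recipe; SFLM's prompt-based specifics are not shown, and the data-loader shapes are assumptions.

```python
# Generic self-training loop sketch (not SFLM itself): train on labeled data, then
# pseudo-label unlabeled examples above a confidence threshold and retrain.
import torch
import torch.nn.functional as F


def self_train(model, labeled_loader, unlabeled_loader, optimizer,
               rounds=3, threshold=0.9):
    for _ in range(rounds):
        model.train()
        for x, y in labeled_loader:                      # supervised step
            optimizer.zero_grad()
            F.cross_entropy(model(x), y).backward()
            optimizer.step()

        pseudo = []
        model.eval()
        with torch.no_grad():                            # pseudo-labeling step
            for x in unlabeled_loader:                   # assumes each batch is one tensor
                probs = F.softmax(model(x), dim=-1)
                conf, pred = probs.max(dim=-1)
                keep = conf > threshold                  # keep only confident predictions
                pseudo.append((x[keep], pred[keep]))

        model.train()
        for x, y in pseudo:                              # train on pseudo-labels
            if len(x) == 0:
                continue
            optimizer.zero_grad()
            F.cross_entropy(model(x), y).backward()
            optimizer.step()
```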
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
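The sentence-unshuffling objective can be illustrated by a tiny data-preparation helper that shuffles a document's sentences and records the permutation the model must recover; this is only a sketch of the objective, not the SLM implementation.

```python
# Sketch of a sentence-unshuffling training example: shuffle a document's sentences
# and ask the model to recover the original order. Purely illustrative.
import random


def make_unshuffling_example(sentences, seed=0):
    order = list(range(len(sentences)))
    random.Random(seed).shuffle(order)
    shuffled = [sentences[i] for i in order]
    # target[j] = position in `shuffled` of the sentence that comes j-th in the original
    target = [order.index(j) for j in range(len(sentences))]
    return shuffled, target


doc = ["Konkani is spoken on India's west coast.",
       "It has rich idiomatic expressions.",
       "Figurative language is hard for NLP systems."]
print(make_unshuffling_example(doc))
```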
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing the challenges of low-resource NER.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that use of NS annotators produces results that are consistently on par or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)