Enhancing Grammatical Error Detection using BERT with Cleaned Lang-8 Dataset
- URL: http://arxiv.org/abs/2411.15523v1
- Date: Sat, 23 Nov 2024 10:57:41 GMT
- Title: Enhancing Grammatical Error Detection using BERT with Cleaned Lang-8 Dataset
- Authors: Rahul Nihalani, Kushal Shah
- Abstract summary: This paper presents an improved LLM-based model for Grammatical Error Detection (GED).
The traditional approach to GED relied on hand-designed features, but Neural Networks (NN) have recently automated the discovery of these features.
The BERT-base-uncased model gave an impressive performance, with an F1 score of 0.91 and an accuracy of 98.49% on training data.
- Abstract: This paper presents an improved LLM-based model for Grammatical Error Detection (GED), a challenging and important problem for many applications. The traditional approach to GED relied on hand-designed features, but recently Neural Networks (NN) have automated the discovery of these features, improving performance in GED. Traditional rule-based systems achieve an F1 score of 0.50-0.60, and earlier machine learning models, including decision trees and simple neural networks, give an F1 score of 0.65-0.75. Previous deep learning models, for example Bi-LSTM, have reported F1 scores in the range of 0.80 to 0.90. In our study, we fine-tuned various transformer models on the Lang-8 dataset, which we rigorously cleaned. In our experiments, the BERT-base-uncased model gave an impressive performance, with an F1 score of 0.91 and an accuracy of 98.49% on training data and 90.53% on testing data, showcasing the importance of data cleaning. Increasing model size with BERT-large-uncased or RoBERTa-large gave no noticeable improvement for this task, underscoring that larger models are not always better. Our results clearly show how far rigorous data cleaning and simple transformer-based models can go toward significantly improving the quality of GED.
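The abstract quotes both F1 and raw accuracy, which can differ sharply on imbalanced GED data (most tokens or sentences are error-free). A minimal sketch of how these metrics relate, computed from confusion-matrix counts; the counts below are hypothetical, chosen only to illustrate how a high accuracy can accompany a lower F1, and are not taken from the paper:

```python
def ged_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy for binary GED
    from confusion-matrix counts (error = positive class)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy

# Hypothetical counts for illustration only (not from the paper):
p, r, f1, acc = ged_metrics(tp=900, fp=90, fn=85, tn=8925)
print(f"F1={f1:.2f}  accuracy={acc:.2%}")  # F1=0.91  accuracy=98.25%
```

Because correct negatives (`tn`) dominate, accuracy sits far above F1; this is why the paper's F1 of 0.91 is the more informative figure for GED.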
Related papers
- Efficient Auto-Labeling of Large-Scale Poultry Datasets (ALPD) Using Semi-Supervised Models, Active Learning, and Prompt-then-Detect Approach
The rapid growth of AI in poultry farming has highlighted the challenge of efficiently labeling large, diverse datasets.
This study explores semi-supervised auto-labeling methods, integrating active learning and a prompt-then-detect paradigm.
arXiv Detail & Related papers (2025-01-18T16:20:04Z) - Comparison of Machine Learning Approaches for Classifying Spinodal Events
We evaluate state-of-the-art models (MobileViT, NAT, EfficientNet, CNN) alongside several ensemble models (majority voting, AdaBoost)
Our findings show that NAT and MobileViT outperform other models, achieving the highest accuracy, AUC, and F1 scores on both training and testing data.
arXiv Detail & Related papers (2024-10-13T07:27:00Z) - A Comparative Study of Hybrid Models in Health Misinformation Text Classification
This study evaluates the effectiveness of machine learning (ML) and deep learning (DL) models in detecting COVID-19-related misinformation on online social networks (OSNs).
Our study concludes that DL and hybrid DL models are more effective than conventional ML algorithms for detecting COVID-19 misinformation on OSNs.
arXiv Detail & Related papers (2024-10-08T19:43:37Z) - Learning from Negative Samples in Generative Biomedical Entity Linking
We introduce ANGEL, the first framework that trains generative BioEL models using negative samples.
Our models fine-tuned with ANGEL outperform the previous best baseline models by up to 1.4% in average top-1 accuracy on five benchmarks.
arXiv Detail & Related papers (2024-08-29T12:44:01Z) - Convolutional Neural Networks for the classification of glitches in gravitational-wave data streams
We classify transient noise signals (i.e., glitches) and gravitational waves in data from the Advanced LIGO detectors.
We use models with a supervised learning approach, trained from scratch on the Gravity Spy dataset.
We also explore a self-supervised approach, pre-training models with automatically generated pseudo-labels.
arXiv Detail & Related papers (2023-03-24T11:12:37Z) - Exploring the Value of Pre-trained Language Models for Clinical Named Entity Recognition
We compare Transformer models that are trained from scratch to fine-tuned BERT-based LLMs.
We examine the impact of an additional CRF layer on such models to encourage contextual learning.
arXiv Detail & Related papers (2022-10-23T16:27:31Z) - Hyperparameter-free Continuous Learning for Domain Classification in Natural Language Understanding
Domain classification is a fundamental task in natural language understanding (NLU).
Most existing continual learning approaches suffer from low accuracy and performance fluctuation.
We propose a hyperparameter-free continual learning model for text data that can stably produce high performance under various environments.
arXiv Detail & Related papers (2022-01-05T02:46:16Z) - DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
This paper presents a new pre-trained language model, DeBERTaV3, which improves the original DeBERTa model.
We find that vanilla embedding sharing in ELECTRA hurts training efficiency and model performance.
We propose a new gradient-disentangled embedding sharing method that avoids the tug-of-war dynamics.
arXiv Detail & Related papers (2021-11-18T06:48:00Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - DeBERTa: Decoding-enhanced BERT with Disentangled Attention
We propose a new model architecture DeBERTa that improves the BERT and RoBERTa models using two novel techniques.
We show that these techniques significantly improve the efficiency of model pre-training and the performance of both natural language understanding (NLU) and natural language generation (NLG) downstream tasks.
arXiv Detail & Related papers (2020-06-05T19:54:34Z) - TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task
TACRED is one of the largest, most widely used crowdsourced datasets in Relation Extraction (RE).
In this paper, we investigate the questions: Have we reached a performance ceiling or is there still room for improvement?
We find that label errors account for 8% absolute F1 test error, and that more than 50% of the examples need to be relabeled.
arXiv Detail & Related papers (2020-04-30T15:07:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.