Predicting Issue Types with seBERT
- URL: http://arxiv.org/abs/2205.01335v1
- Date: Tue, 3 May 2022 06:47:13 GMT
- Title: Predicting Issue Types with seBERT
- Authors: Alexander Trautsch, Steffen Herbold
- Abstract summary: seBERT is a model that was developed based on the BERT architecture, but trained from scratch with software engineering data.
We fine-tuned this model for the NLBSE challenge for the task of issue type prediction.
Our model dominates the baseline fastText for all three issue types in both recall and precision to achieve an overall F1-score of 85.7%.
- Score: 85.74803351913695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained transformer models are the current state-of-the-art for natural
language processing. seBERT is such a model that was developed based on
the BERT architecture, but trained from scratch with software engineering data.
We fine-tuned this model for the NLBSE challenge for the task of issue type
prediction. Our model dominates the baseline fastText for all three issue types
in both recall and precision to achieve an overall F1-score of 85.7%, which is
an increase of 4.1% over the baseline.
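For readers who want to reproduce the general setup, the sketch below shows how a BERT-style checkpoint such as seBERT can be fine-tuned for three-way issue type classification with Hugging Face transformers. The checkpoint path, label mapping, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "path/to/seBERT"  # hypothetical local path to the released seBERT weights
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)

texts = ["App crashes when saving a file", "Please add dark mode"]
labels = torch.tensor([0, 1])  # assumed mapping: 0=bug, 1=enhancement, 2=question

batch = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
out = model(**batch, labels=labels)  # cross-entropy loss is computed internally
out.loss.backward()
optimizer.step()
```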
Related papers
- On Robustness of Finetuned Transformer-based NLP Models [11.063628128069736]
We characterize changes between pretrained and finetuned language model representations across layers using two metrics: CKA and STIR.
GPT-2 representations are more robust than BERT and T5 across multiple types of input perturbations.
This study provides valuable insights into perturbation-specific weaknesses of popular Transformer-based models.
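CKA is one of the two representation-similarity metrics the study relies on; a minimal implementation of linear CKA, assuming plain NumPy activation matrices, looks like this:

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear Centered Kernel Alignment between two representation matrices.

    X, Y: (n_examples, n_features) activations of the same inputs from two
    models or layers; the feature dimensions may differ."""
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))

# identical representations give CKA = 1.0
Z = np.random.randn(100, 64)
print(round(linear_cka(Z, Z), 4))
```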
arXiv Detail & Related papers (2023-05-23T18:25:18Z)
- Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers [98.30298332661323]
This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5.
We develop a new model, METRO-T0, which is pretrained using the redesigned ELECTRA-Style pretraining strategies and then prompt-finetuned on a mixture of NLP tasks.
Our analysis on model's neural activation and parameter sensitivity reveals that the effectiveness of METRO-T0 stems from more balanced contribution of parameters and better utilization of their capacity.
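A toy sketch of the ELECTRA-style replaced-token-detection objective behind these model-generated signals: a small generator fills in masked tokens, and the main model learns to detect which tokens were replaced. The module sizes and loss weight are illustrative, and METRO-T0 applies the idea to a T5 encoder-decoder rather than the toy modules used here.

```python
import torch
import torch.nn.functional as F

vocab, hidden, B, L = 100, 32, 4, 16
gen = torch.nn.Sequential(torch.nn.Embedding(vocab, hidden), torch.nn.Linear(hidden, vocab))
disc_emb = torch.nn.Embedding(vocab, hidden)
disc_head = torch.nn.Linear(hidden, 1)

tokens = torch.randint(0, vocab, (B, L))
mask = torch.rand(B, L) < 0.15
masked = tokens.clone(); masked[mask] = 0          # 0 stands in for [MASK]

gen_logits = gen(masked)                           # generator predicts the masked tokens
gen_loss = F.cross_entropy(gen_logits[mask], tokens[mask])

with torch.no_grad():                              # sample replacements from the generator
    sampled = torch.distributions.Categorical(logits=gen_logits[mask]).sample()
corrupted = tokens.clone(); corrupted[mask] = sampled

is_replaced = (corrupted != tokens).float()        # replaced-token-detection labels
disc_logits = disc_head(disc_emb(corrupted)).squeeze(-1)
disc_loss = F.binary_cross_entropy_with_logits(disc_logits, is_replaced)
loss = gen_loss + 50.0 * disc_loss                 # RTD loss weighted heavily, as in ELECTRA
```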
arXiv Detail & Related papers (2023-05-21T21:06:23Z)
- Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z)
- DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing [117.41016786835452]
This paper presents a new pre-trained language model, DeBERTaV3, which improves the original DeBERTa model.
The authors show that vanilla embedding sharing in ELECTRA hurts training efficiency and model performance.
We propose a new gradient-disentangled embedding sharing method that avoids the tug-of-war dynamics.
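A minimal sketch of the gradient-disentangled sharing idea, assuming a PyTorch setup: the discriminator reuses the generator's embedding table through a stop-gradient plus a residual delta, so discriminator gradients never pull on the generator's embeddings.

```python
import torch
import torch.nn as nn

class GDESEmbedding(nn.Module):
    """Simplified gradient-disentangled embedding sharing."""
    def __init__(self, vocab_size: int, hidden_size: int):
        super().__init__()
        self.gen_embed = nn.Embedding(vocab_size, hidden_size)            # updated by the generator (MLM) loss
        self.delta = nn.Parameter(torch.zeros(vocab_size, hidden_size))   # updated by the discriminator (RTD) loss

    def generator_embed(self, ids: torch.LongTensor) -> torch.Tensor:
        return self.gen_embed(ids)

    def discriminator_embed(self, ids: torch.LongTensor) -> torch.Tensor:
        # stop-gradient on the shared table: only `delta` receives discriminator gradients
        table = self.gen_embed.weight.detach() + self.delta
        return table[ids]
```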
arXiv Detail & Related papers (2021-11-18T06:48:00Z)
- bert2BERT: Towards Reusable Pretrained Language Models [51.078081486422896]
We propose bert2BERT, which can effectively transfer the knowledge of an existing smaller pre-trained model to a large model.
bert2BERT saves about 45% and 47% computational cost of pre-training BERT_BASE and GPT_BASE by reusing the models of almost their half sizes.
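The core trick is function-preserving parameter expansion. The NumPy sketch below expands a single linear layer in the Net2Net/FPI style; the mapping of new units to old ones is an illustrative assumption, and bert2BERT adds further heuristics (such as advanced knowledge initialization) on top of this.

```python
import numpy as np

def expand_linear_fpi(W, b, new_in, new_out, rng):
    """Function-preserving expansion of y = W x + b from (out, in) to (new_out, new_in),
    assuming new input units duplicate existing ones via the returned index map."""
    out_dim, in_dim = W.shape
    in_map = np.concatenate([np.arange(in_dim), rng.integers(0, in_dim, new_in - in_dim)])
    out_map = np.concatenate([np.arange(out_dim), rng.integers(0, out_dim, new_out - out_dim)])
    counts = np.bincount(in_map, minlength=in_dim)   # how often each old input is replicated
    W_new = W[:, in_map] / counts[in_map]            # split weight mass so W_new @ x_dup == W @ x
    W_new = W_new[out_map, :]                        # copy rows for the new output units
    return W_new, b[out_map], in_map, out_map

rng = np.random.default_rng(0)
W, b, x = rng.standard_normal((4, 3)), rng.standard_normal(4), rng.standard_normal(3)
W2, b2, in_map, _ = expand_linear_fpi(W, b, new_in=5, new_out=6, rng=rng)
x_dup = x[in_map]                                    # expanded input duplicates x
print(np.allclose((W2 @ x_dup + b2)[:4], W @ x + b)) # True: function preserved
```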
arXiv Detail & Related papers (2021-10-14T04:05:25Z)
- A Comparative Study of Transformer-Based Language Models on Extractive Question Answering [0.5079811885340514]
We train various pre-trained language models and fine-tune them on multiple question answering datasets.
Using the F1-score as our metric, we find that the RoBERTa and BART pre-trained models perform the best across all datasets.
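The F1-score referred to here is the standard SQuAD-style token-overlap metric for extractive answers; a minimal implementation:

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer span."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("in the garden", "the garden"))  # 0.8
```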
arXiv Detail & Related papers (2021-10-07T02:23:19Z)
- syrapropa at SemEval-2020 Task 11: BERT-based Models Design For Propagandistic Technique and Span Detection [2.0051855303186046]
We first build the model for Span Identification (SI) based on SpanBERT, and facilitate the detection by a deeper model and a sentence-level representation.
We then develop a hybrid model for the Technique Classification (TC) task.
The hybrid model is composed of three submodels including two BERT models with different training methods, and a feature-based Logistic Regression model.
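One way to realize such a hybrid, sketched below with placeholder inputs: the two fine-tuned BERT classifiers and the feature-based logistic regression each produce class probabilities, which are then fused (here by simple averaging over the 14 technique classes; the paper's exact combination scheme may differ).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_train, n_test, n_classes, n_feats = 200, 10, 14, 6

# Feature-based submodel: logistic regression over hand-crafted features
# (random placeholders here; e.g. lexical and punctuation statistics).
train_feats = rng.standard_normal((n_train, n_feats))
train_labels = rng.integers(0, n_classes, size=n_train)
lr = LogisticRegression(max_iter=1000).fit(train_feats, train_labels)

# Stand-ins for the class probabilities the two BERT submodels would produce.
probs_bert_a = rng.dirichlet(np.ones(n_classes), size=n_test)
probs_bert_b = rng.dirichlet(np.ones(n_classes), size=n_test)
probs_lr = lr.predict_proba(rng.standard_normal((n_test, n_feats)))

# Fuse the three submodels by averaging their probabilities.
combined = (probs_bert_a + probs_bert_b + probs_lr) / 3.0
print(combined.argmax(axis=1))
```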
arXiv Detail & Related papers (2020-08-24T02:15:29Z)
- DeBERTa: Decoding-enhanced BERT with Disentangled Attention [119.77305080520718]
We propose a new model architecture DeBERTa that improves the BERT and RoBERTa models using two novel techniques.
We show that these techniques significantly improve the efficiency of model pre-training and the performance of both natural language understanding (NLU) and natural language generation (NLG) downstream tasks.
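A simplified single-head sketch of the disentangled attention score, which sums content-to-content, content-to-position, and position-to-content terms over clipped relative distances; the dimensions and the relative-distance window below are illustrative.

```python
import numpy as np

def disentangled_scores(qc, kc, qr, kr, k=4):
    """qc, kc: content query/key vectors, shape (L, d).
    qr, kr: relative-position query/key vectors, shape (2k, d), indexed by the
    clipped relative distance. Returns unnormalized attention scores (L, L)."""
    L, d = qc.shape
    scores = np.zeros((L, L))
    for i in range(L):
        for j in range(L):
            r_ij = int(np.clip(i - j, -k, k - 1)) + k   # relative-distance bucket of j w.r.t. i
            r_ji = int(np.clip(j - i, -k, k - 1)) + k
            scores[i, j] = (qc[i] @ kc[j]        # content-to-content
                            + qc[i] @ kr[r_ij]   # content-to-position
                            + kc[j] @ qr[r_ji])  # position-to-content
    return scores / np.sqrt(3 * d)               # DeBERTa scales by sqrt(3d)

L, d, k = 6, 8, 4
rng = np.random.default_rng(0)
qc, kc = rng.standard_normal((L, d)), rng.standard_normal((L, d))
qr, kr = rng.standard_normal((2 * k, d)), rng.standard_normal((2 * k, d))
print(disentangled_scores(qc, kc, qr, kr, k).shape)  # (6, 6)
```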
arXiv Detail & Related papers (2020-06-05T19:54:34Z)
- Data Augmentation using Pre-trained Transformer Models [2.105564340986074]
We study different types of transformer based pre-trained models such as auto-regressive models (GPT-2), auto-encoder models (BERT), and seq2seq models (BART) for conditional data augmentation.
We show that prepending the class labels to text sequences provides a simple yet effective way to condition the pre-trained models for data augmentation.
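A minimal sketch of the label-prepending idea: after fine-tuning a generator on lines formatted as "<label> SEP <text>", new examples for a class are obtained by generating from a label prefix. The base gpt2 checkpoint and the SEP convention below are placeholders for the fine-tuned setup.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")  # placeholder; use the fine-tuned checkpoint

prompt = "positive SEP"  # class label prepended to condition the generation
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```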
arXiv Detail & Related papers (2020-03-04T18:35:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.