Research on Named Entity Recognition in Improved transformer with R-Drop
structure
- URL: http://arxiv.org/abs/2306.08315v1
- Date: Wed, 14 Jun 2023 07:34:27 GMT
- Title: Research on Named Entity Recognition in Improved transformer with R-Drop
structure
- Authors: Weidong Ji, Yousheng Zhang, Guohui Zhou, Xu Wang
- Abstract summary: The XLNet-Transformer-R model is proposed in this paper.
The Transformer encoder with relative positional encodings are combined to enhance the model's ability to process long text.
To prevent overfitting, the R-Drop structure is used to improve the generalization capability.
- Score: 3.677017987610888
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To enhance the generalization ability of the model and improve the
effectiveness of the transformer for named entity recognition tasks, the
XLNet-Transformer-R model is proposed in this paper. The XLNet pre-trained
model and the Transformer encoder with relative positional encodings are
combined to enhance the model's ability to process long text and learn
contextual information to improve robustness. To prevent overfitting, the
R-Drop structure is used to improve the generalization capability and enhance
the accuracy of the model in named entity recognition tasks. The model in this
paper performs ablation experiments on the MSRA dataset and comparison
experiments with other models on four datasets with excellent performance,
demonstrating the strategic effectiveness of the XLNet-Transformer-R model.
Related papers
- Multi-Level Attention and Contrastive Learning for Enhanced Text Classification with an Optimized Transformer [0.0]
This paper studies a text classification algorithm based on an improved Transformer to improve the performance and efficiency of the model in text classification tasks.
The improved Transformer model outperforms the comparative models such as BiLSTM, CNN, standard Transformer, and BERT in terms of classification accuracy, F1 score, and recall rate.
arXiv Detail & Related papers (2025-01-23T08:32:27Z) - IG-CFAT: An Improved GAN-Based Framework for Effectively Exploiting Transformers in Real-World Image Super-Resolution [2.1561701531034414]
Recently, composite fusion attention transformer (CFAT) outperformed previous state-of-the-art (SOTA) models in classic image super-resolution.
In this paper, we propose a novel GAN-based framework by incorporating the CFAT model to effectively exploit the performance of transformers in real-world image super-resolution.
arXiv Detail & Related papers (2024-06-19T20:21:26Z) - TransAxx: Efficient Transformers with Approximate Computing [4.347898144642257]
Vision Transformer (ViT) models have shown to be very competitive and often become a popular alternative to Convolutional Neural Networks (CNNs)
We propose TransAxx, a framework based on the popular PyTorch library that enables fast inherent support for approximate arithmetic.
Our approach uses a Monte Carlo Tree Search (MCTS) algorithm to efficiently search the space of possible configurations.
arXiv Detail & Related papers (2024-02-12T10:16:05Z) - Transformer-based approaches to Sentiment Detection [55.41644538483948]
We examined the performance of four different types of state-of-the-art transformer models for text classification.
The RoBERTa transformer model performs best on the test dataset with a score of 82.6% and is highly recommended for quality predictions.
arXiv Detail & Related papers (2023-03-13T17:12:03Z) - Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z) - From Environmental Sound Representation to Robustness of 2D CNN Models
Against Adversarial Attacks [82.21746840893658]
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
arXiv Detail & Related papers (2022-04-14T15:14:08Z) - Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model [58.17021225930069]
We explain the rationality of Vision Transformer by analogy with the proven practical Evolutionary Algorithm (EA)
We propose a more efficient EAT model, and design task-related heads to deal with different tasks more flexibly.
Our approach achieves state-of-the-art results on the ImageNet classification task compared with recent vision transformer works.
arXiv Detail & Related papers (2021-05-31T16:20:03Z) - Cross-Modal Generative Augmentation for Visual Question Answering [34.9601948665926]
This paper introduces a generative model for data augmentation by leveraging the correlations among multiple modalities.
The proposed model is able to quantify the confidence of augmented data by its generative probability, and can be jointly updated with a downstream pipeline.
arXiv Detail & Related papers (2021-05-11T04:51:26Z) - Visformer: The Vision-friendly Transformer [105.52122194322592]
We propose a new architecture named Visformer, which is abbreviated from the Vision-friendly Transformer'
With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy.
arXiv Detail & Related papers (2021-04-26T13:13:03Z) - Transformer-based Conditional Variational Autoencoder for Controllable
Story Generation [39.577220559911055]
We investigate large-scale latent variable models (LVMs) for neural story generation with objectives in two threads: generation effectiveness and controllability.
We advocate to revive latent variable modeling, essentially the power of representation learning, in the era of Transformers.
Specifically, we integrate latent representation vectors with a Transformer-based pre-trained architecture to build conditional variational autoencoder (CVAE)
arXiv Detail & Related papers (2021-01-04T08:31:11Z) - From Sound Representation to Model Robustness [82.21746840893658]
We investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
Averaged over various experiments on three environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures.
arXiv Detail & Related papers (2020-07-27T17:30:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.