ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition
- URL: http://arxiv.org/abs/2009.01041v1
- Date: Wed, 2 Sep 2020 13:15:25 GMT
- Title: ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition
- Authors: Jiuniu Wang, Wenjia Xu, Xingyu Fu, Guangluan Xu, Yirong Wu
- Abstract summary: We propose an Adversarial Trained LSTM-CNN (ASTRAL) system to improve the current NER method from both the model structure and the training process.
Our system is evaluated on three benchmarks, CoNLL-03, OntoNotes 5.0, and WNUT-17, achieving state-of-the-art results.
- Score: 16.43239147870092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Named Entity Recognition (NER) is a challenging task that extracts named
entities from unstructured text data, including news, articles, social
comments, etc. The NER system has been studied for decades. Recently, the
development of Deep Neural Networks and the progress of pre-trained word
embedding have become a driving force for NER. Under such circumstances, how to
make full use of the information extracted by word embedding requires more
in-depth research. In this paper, we propose an Adversarial Trained LSTM-CNN
(ASTRAL) system to improve the current NER method from both the model structure
and the training process. In order to make use of the spatial information
between adjacent words, Gated-CNN is introduced to fuse the information of
adjacent words. Besides, a specific Adversarial training method is proposed to
deal with the overfitting problem in NER. We add perturbation to variables in
the network during the training process, making the variables more diverse,
improving the generalization and robustness of the model. Our model is
evaluated on three benchmarks, CoNLL-03, OntoNotes 5.0, and WNUT-17, achieving
state-of-the-art results. Ablation study and case study also show that our
system can converge faster and is less prone to overfitting.
Related papers
- Supervised Gradual Machine Learning for Aspect Category Detection [0.9857683394266679]
Aspect Category Detection (ACD) aims to identify implicit and explicit aspects in a given review sentence.
We propose a novel approach to tackle the ACD task by combining Deep Neural Networks (DNNs) with Gradual Machine Learning (GML) in a supervised setting.
arXiv Detail & Related papers (2024-04-08T07:21:46Z) - In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER.
We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever.
In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z) - Convolutional Dictionary Learning by End-To-End Training of Iterative
Neural Networks [3.6280929178575994]
In this work, we construct an INN which can be used as a supervised and physics-informed online convolutional dictionary learning algorithm.
We show that the proposed INN improves over two conventional model-agnostic training methods and yields competitive results also compared to a deep INN.
arXiv Detail & Related papers (2022-06-09T12:15:38Z) - Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models the full nested NEs in a sentence as a holistic structure, then we propose a holistic structure parsing algorithm to disclose the entire NEs once for all.
Experiments show that our model yields promising results on widely-used benchmarks which approach or even achieve state-of-the-art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z) - Distantly-Supervised Named Entity Recognition with Noise-Robust Learning
and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z) - Empirical Study of Named Entity Recognition Performance Using
Distribution-aware Word Embedding [15.955385058787348]
We develop a distribution-aware word embedding and implement three different methods to make use of the distribution information in a NER framework.
The performance of NER will be improved if the word specificity is incorporated into existing NER methods.
arXiv Detail & Related papers (2021-09-03T17:28:04Z) - Self-Supervised Learning of Event-Based Optical Flow with Spiking Neural
Networks [3.7384509727711923]
A major challenge for neuromorphic computing is that learning algorithms for traditional artificial neural networks (ANNs) do not transfer directly to spiking neural networks (SNNs)
In this article, we focus on the self-supervised learning problem of optical flow estimation from event-based camera inputs.
We show that the performance of the proposed ANNs and SNNs are on par with that of the current state-of-the-art ANNs trained in a self-supervised manner.
arXiv Detail & Related papers (2021-06-03T14:03:41Z) - A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situationnal Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z) - Deep Time Delay Neural Network for Speech Enhancement with Full Data
Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z) - Progressive Tandem Learning for Pattern Recognition with Deep Spiking
Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z) - Exploring Pre-training with Alignments for RNN Transducer based
End-to-End Speech Recognition [39.497407288772386]
recurrent neural network transducer (RNN-T) architecture has become an emerging trend in end-to-end automatic speech recognition research.
In this work, we leverage external alignments to seed the RNN-T model.
Two different pre-training solutions are explored, referred to as encoder pre-training, and whole-network pre-training respectively.
arXiv Detail & Related papers (2020-05-01T19:00:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.