Related papers: ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition

URL: http://arxiv.org/abs/2009.01041v1
Date: Wed, 2 Sep 2020 13:15:25 GMT
Title: ASTRAL: Adversarial Trained LSTM-CNN for Named Entity Recognition
Authors: Jiuniu Wang, Wenjia Xu, Xingyu Fu, Guangluan Xu, Yirong Wu
Abstract summary: We propose an Adversarial Trained LSTM-CNN (ASTRAL) system to improve the current NER method from both the model structure and the training process. Our system is evaluated on three benchmarks, CoNLL-03, OntoNotes 5.0, and WNUT-17, achieving state-of-the-art results.
Score: 16.43239147870092
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Named Entity Recognition (NER) is a challenging task that extracts named entities from unstructured text data, including news, articles, social comments, etc. The NER system has been studied for decades. Recently, the development of Deep Neural Networks and the progress of pre-trained word embedding have become a driving force for NER. Under such circumstances, how to make full use of the information extracted by word embedding requires more in-depth research. In this paper, we propose an Adversarial Trained LSTM-CNN (ASTRAL) system to improve the current NER method from both the model structure and the training process. In order to make use of the spatial information between adjacent words, Gated-CNN is introduced to fuse the information of adjacent words. Besides, a specific Adversarial training method is proposed to deal with the overfitting problem in NER. We add perturbation to variables in the network during the training process, making the variables more diverse, improving the generalization and robustness of the model. Our model is evaluated on three benchmarks, CoNLL-03, OntoNotes 5.0, and WNUT-17, achieving state-of-the-art results. Ablation study and case study also show that our system can converge faster and is less prone to overfitting.
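
Illustrative sketch (not the authors' code): the abstract names two mechanisms, a Gated-CNN that fuses adjacent-word information and a perturbation added during training. The pure-Python example below shows one plausible reading of each. It assumes a GLU-style gate (a linear response multiplied by a sigmoid gate) over a zero-padded window, and an L2-normalized gradient-direction perturbation. The function names, the kernel layout, the scalar output per position, and the epsilon parameter are all hypothetical; the paper's actual formulation may differ.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_conv1d(seq, w_lin, w_gate, b_lin=0.0, b_gate=0.0):
    """GLU-style gated 1D convolution over a sequence of word vectors.

    seq    : list of T vectors, each of dimension d
    w_lin  : flat kernel of length k*d for the linear path (k odd)
    w_gate : flat kernel of length k*d for the gate path
    Returns a list of T scalars (one output channel, for brevity).
    This is an assumed formulation, not the one from the paper.
    """
    d = len(seq[0])
    k = len(w_lin) // d
    pad = k // 2
    zero = [0.0] * d
    # Zero-pad both ends so the output has the same length as the input.
    padded = [zero] * pad + list(seq) + [zero] * pad
    out = []
    for t in range(len(seq)):
        # Flatten the k adjacent word vectors centered on position t.
        window = [x for vec in padded[t:t + k] for x in vec]
        lin = sum(w * x for w, x in zip(w_lin, window)) + b_lin
        gate = sum(w * x for w, x in zip(w_gate, window)) + b_gate
        out.append(lin * sigmoid(gate))
    return out


def adversarial_perturbation(grad, epsilon):
    """Return epsilon * grad / ||grad||_2 (assumed perturbation rule).

    grad is the loss gradient with respect to the perturbed variable;
    a zero gradient yields a zero perturbation.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [epsilon * g / norm for g in grad]
```

In a training loop, the perturbation would be added to the chosen variable (for example, a word embedding) before a second forward pass, so the model is also trained on the perturbed input.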

Related papers

Enhanced Temporal Processing in Spiking Neural Networks for Static Object Detection Using 3D Convolutions [0.0]
Spiking Neural Networks (SNNs) are a class of network models capable of processing temporal information. This paper focuses on enhancing the SNNs' unique ability to process temporal information. To improve the SNN handling of temporal information, this paper proposes replacing traditional 2D convolutions with 3D convolutions.
arXiv Detail & Related papers (2024-12-23T15:32:26Z)
ALADE-SNN: Adaptive Logit Alignment in Dynamically Expandable Spiking Neural Networks for Class Incremental Learning [15.022211557367273]
We develop spiking neural networks (SNNs) with dynamic structures for Class Incremental Learning (CIL). We propose the ALADE-SNN framework, which includes adaptive logit alignment for balanced feature representation and OtoN suppression to manage weights mapping frozen old features to new classes during training. Experimental results show that ALADE-SNN achieves an average incremental accuracy of 75.42 on the CIFAR100-B0 benchmark over 10 incremental steps.
arXiv Detail & Related papers (2024-12-17T09:13:22Z)
Supervised Gradual Machine Learning for Aspect Category Detection [0.9857683394266679]
Aspect Category Detection (ACD) aims to identify implicit and explicit aspects in a given review sentence. We propose a novel approach to tackle the ACD task by combining Deep Neural Networks (DNNs) with Gradual Machine Learning (GML) in a supervised setting.
arXiv Detail & Related papers (2024-04-08T07:21:46Z)
In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER. We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever. In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
Convolutional Dictionary Learning by End-To-End Training of Iterative Neural Networks [3.6280929178575994]
In this work, we construct an INN which can be used as a supervised and physics-informed online convolutional dictionary learning algorithm. We show that the proposed INN improves over two conventional model-agnostic training methods and yields competitive results also compared to a deep INN.
arXiv Detail & Related papers (2022-06-09T12:15:38Z)
Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models the full nested NEs in a sentence as a holistic structure, then we propose a holistic structure parsing algorithm to disclose the entire NEs once for all. Experiments show that our model yields promising results on widely-used benchmarks which approach or even achieve state-of-the-art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z)
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data. We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step. Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
Empirical Study of Named Entity Recognition Performance Using Distribution-aware Word Embedding [15.955385058787348]
We develop a distribution-aware word embedding and implement three different methods to make use of the distribution information in a NER framework. The performance of NER will be improved if the word specificity is incorporated into existing NER methods.
arXiv Detail & Related papers (2021-09-03T17:28:04Z)
Self-Supervised Learning of Event-Based Optical Flow with Spiking Neural Networks [3.7384509727711923]
A major challenge for neuromorphic computing is that learning algorithms for traditional artificial neural networks (ANNs) do not transfer directly to spiking neural networks (SNNs). In this article, we focus on the self-supervised learning problem of optical flow estimation from event-based camera inputs. We show that the performance of the proposed ANNs and SNNs is on par with that of the current state-of-the-art ANNs trained in a self-supervised manner.
arXiv Detail & Related papers (2021-06-03T14:03:41Z)
A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task. The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning. To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency. We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition [39.497407288772386]
The recurrent neural network transducer (RNN-T) architecture has become an emerging trend in end-to-end automatic speech recognition research. In this work, we leverage external alignments to seed the RNN-T model. Two different pre-training solutions are explored, referred to as encoder pre-training and whole-network pre-training, respectively.
arXiv Detail & Related papers (2020-05-01T19:00:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.