Bidirectional LSTM-CRF Attention-based Model for Chinese Word
Segmentation
- URL: http://arxiv.org/abs/2105.09681v1
- Date: Thu, 20 May 2021 11:46:53 GMT
- Title: Bidirectional LSTM-CRF Attention-based Model for Chinese Word
Segmentation
- Authors: Chen Jin, Zhuangwei Shi, Weihua Li, Yanbu Guo
- Abstract summary: We propose a Bidirectional LSTM-CRF Attention-based Model for Chinese word segmentation.
Our model outperforms baseline methods built on other neural networks.
- Score: 2.3991565023534087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chinese word segmentation (CWS) is a fundamental step in Chinese
natural language processing (NLP), and the quality of segmentation directly
affects downstream NLP tasks. With the recent resurgence of artificial
intelligence, the Long Short-Term Memory (LSTM) neural network, which lends
itself naturally to sequence modeling, has been widely applied to many kinds
of NLP tasks and performs well. The attention mechanism is an elegant way to
alleviate the memory-compression problem of LSTMs. Motivated by the strength
of bidirectional LSTM models for sequence modeling and of CRF models for
decoding, we propose a Bidirectional LSTM-CRF Attention-based Model in this
paper. Experiments on the PKU and MSRA benchmark datasets show that our model
outperforms baseline methods built on other neural networks.
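The sketch below shows one way such an architecture can be wired up, assuming
the common B/M/E/S character-tagging formulation of CWS and a PyTorch
implementation. The tag set, hyperparameters, and the particular self-attention
layer are illustrative assumptions rather than details from the paper, and the
CRF layer is a hand-rolled linear-chain version, not the authors' code.

```python
# Minimal sketch (not the authors' released code): BiLSTM + self-attention +
# linear-chain CRF for Chinese word segmentation as B/M/E/S character tagging.
import torch
import torch.nn as nn

TAGS = ["B", "M", "E", "S"]          # begin / middle / end / single-char word
NUM_TAGS = len(TAGS)


class BiLSTMAttnCRF(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim // 2, bidirectional=True,
                            batch_first=True)
        # Scaled dot-product self-attention over the BiLSTM outputs, one
        # illustrative way to relieve the memory-compression problem.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads=4,
                                          batch_first=True)
        self.emissions = nn.Linear(hidden_dim, NUM_TAGS)
        # CRF parameters: transition scores between consecutive tags.
        self.trans = nn.Parameter(torch.randn(NUM_TAGS, NUM_TAGS) * 0.01)

    def _features(self, chars):                      # chars: (B, T) int64 ids
        h, _ = self.lstm(self.emb(chars))            # (B, T, hidden_dim)
        a, _ = self.attn(h, h, h)                    # self-attention, residual
        return self.emissions(h + a)                 # (B, T, NUM_TAGS)

    def _forward_alg(self, feats):
        # Log-partition function of the linear-chain CRF, one value per sequence.
        B, T, K = feats.shape
        alpha = feats[:, 0]                          # (B, K)
        for t in range(1, T):
            alpha = torch.logsumexp(alpha.unsqueeze(2) + self.trans, dim=1) \
                    + feats[:, t]
        return torch.logsumexp(alpha, dim=1)         # (B,)

    def _score(self, feats, tags):
        # Score of the gold tag sequence under emissions + transitions.
        B, T, _ = feats.shape
        score = feats[torch.arange(B), 0, tags[:, 0]]
        for t in range(1, T):
            score = score + self.trans[tags[:, t - 1], tags[:, t]] \
                          + feats[torch.arange(B), t, tags[:, t]]
        return score

    def loss(self, chars, tags):
        # Negative log-likelihood of the gold tag sequences.
        feats = self._features(chars)
        return (self._forward_alg(feats) - self._score(feats, tags)).mean()

    @torch.no_grad()
    def decode(self, chars):
        # Viterbi decoding for a single sequence (batch size 1 for brevity).
        feats = self._features(chars)[0]             # (T, NUM_TAGS)
        T, K = feats.shape
        score, back = feats[0], []
        for t in range(1, T):
            total = score.unsqueeze(1) + self.trans  # (K, K): prev -> current
            best, idx = total.max(dim=0)
            back.append(idx)
            score = best + feats[t]
        path = [int(score.argmax())]
        for idx in reversed(back):
            path.append(int(idx[path[-1]]))
        return [TAGS[i] for i in reversed(path)]


if __name__ == "__main__":
    model = BiLSTMAttnCRF(vocab_size=5000)
    chars = torch.randint(0, 5000, (2, 7))           # toy batch of char ids
    tags = torch.randint(0, NUM_TAGS, (2, 7))
    print("NLL:", model.loss(chars, tags).item())
    print("Tags:", model.decode(chars[:1]))
```

Training would minimize the CRF negative log-likelihood over a segmented
corpus such as PKU or MSRA and then recover word boundaries from the decoded
B/M/E/S tags.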
Related papers
- BiLSTM and Attention-Based Modulation Classification of Realistic Wireless Signals [2.0650230600617534]
The proposed model exploits multiple representations of the wireless signal as inputs to the network.
An attention layer is used after the BiLSTM layer to emphasize the important temporal features.
The experimental results on the recent and realistic RML22 dataset demonstrate the superior performance of the proposed model, with accuracy of up to around 99%.
arXiv Detail & Related papers (2024-08-14T01:17:19Z) - Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z) - In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study ICL through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
arXiv Detail & Related papers (2024-01-23T18:59:21Z) - Bayesian Neural Network Language Modeling for Speech Recognition [59.681758762712754]
State-of-the-art neural network language models (NNLMs) represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers are becoming highly complex.
In this paper, an overarching full Bayesian learning framework is proposed to account for the underlying uncertainty in LSTM-RNN and Transformer LMs.
arXiv Detail & Related papers (2022-08-28T17:50:19Z) - Research on Dual Channel News Headline Classification Based on ERNIE
Pre-training Model [13.222137788045416]
The proposed model improves the accuracy, precision and F1-score of news headline classification compared with the traditional neural network model.
It can perform well in the multi-classification application of news headline text under large data volume.
arXiv Detail & Related papers (2022-02-14T10:44:12Z) - Multi-Scale Semantics-Guided Neural Networks for Efficient
Skeleton-Based Human Action Recognition [140.18376685167857]
A simple yet effective multi-scale semantics-guided neural network is proposed for skeleton-based action recognition.
MS-SGN achieves the state-of-the-art performance on the NTU60, NTU120, and SYSU datasets.
arXiv Detail & Related papers (2021-11-07T03:50:50Z) - Compressing LSTM Networks by Matrix Product Operators [7.395226141345625]
Long Short-Term Memory (LSTM) models are the building blocks of many state-of-the-art natural language processing (NLP) and speech enhancement (SE) algorithms.
Here we introduce the MPO decomposition, which describes the local correlation of quantum states in quantum many-body physics.
We propose a matrix product operator (MPO) based neural network architecture to replace the LSTM model (see the sketch after this list).
arXiv Detail & Related papers (2020-12-22T11:50:06Z) - A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task.
The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z) - Sentiment Analysis Using Simplified Long Short-term Memory Recurrent
Neural Networks [1.5146765382501612]
We perform sentiment analysis on a GOP Debate Twitter dataset.
To speed up training and reduce the computational cost and time, six different parameter-reduced slim versions of the LSTM model are proposed.
arXiv Detail & Related papers (2020-05-08T12:50:10Z) - Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
arXiv Detail & Related papers (2020-02-29T03:09:55Z)
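The matrix product operator (MPO) compression referenced in the list above can
be illustrated with a short, self-contained sketch. This is an assumption of
how the decomposition might look, not the paper's implementation: one dense
weight matrix is factored into a chain of small core tensors by sequential
truncated SVDs, with the bond dimension controlling the compression/accuracy
trade-off. The function name, dimension split, and bond dimension are all
illustrative.

```python
# Rough illustration of an MPO-style factorization of a dense weight matrix
# (assumed form, not the paper's code): sequential truncated SVDs produce a
# chain of small core tensors whose total size is far below the original.
import numpy as np

def mpo_decompose(W, in_dims, out_dims, bond_dim):
    """Factor W (prod(in_dims) x prod(out_dims)) into cores of shape
    (left_bond, in_k, out_k, right_bond), truncating each SVD to bond_dim."""
    n = len(in_dims)
    # Reshape to (i1..in, j1..jn), then interleave the input/output legs.
    T = W.reshape(*in_dims, *out_dims)
    order = [k for pair in zip(range(n), range(n, 2 * n)) for k in pair]
    T = T.transpose(order)
    cores, left = [], 1
    for k in range(n - 1):
        T = T.reshape(left * in_dims[k] * out_dims[k], -1)
        U, S, Vt = np.linalg.svd(T, full_matrices=False)
        r = min(bond_dim, len(S))                    # truncate the bond
        cores.append(U[:, :r].reshape(left, in_dims[k], out_dims[k], r))
        T = np.diag(S[:r]) @ Vt[:r]
        left = r
    cores.append(T.reshape(left, in_dims[-1], out_dims[-1], 1))
    return cores

# Toy check: a 64x64 LSTM-style weight matrix split as (4,4,4) -> (4,4,4).
W = np.random.randn(64, 64)
cores = mpo_decompose(W, (4, 4, 4), (4, 4, 4), bond_dim=8)
print([c.shape for c in cores], "parameters:", sum(c.size for c in cores))
```

Truncating the bond dimension makes the factorization approximate; in
compression work the cores are typically used as trainable layers, so the
network recovers accuracy during fine-tuning.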
This list is automatically generated from the titles and abstracts of the papers on this site.