End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF: A Reproducibility Study
- URL: http://arxiv.org/abs/2510.10936v1
- Date: Mon, 13 Oct 2025 02:49:21 GMT
- Title: End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF: A Reproducibility Study
- Authors: Anirudh Ganesh, Jayavardhan Reddy
- Abstract summary: We present a reproducibility study of the state-of-the-art neural architecture for sequence labeling proposed by Ma and Hovy (2016). The original BiLSTM-CNN-CRF model combines character-level representations via Convolutional Neural Networks (CNNs), word-level context modeling through BiLSTMs, and structured prediction using Conditional Random Fields (CRFs). Our implementation successfully reproduces the key results, achieving a 91.18% F1-score on CoNLL-2003 NER and demonstrating the model's effectiveness across sequence labeling tasks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a reproducibility study of the state-of-the-art neural architecture for sequence labeling proposed by Ma and Hovy (2016). The original BiLSTM-CNN-CRF model combines character-level representations via Convolutional Neural Networks (CNNs), word-level context modeling through Bi-directional Long Short-Term Memory networks (BiLSTMs), and structured prediction using Conditional Random Fields (CRFs). This end-to-end approach eliminates the need for hand-crafted features while achieving excellent performance on named entity recognition (NER) and part-of-speech (POS) tagging tasks. Our implementation successfully reproduces the key results, achieving a 91.18% F1-score on CoNLL-2003 NER and demonstrating the model's effectiveness across sequence labeling tasks. We provide a detailed analysis of the architecture components and release an open-source PyTorch implementation to facilitate further research.
Related papers
- Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets)
Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z) - Spintronics for image recognition: performance benchmarking via ultrafast data-driven simulations [4.2412715094420665]
We present a demonstration of image classification using an echo-state network (ESN) relying on a single simulated spintronic nanostructure.
We employ an ultrafast data-driven simulation framework called the data-driven Thiele equation approach to simulate the spin-torque vortex oscillator (STVO) dynamics.
We showcase the versatility of our solution by successfully applying it to solve classification challenges with the MNIST, EMNIST-letters and Fashion MNIST datasets.
arXiv Detail & Related papers (2023-08-10T18:09:44Z) - GNN-SL: Sequence Labeling Based on Nearest Examples via GNN [50.55076156520809]
We introduce Graph Neural Network Sequence Labeling (GNN-SL).
GNN-SL augments the output of a vanilla sequence labeling model with similar tagging examples retrieved from the whole training set.
We conduct a variety of experiments on three typical sequence labeling tasks.
GNN-SL achieves results of 96.9 (+0.2) on PKU, 98.3 (+0.4) on CITYU, 98.5 (+0.2) on MSR, and 96.9 (+0.2) on AS for the CWS task.
arXiv Detail & Related papers (2022-12-05T04:22:00Z) - Neural Structured Prediction for Inductive Node Classification [29.908759584092167]
This paper studies node classification in the inductive setting, aiming to learn a model on labeled training graphs and generalize it to infer node labels on unlabeled test graphs.
We present a new approach called the Structured Proxy Network (SPN), which combines the advantages of both worlds.
arXiv Detail & Related papers (2022-04-15T15:50:27Z) - Sequence Transduction with Graph-based Supervision [96.04967815520193]
We present a new transducer objective function that generalizes the RNN-T loss to accept a graph representation of the labels.
We demonstrate that transducer-based ASR with CTC-like lattice achieves better results compared to standard RNN-T.
arXiv Detail & Related papers (2021-10-25T14:47:15Z) - Logsig-RNN: a novel network for robust and efficient skeleton-based action recognition [3.775860173040509]
We propose a novel module, namely Logsig-RNN, which combines the log-signature layer with recurrent neural networks (RNNs).
In particular, we achieve the state-of-the-art accuracy on Chalearn2013 gesture data by combining simple path transformation layers with the Logsig-RNN.
arXiv Detail & Related papers (2021-05-20T11:46:53Z) - Bidirectional LSTM-CRF Attention-based Model for Chinese Word Segmentation [2.3991565023534087]
We propose a Bidirectional LSTM-CRF Attention-based Model for Chinese word segmentation.
Our model performs better than baseline methods built on other neural networks.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z) - Introducing the Hidden Neural Markov Chain framework [7.85426761612795]
This paper proposes the original Hidden Neural Markov Chain (HNMC) framework, a new family of sequential neural models.
We propose three different models: the classic HNMC, the HNMC2, and the HNMC-CN.
These results show the potential of this new neural sequential framework, which can open the way to new models and might eventually compete with the prevalent BiLSTM and BiGRU.
arXiv Detail & Related papers (2021-02-17T20:13:45Z) - Regularizing Recurrent Neural Networks via Sequence Mixup [7.036759195546171]
We extend a class of celebrated regularization techniques originally proposed for feed-forward neural networks.
Our proposed methods are easy to implement and add little complexity, while leveraging the performance of simple neural architectures.
arXiv Detail & Related papers (2020-11-27T05:43:40Z) - An Investigation of Potential Function Designs for Neural CRF [75.79555356970344]
In this paper, we investigate a series of increasingly expressive potential functions for neural CRF models.
Our experiments show that the decomposed quadrilinear potential function based on the vector representations of two neighboring labels and two neighboring words consistently achieves the best performance.
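A decomposed quadrilinear potential scores each pair of adjacent positions as a sum over ranks of four inner products, one per factor (two neighboring word vectors, two neighboring label embeddings). The sketch below is a hypothetical low-rank factorization in plain Python, with made-up shapes and factor names, intended only to illustrate the decomposition described above:

```python
def quadrilinear_potential(w1, w2, y1, y2, U, V, P, Q):
    """Decomposed quadrilinear potential over two neighboring words
    (w1, w2) and two neighboring labels (y1, y2).

    U, V, P, Q are rank-R lists of projection vectors (hypothetical
    factor names). The full 4-way tensor contraction is replaced by a
    sum over ranks of the product of four inner products.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum(
        dot(U[r], w1) * dot(V[r], w2) * dot(P[r], y1) * dot(Q[r], y2)
        for r in range(len(U))
    )
```

The low-rank form keeps the parameter count linear in the rank R rather than multiplicative in the four input dimensions, which is what makes the quadrilinear design tractable inside a neural CRF.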
arXiv Detail & Related papers (2020-11-11T07:32:18Z) - Self-Challenging Improves Cross-Domain Generalization [81.99554996975372]
Convolutional Neural Networks (CNNs) conduct image classification by activating dominant features that correlate with labels.
We introduce a simple training heuristic, Representation Self-Challenging (RSC), that significantly improves the generalization of CNNs to out-of-domain data.
RSC iteratively challenges the dominant features activated on the training data and forces the network to activate the remaining features that correlate with labels.
arXiv Detail & Related papers (2020-07-05T21:42:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.