Supplementary Features of BiLSTM for Enhanced Sequence Labeling
- URL: http://arxiv.org/abs/2305.19928v4
- Date: Fri, 23 Jun 2023 13:47:17 GMT
- Title: Supplementary Features of BiLSTM for Enhanced Sequence Labeling
- Authors: Conglei Xu, Kun Shen, Hongguang Sun
- Abstract summary: The capacity of BiLSTM to produce sentence representations for sequence labeling tasks is inherently limited.
We devised a global context mechanism to integrate entire future and past sentence representations into each cell's sentence representation.
We noted significant improvements in F1 scores and accuracy across all examined datasets.
- Score: 1.6255202259274413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequence labeling tasks require the computation of sentence representations
for each word within a given sentence. A prevalent method incorporates a
Bi-directional Long Short-Term Memory (BiLSTM) layer to enhance the sequence
structure information. However, empirical evidence (Li, 2020) suggests that the
capacity of BiLSTM to produce sentence representations for sequence labeling
tasks is inherently limited. This limitation primarily results from the
integration of fragments from past and future sentence representations to
formulate a complete sentence representation. In this study, we observed that
the entire sentence representation, found in both the first and last cells of
BiLSTM, can supplement the individual sentence representation of each
cell. Accordingly, we devised a global context mechanism to integrate entire
future and past sentence representations into each cell's sentence
representation within the BiLSTM framework. Incorporating the BERT model
within BiLSTM as a demonstration, we conducted exhaustive experiments on nine
datasets for sequence labeling tasks, including named entity recognition (NER),
part-of-speech (POS) tagging, and end-to-end aspect-based sentiment analysis
(E2E-ABSA). We observed significant improvements in F1 scores and accuracy
across all examined datasets.
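To make the mechanism concrete, the following is a minimal PyTorch sketch of the idea described in the abstract: the forward direction's final hidden state summarizes the entire past, the backward direction's first hidden state summarizes the entire future, and both are fused into every cell's representation. The linear-plus-tanh fusion layer and all dimensions are illustrative assumptions; the paper's exact gating mechanism may differ.

```python
import torch
import torch.nn as nn

class GlobalContextBiLSTM(nn.Module):
    """Minimal sketch: concatenate the whole-sentence representations found
    in the first (backward) and last (forward) BiLSTM cells onto every
    timestep, then fuse. The linear+tanh fusion is an assumption; the
    paper's exact gating mechanism may differ."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.bilstm = nn.LSTM(input_size, hidden_size,
                              batch_first=True, bidirectional=True)
        # project [local; global] features back to the BiLSTM output width
        self.fuse = nn.Linear(4 * hidden_size, 2 * hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_size), e.g. BERT token embeddings
        out, _ = self.bilstm(x)            # (batch, seq_len, 2 * hidden)
        h = out.size(-1) // 2
        fwd_last = out[:, -1, :h]          # forward state at last token: entire past
        bwd_first = out[:, 0, h:]          # backward state at first token: entire future
        glob = torch.cat([fwd_last, bwd_first], dim=-1)       # (batch, 2 * hidden)
        glob = glob.unsqueeze(1).expand(-1, out.size(1), -1)  # tile over timesteps
        return torch.tanh(self.fuse(torch.cat([out, glob], dim=-1)))
```

A token-classification head (for example, a linear layer over the fused states) would then produce the per-word labels for NER, POS tagging, or E2E-ABSA.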
Related papers
- Semantic-Aligned Learning with Collaborative Refinement for Unsupervised VI-ReID [82.12123628480371]
Unsupervised visible-infrared person re-identification (USL-VI-ReID) seeks to match pedestrian images of the same individual across different modalities without human annotations for model learning. Previous methods unify pseudo-labels of cross-modality images through label association algorithms and then design contrastive learning frameworks for global feature learning. We propose a Semantic-Aligned Learning with Collaborative Refinement (SALCR) framework, which builds up objectives for the specific fine-grained patterns emphasized by each modality.
arXiv Detail & Related papers (2025-04-27T13:58:12Z) - FewTopNER: Integrating Few-Shot Learning with Topic Modeling and Named Entity Recognition in a Multilingual Framework [0.0]
FewTopNER is a framework that integrates few-shot named entity recognition with topic-aware contextual modeling. Empirical evaluations on multilingual benchmarks demonstrate that FewTopNER significantly outperforms state-of-the-art few-shot NER models.
arXiv Detail & Related papers (2025-02-04T15:13:40Z) - Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions. We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
arXiv Detail & Related papers (2025-01-08T20:11:09Z) - ORIGAMI: A generative transformer architecture for predictions from semi-structured data [3.5639148953570836]
ORIGAMI is a transformer-based architecture that processes nested key/value pairs. By reformulating classification as next-token prediction, ORIGAMI naturally handles both single-label and multi-label tasks.
arXiv Detail & Related papers (2024-12-23T07:21:17Z) - Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition [57.97930719585095]
We introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales.
Our approach is evaluated on various skeleton/language backbones and three large-scale datasets.
The results showcase the universality and superior performance of PURLS, surpassing prior skeleton-based solutions and standard baselines from other domains.
arXiv Detail & Related papers (2024-06-19T08:22:32Z) - Hyperbolic sentence representations for solving Textual Entailment [0.0]
We use the Poincaré ball to embed sentences, with the goal of showing how hyperbolic spaces can be used for solving Textual Entailment.
We evaluate against baselines of various backgrounds, including LSTMs, Order Embeddings and Euclidean Averaging.
We consistently outperform the baselines on the SICK dataset and are second only to Order Embeddings on the SNLI dataset (see the distance sketch below).
arXiv Detail & Related papers (2024-06-15T15:39:43Z)
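As a point of reference for the hyperbolic approach above, here is a minimal sketch of the Poincaré-ball distance such embeddings rely on; the function name and example points are illustrative, and the sentence-embedding procedure itself is not shown.

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance between two points inside the unit Poincare ball:
    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    Both inputs must have Euclidean norm strictly less than 1."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_dist / denom))

# Points near the boundary are far apart even when Euclidean-close,
# which is what makes the ball attractive for hierarchical relations.
u = np.array([0.0, 0.5])
v = np.array([0.0, 0.9])
print(poincare_distance(u, v))
```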
- CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner [41.001366870464636]
We propose to leverage text description generated from large language models to guide feature learning.
We first utilize the global text description to guide the skeleton encoder to focus on informative joints.
We build non-local interaction between local text and joint features, to form the final global representation.
arXiv Detail & Related papers (2024-03-15T07:51:35Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
arXiv Detail & Related papers (2022-12-06T17:10:17Z) - Text Summarization with Oracle Expectation [88.39032981994535]
Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document.
Most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy.
We propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels.
arXiv Detail & Related papers (2022-09-26T14:10:08Z) - Hierarchical Local-Global Transformer for Temporal Sentence Grounding [58.247592985849124]
This paper studies the multimedia problem of temporal sentence grounding.
It aims to accurately determine the specific video segment in an untrimmed video according to a given sentence query.
arXiv Detail & Related papers (2022-08-31T14:16:56Z) - Exploiting Global Contextual Information for Document-level Named Entity Recognition [46.99922251839363]
We propose a model called Global Context enhanced Document-level NER (GCDoc).
At the word level, a document graph is constructed to model a wider range of dependencies between words.
At the sentence level, to appropriately model wider context beyond a single sentence, we employ a cross-sentence module.
Our model reaches F1 score of 92.22 (93.40 with BERT) on CoNLL 2003 dataset and 88.32 (90.49 with BERT) on Ontonotes 5.0 dataset.
arXiv Detail & Related papers (2021-06-02T01:52:07Z) - Reformulating Sentence Ordering as Conditional Text Generation [17.91448517871621]
We present Reorder-BART (RE-BART), a sentence ordering framework.
We reformulate the task as a conditional text-to-marker generation setup.
Our framework achieves state-of-the-art performance across six datasets in Perfect Match Ratio (PMR) and Kendall's tau ($\tau$) metrics (the text-to-marker format is sketched below).
arXiv Detail & Related papers (2021-04-14T18:16:47Z)
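A hedged illustration of what a text-to-marker reformulation can look like: each shuffled sentence is tagged with a marker token, and the target is the marker sequence in gold order. The exact marker scheme used by RE-BART may differ; the helper below is hypothetical.

```python
def to_marker_example(shuffled, gold_order):
    # Source: shuffled sentences, each prefixed with its marker token.
    # Target: the markers alone, emitted in the correct reading order,
    # so ordering becomes conditional next-token generation.
    source = " ".join(f"<S{i}> {s}" for i, s in enumerate(shuffled))
    target = " ".join(f"<S{i}>" for i in gold_order)
    return source, target

src, tgt = to_marker_example(
    ["He sat down.", "Tom entered the room."], gold_order=[1, 0])
# src: "<S0> He sat down. <S1> Tom entered the room."
# tgt: "<S1> <S0>"
```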
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse trees to automatically utilize syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
arXiv Detail & Related papers (2020-12-13T05:52:07Z) - BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling [6.196023076311228]
We propose a novel hierarchical visual storytelling framework which separately models sentence-level and word-level semantics.
We then employ a hierarchical LSTM network: the bottom LSTM receives as input the sentence vector representation from BERT, to learn the dependencies between the sentences corresponding to images, and the top LSTM is responsible for generating the corresponding word vector representations.
Experimental results demonstrate that our model outperforms the most closely related baselines under the automatic evaluation metrics BLEU and CIDEr (see the sketch below).
arXiv Detail & Related papers (2020-12-03T18:07:28Z)
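A sketch of the hierarchical arrangement described above, assuming precomputed BERT sentence vectors, one per image; all dimensions, names, and the repeated-input decoding scheme are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class HierarchicalStoryDecoder(nn.Module):
    # Bottom LSTM: models dependencies across per-image sentence vectors.
    # Top LSTM: emits word logits for each sentence position.
    def __init__(self, bert_dim=768, hidden=512, vocab=30522):
        super().__init__()
        self.sent_lstm = nn.LSTM(bert_dim, hidden, batch_first=True)
        self.word_lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, sent_vecs, max_words=20):
        # sent_vecs: (batch, n_images, bert_dim) BERT sentence embeddings
        sent_states, _ = self.sent_lstm(sent_vecs)   # (batch, n_images, hidden)
        b, n, h = sent_states.shape
        # feed each sentence state as a repeated input to the word decoder
        steps = sent_states.unsqueeze(2).expand(b, n, max_words, h)
        words, _ = self.word_lstm(steps.reshape(b * n, max_words, h))
        return self.out(words).view(b, n, max_words, -1)  # word logits
```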
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z) - Improving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector [0.0]
Bidirectional Long Short-Term Memory Network (Bi-LSTM) has shown promising performance in sentiment classification task.
We propose using an existing document representation method called paragraph vector as an additional input feature for the Bi-LSTM (a minimal sketch follows below).
arXiv Detail & Related papers (2020-09-12T03:43:30Z)
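One plausible reading of "paragraph vector as an additional input feature" is to tile the document vector across timesteps and concatenate it with each word embedding before the Bi-LSTM. The sketch below assumes exactly that; all dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class ParagraphVectorBiLSTM(nn.Module):
    def __init__(self, word_dim=300, doc_dim=100, hidden=128, n_classes=2):
        super().__init__()
        self.bilstm = nn.LSTM(word_dim + doc_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.cls = nn.Linear(2 * hidden, n_classes)

    def forward(self, word_embs, doc_vec):
        # word_embs: (batch, seq_len, word_dim); doc_vec: (batch, doc_dim),
        # e.g. from a pre-trained doc2vec (paragraph vector) model
        tiled = doc_vec.unsqueeze(1).expand(-1, word_embs.size(1), -1)
        out, _ = self.bilstm(torch.cat([word_embs, tiled], dim=-1))
        return self.cls(out[:, -1])  # sentiment logits from the last timestep
```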
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts the Siamese network, learning sentence-level representations from natural language inference dataset and word/phrase-level representations from paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z) - Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
arXiv Detail & Related papers (2020-02-29T03:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.