Supplementary Features of BiLSTM for Enhanced Sequence Labeling
- URL: http://arxiv.org/abs/2305.19928v4
- Date: Fri, 23 Jun 2023 13:47:17 GMT
- Title: Supplementary Features of BiLSTM for Enhanced Sequence Labeling
- Authors: Conglei Xu, Kun Shen, Hongguang Sun
- Abstract summary: The capacity of BiLSTM to produce sentence representations for sequence labeling tasks is inherently limited.
We devised a global context mechanism to integrate entire future and past sentence representations into each cell's sentence representation.
We noted significant improvements in F1 scores and accuracy across all examined datasets.
- Score: 1.6255202259274413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequence labeling tasks require the computation of sentence representations
for each word within a given sentence. A prevalent method incorporates a
Bi-directional Long Short-Term Memory (BiLSTM) layer to enhance the sequence
structure information. However, empirical evidence (Li, 2020) suggests that the
capacity of BiLSTM to produce sentence representations for sequence labeling
tasks is inherently limited. This limitation primarily results from the
integration of fragments from past and future sentence representations to
formulate a complete sentence representation. In this study, we observed that
the entire sentence representation, found in both the first and last cells of
BiLSTM, can supplement the individual sentence representation of each
cell. Accordingly, we devised a global context mechanism to integrate entire
future and past sentence representations into each cell's sentence
representation within the BiLSTM framework. Incorporating the BERT model within
BiLSTM as a demonstration, we conducted exhaustive experiments on nine datasets
for sequence labeling tasks, including named entity recognition (NER),
part-of-speech (POS) tagging, and end-to-end aspect-based sentiment analysis
(E2E-ABSA), and observed significant improvements in F1 scores and accuracy
across all examined datasets.
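The mechanism described in the abstract concatenates the whole-sentence states held in the first and last BiLSTM cells with each cell's own output. Below is a minimal PyTorch sketch of that idea; the module name, the gated fusion, and all dimensions are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class GlobalContextBiLSTM(nn.Module):
    """BiLSTM whose per-token outputs are supplemented with the whole-sentence
    states from the first and last cells (a sketch, not the paper's exact code)."""

    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.bilstm = nn.LSTM(input_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Assumed gated fusion of the local output and the global context.
        self.gate = nn.Linear(4 * hidden_dim, 2 * hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_dim), e.g. BERT token embeddings
        outputs, _ = self.bilstm(x)               # (batch, seq_len, 2*hidden)
        hidden = outputs.size(-1) // 2
        # The forward state of the last cell and the backward state of the
        # first cell each summarize the entire sentence.
        fwd_last = outputs[:, -1, :hidden]        # (batch, hidden)
        bwd_first = outputs[:, 0, hidden:]        # (batch, hidden)
        global_ctx = torch.cat([fwd_last, bwd_first], dim=-1)    # (batch, 2*hidden)
        global_ctx = global_ctx.unsqueeze(1).expand_as(outputs)  # broadcast over time
        # Gate how much global context is mixed into each cell's representation.
        g = torch.sigmoid(self.gate(torch.cat([outputs, global_ctx], dim=-1)))
        return g * outputs + (1.0 - g) * global_ctx               # (batch, seq_len, 2*hidden)
```

In the paper's setting the BiLSTM inputs would come from BERT, with a standard token-classification head mapping each fused output to a tag; that head is omitted from this sketch.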
Related papers
- Hyperbolic sentence representations for solving Textual Entailment [0.0]
We use the Poincaré ball to embed sentences, with the goal of showing how hyperbolic spaces can be used to solve Textual Entailment.
We evaluate against baselines of various backgrounds, including LSTMs, Order Embeddings and Euclidean Averaging.
We consistently outperform the baselines on the SICK dataset and are second only to Order Embeddings on the SNLI dataset.
arXiv Detail & Related papers (2024-06-15T15:39:43Z) - Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
arXiv Detail & Related papers (2022-12-06T17:10:17Z) - Text Summarization with Oracle Expectation [88.39032981994535]
Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document.
Most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy.
We propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels.
arXiv Detail & Related papers (2022-09-26T14:10:08Z) - Reformulating Sentence Ordering as Conditional Text Generation [17.91448517871621]
We present Reorder-BART (RE-BART), a sentence ordering framework.
We reformulate the task as a conditional text-to-marker generation setup.
Our framework achieves state-of-the-art performance across six datasets on the Perfect Match Ratio (PMR) and Kendall's tau ($\tau$) metrics.
arXiv Detail & Related papers (2021-04-14T18:16:47Z) - Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse tree to automatically utilize the syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived from the synthesized speeches.
arXiv Detail & Related papers (2020-12-13T05:52:07Z) - BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling [6.196023076311228]
We propose a novel hierarchical visual storytelling framework which separately models sentence-level and word-level semantics.
We then employ a hierarchical LSTM network: the bottom LSTM takes the sentence vector representations from BERT as input and learns the dependencies between the sentences corresponding to the images, while the top LSTM generates the corresponding word vector representations.
Experimental results demonstrate that our model outperforms most closely related baselines under automatic evaluation metrics BLEU and CIDEr.
arXiv Detail & Related papers (2020-12-03T18:07:28Z) - GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z) - Improving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector [0.0]
Bidirectional Long Short-Term Memory Network (Bi-LSTM) has shown promising performance in sentiment classification task.
We propose using an existing document representation method, paragraph vector, as an additional input feature for Bi-LSTM (a minimal sketch of this input scheme appears after this list).
arXiv Detail & Related papers (2020-09-12T03:43:30Z) - BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts the Siamese network, learning sentence-level representations from a natural language inference dataset and word/phrase-level representations from a paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z) - Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
arXiv Detail & Related papers (2020-02-29T03:09:55Z)
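As referenced in the Bi-LSTM paragraph-vector entry above, the idea is to append a fixed document-level vector to every word embedding before the recurrent layer. The sketch below illustrates that input scheme; the class name, dimensions, and the mean-pooled classifier head are assumptions for illustration, not the original paper's exact wiring.

```python
import torch
import torch.nn as nn

class ParagraphVectorBiLSTM(nn.Module):
    """Bi-LSTM sentiment classifier that appends a fixed document-level
    paragraph vector to every word embedding (illustrative sketch)."""

    def __init__(self, word_dim: int, doc_dim: int, hidden_dim: int, num_classes: int = 2):
        super().__init__()
        self.bilstm = nn.LSTM(word_dim + doc_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, word_embs: torch.Tensor, doc_vec: torch.Tensor) -> torch.Tensor:
        # word_embs: (batch, seq_len, word_dim); doc_vec: (batch, doc_dim)
        doc_rep = doc_vec.unsqueeze(1).expand(-1, word_embs.size(1), -1)
        x = torch.cat([word_embs, doc_rep], dim=-1)    # paragraph vector as an extra feature
        outputs, _ = self.bilstm(x)                    # (batch, seq_len, 2*hidden)
        pooled = outputs.mean(dim=1)                   # mean-pool over tokens
        return self.classifier(pooled)                 # sentiment logits
```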