Supplementary Features of BiLSTM for Enhanced Sequence Labeling
- URL: http://arxiv.org/abs/2305.19928v4
- Date: Fri, 23 Jun 2023 13:47:17 GMT
- Title: Supplementary Features of BiLSTM for Enhanced Sequence Labeling
- Authors: Conglei Xu, Kun Shen, Hongguang Sun
- Abstract summary: The capacity of BiLSTM to produce sentence representations for sequence labeling tasks is inherently limited.
We devised a global context mechanism to integrate entire future and past sentence representations into each cell's sentence representation.
We noted significant improvements in F1 scores and accuracy across all examined datasets.
- Score: 1.6255202259274413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sequence labeling tasks require the computation of sentence representations
for each word within a given sentence. A prevalent method incorporates a
Bi-directional Long Short-Term Memory (BiLSTM) layer to enhance the sequence
structure information. However, empirical evidence (Li, 2020) suggests that the
capacity of BiLSTM to produce sentence representations for sequence labeling
tasks is inherently limited. This limitation primarily results from the
integration of fragments from past and future sentence representations to
formulate a complete sentence representation. In this study, we observed that
the entire sentence representation, found in both the first and last cells of
BiLSTM, can supplement the individual sentence representation of each
cell. Accordingly, we devised a global context mechanism to integrate entire
future and past sentence representations into each cell's sentence
representation within the BiLSTM framework. Incorporating the BERT model
within BiLSTM as a demonstration, we conducted exhaustive experiments on nine
datasets for sequence labeling tasks, including named entity recognition (NER),
part-of-speech (POS) tagging, and end-to-end aspect-based sentiment analysis
(E2E-ABSA). We observed significant improvements in F1 scores and accuracy
across all examined datasets.
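To make the mechanism concrete, the following is a minimal PyTorch sketch of the idea described in the abstract: the forward direction's final hidden state summarizes the entire past, the backward direction's first hidden state summarizes the entire future, and both are fused into every cell's representation. The linear-plus-tanh fusion layer and all dimensions are illustrative assumptions; the paper's exact gating mechanism may differ.

```python
import torch
import torch.nn as nn

class GlobalContextBiLSTM(nn.Module):
    """Minimal sketch: concatenate the whole-sentence representations found
    in the first (backward) and last (forward) BiLSTM cells onto every
    timestep, then fuse. The linear+tanh fusion is an assumption; the
    paper's exact gating mechanism may differ."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.bilstm = nn.LSTM(input_size, hidden_size,
                              batch_first=True, bidirectional=True)
        # project [local; global] features back to the BiLSTM output width
        self.fuse = nn.Linear(4 * hidden_size, 2 * hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_size), e.g. BERT token embeddings
        out, _ = self.bilstm(x)            # (batch, seq_len, 2 * hidden)
        h = out.size(-1) // 2
        fwd_last = out[:, -1, :h]          # forward state at last token: entire past
        bwd_first = out[:, 0, h:]          # backward state at first token: entire future
        glob = torch.cat([fwd_last, bwd_first], dim=-1)       # (batch, 2 * hidden)
        glob = glob.unsqueeze(1).expand(-1, out.size(1), -1)  # tile over timesteps
        return torch.tanh(self.fuse(torch.cat([out, glob], dim=-1)))
```

A token-classification head (for example, a linear layer over the fused states) would then produce the per-word labels for NER, POS tagging, or E2E-ABSA.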
Related papers
- Semantic-Aligned Learning with Collaborative Refinement for Unsupervised VI-ReID [82.12123628480371]
Unsupervised visible-infrared person re-identification (USL-VI-ReID) seeks to match pedestrian images of the same individual across different modalities without human annotations for model learning. Previous methods unify pseudo-labels of cross-modality images through label association algorithms and then design contrastive learning frameworks for global feature learning. We propose a Semantic-Aligned Learning with Collaborative Refinement (SALCR) framework, which builds up objectives for the specific fine-grained patterns emphasized by each modality.
arXiv Detail & Related papers (2025-04-27T13:58:12Z) - FewTopNER: Integrating Few-Shot Learning with Topic Modeling and Named Entity Recognition in a Multilingual Framework [0.0]
FewTopNER is a framework that integrates few-shot named entity recognition with topic-aware contextual modeling. Empirical evaluations on multilingual benchmarks demonstrate that FewTopNER significantly outperforms state-of-the-art few-shot NER models.
arXiv Detail & Related papers (2025-02-04T15:13:40Z) - Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions. We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
arXiv Detail & Related papers (2025-01-08T20:11:09Z) - ORIGAMI: A generative transformer architecture for predictions from semi-structured data [3.5639148953570836]
ORIGAMI is a transformer-based architecture that processes nested key/value pairs. By reformulating classification as next-token prediction, ORIGAMI naturally handles both single-label and multi-label tasks.
arXiv Detail & Related papers (2024-12-23T07:21:17Z) - Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition [57.97930719585095]
We introduce Part-aware Unified Representation between Language and Skeleton (PURLS) to explore visual-semantic alignment at both local and global scales.
Our approach is evaluated on various skeleton/language backbones and three large-scale datasets.
The results showcase the universality and superior performance of PURLS, surpassing prior skeleton-based solutions and standard baselines from other domains.
arXiv Detail & Related papers (2024-06-19T08:22:32Z) - Hyperbolic sentence representations for solving Textual Entailment [0.0]
We use the Poincaré ball to embed sentences, with the goal of showing how hyperbolic spaces can be used for solving Textual Entailment.
We evaluate against baselines of various backgrounds, including LSTMs, Order Embeddings and Euclidean Averaging.
We consistently outperform the baselines on the SICK dataset and are second only to Order Embeddings on the SNLI dataset (see the distance sketch below).
arXiv Detail & Related papers (2024-06-15T15:39:43Z)
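As a point of reference for the hyperbolic approach above, here is a minimal sketch of the Poincaré-ball distance such embeddings rely on; the function name and example points are illustrative, and the sentence-embedding procedure itself is not shown.

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance between two points inside the unit Poincare ball:
    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))).
    Both inputs must have Euclidean norm strictly less than 1."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return float(np.arccosh(1.0 + 2.0 * sq_dist / denom))

# Points near the boundary are far apart even when Euclidean-close,
# which is what makes the ball attractive for hierarchical relations.
u = np.array([0.0, 0.5])
v = np.array([0.0, 0.9])
print(poincare_distance(u, v))
```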
- CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner [41.001366870464636]
We propose to leverage text description generated from large language models to guide feature learning.
We first utilize the global text description to guide the skeleton encoder to focus on informative joints.
We build non-local interaction between local text and joint features, to form the final global representation.
arXiv Detail & Related papers (2024-03-15T07:51:35Z) - USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval [115.28586222748478]
Image-Text Retrieval (ITR) aims at searching for the target instances that are semantically relevant to the given query from the other modality.
Existing approaches typically suffer from two major limitations.
arXiv Detail & Related papers (2023-01-17T12:42:58Z) - Neural Machine Translation with Contrastive Translation Memories [71.86990102704311]
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios.
We propose a new retrieval-augmented NMT to model contrastively retrieved translation memories that are holistically similar to the source sentence.
In the training phase, a Multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence.
arXiv Detail & Related papers (2022-12-06T17:10:17Z) - Text Summarization with Oracle Expectation [88.39032981994535]
Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document.
Most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy.
We propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels.
arXiv Detail & Related papers (2022-09-26T14:10:08Z) - Hierarchical Local-Global Transformer for Temporal Sentence Grounding [58.247592985849124]
This paper studies the multimedia problem of temporal sentence grounding.
It aims to accurately determine the specific video segment in an untrimmed video according to a given sentence query.
arXiv Detail & Related papers (2022-08-31T14:16:56Z) - Exploiting Global Contextual Information for Document-level Named Entity Recognition [46.99922251839363]
We propose a model called Global Context enhanced Document-level NER (GCDoc).
At the word level, a document graph is constructed to model a wider range of dependencies between words.
At the sentence level, to appropriately model wider context beyond a single sentence, we employ a cross-sentence module.
Our model reaches F1 score of 92.22 (93.40 with BERT) on CoNLL 2003 dataset and 88.32 (90.49 with BERT) on Ontonotes 5.0 dataset.
arXiv Detail & Related papers (2021-06-02T01:52:07Z) - Reformulating Sentence Ordering as Conditional Text Generation [17.91448517871621]
We present Reorder-BART (RE-BART), a sentence ordering framework.
We reformulate the task as a conditional text-to-marker generation setup.
Our framework achieves state-of-the-art performance across six datasets in Perfect Match Ratio (PMR) and Kendall's tau ($\tau$) metrics (the text-to-marker format is sketched below).
arXiv Detail & Related papers (2021-04-14T18:16:47Z)
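A hedged illustration of what a text-to-marker reformulation can look like: each shuffled sentence is tagged with a marker token, and the target is the marker sequence in gold order. The exact marker scheme used by RE-BART may differ; the helper below is hypothetical.

```python
def to_marker_example(shuffled, gold_order):
    # Source: shuffled sentences, each prefixed with its marker token.
    # Target: the markers alone, emitted in the correct reading order,
    # so ordering becomes conditional next-token generation.
    source = " ".join(f"<S{i}> {s}" for i, s in enumerate(shuffled))
    target = " ".join(f"<S{i}>" for i in gold_order)
    return source, target

src, tgt = to_marker_example(
    ["He sat down.", "Tom entered the room."], gold_order=[1, 0])
# src: "<S0> He sat down. <S1> Tom entered the room."
# tgt: "<S1> <S0>"
```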
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Syntactic representation learning for neural network based TTS with syntactic parse tree traversal [49.05471750563229]
We propose a syntactic representation learning method based on syntactic parse trees to automatically utilize syntactic structure information.
Experimental results demonstrate the effectiveness of our proposed approach.
For sentences with multiple syntactic parse trees, prosodic differences can be clearly perceived in the synthesized speech.
arXiv Detail & Related papers (2020-12-13T05:52:07Z) - BERT-hLSTMs: BERT and Hierarchical LSTMs for Visual Storytelling [6.196023076311228]
We propose a novel hierarchical visual storytelling framework which separately models sentence-level and word-level semantics.
We then employ a hierarchical LSTM network: the bottom LSTM receives as input the sentence vector representation from BERT, to learn the dependencies between the sentences corresponding to images, and the top LSTM is responsible for generating the corresponding word vector representations.
Experimental results demonstrate that our model outperforms the most closely related baselines under the automatic evaluation metrics BLEU and CIDEr (see the sketch below).
arXiv Detail & Related papers (2020-12-03T18:07:28Z)
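A sketch of the hierarchical arrangement described above, assuming precomputed BERT sentence vectors, one per image; all dimensions, names, and the repeated-input decoding scheme are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class HierarchicalStoryDecoder(nn.Module):
    # Bottom LSTM: models dependencies across per-image sentence vectors.
    # Top LSTM: emits word logits for each sentence position.
    def __init__(self, bert_dim=768, hidden=512, vocab=30522):
        super().__init__()
        self.sent_lstm = nn.LSTM(bert_dim, hidden, batch_first=True)
        self.word_lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, sent_vecs, max_words=20):
        # sent_vecs: (batch, n_images, bert_dim) BERT sentence embeddings
        sent_states, _ = self.sent_lstm(sent_vecs)   # (batch, n_images, hidden)
        b, n, h = sent_states.shape
        # feed each sentence state as a repeated input to the word decoder
        steps = sent_states.unsqueeze(2).expand(b, n, max_words, h)
        words, _ = self.word_lstm(steps.reshape(b * n, max_words, h))
        return self.out(words).view(b, n, max_words, -1)  # word logits
```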
- GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [117.98107557103877]
We present GraPPa, an effective pre-training approach for table semantic parsing.
We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar.
To maintain the model's ability to represent real-world data, we also include masked language modeling.
arXiv Detail & Related papers (2020-09-29T08:17:58Z) - Improving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector [0.0]
Bidirectional Long Short-Term Memory Network (Bi-LSTM) has shown promising performance in sentiment classification task.
We propose using an existing document representation method called paragraph vector as an additional input feature for the Bi-LSTM (a minimal sketch follows below).
arXiv Detail & Related papers (2020-09-12T03:43:30Z)
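One plausible reading of "paragraph vector as an additional input feature" is to tile the document vector across timesteps and concatenate it with each word embedding before the Bi-LSTM. The sketch below assumes exactly that; all dimensions and names are illustrative.

```python
import torch
import torch.nn as nn

class ParagraphVectorBiLSTM(nn.Module):
    def __init__(self, word_dim=300, doc_dim=100, hidden=128, n_classes=2):
        super().__init__()
        self.bilstm = nn.LSTM(word_dim + doc_dim, hidden,
                              batch_first=True, bidirectional=True)
        self.cls = nn.Linear(2 * hidden, n_classes)

    def forward(self, word_embs, doc_vec):
        # word_embs: (batch, seq_len, word_dim); doc_vec: (batch, doc_dim),
        # e.g. from a pre-trained doc2vec (paragraph vector) model
        tiled = doc_vec.unsqueeze(1).expand(-1, word_embs.size(1), -1)
        out, _ = self.bilstm(torch.cat([word_embs, tiled], dim=-1))
        return self.cls(out[:, -1])  # sentiment logits from the last timestep
```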
- BURT: BERT-inspired Universal Representation from Twin Structure [89.82415322763475]
BURT (BERT inspired Universal Representation from Twin Structure) is capable of generating universal, fixed-size representations for input sequences of any granularity.
Our proposed BURT adopts the Siamese network, learning sentence-level representations from natural language inference dataset and word/phrase-level representations from paraphrasing dataset.
We evaluate BURT across different granularities of text similarity tasks, including STS tasks, SemEval2013 Task 5(a) and some commonly used word similarity tasks.
arXiv Detail & Related papers (2020-04-29T04:01:52Z) - Depth-Adaptive Graph Recurrent Network for Text Classification [71.20237659479703]
Sentence-State LSTM (S-LSTM) is a powerful and highly efficient graph recurrent network.
We propose a depth-adaptive mechanism for the S-LSTM, which allows the model to learn how many computational steps to conduct for different words as required.
arXiv Detail & Related papers (2020-02-29T03:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content above (including all information) and is not responsible for any consequences.