Better Feature Integration for Named Entity Recognition
- URL: http://arxiv.org/abs/2104.05316v1
- Date: Mon, 12 Apr 2021 09:55:06 GMT
- Title: Better Feature Integration for Named Entity Recognition
- Authors: Lu Xu, Zhanming Jie, Wei Lu and Lidong Bing
- Abstract summary: We propose a simple and robust solution to incorporate both types of features with our Synergized-LSTM (Syn-LSTM).
The results demonstrate that the proposed model achieves better performance than previous approaches while requiring fewer parameters.
- Score: 30.676768644145
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: It has been shown that named entity recognition (NER) could benefit from
incorporating the long-distance structured information captured by dependency
trees. We believe this is because both types of features, the contextual
information captured by the linear sequences and the structured information
captured by the dependency trees, may complement each other. However, existing
approaches largely focused on stacking the LSTM and graph neural networks such
as graph convolutional networks (GCNs) for building improved NER models, where
the exact interaction mechanism between the two types of features is not very
clear, and the performance gain does not appear to be significant. In this
work, we propose a simple and robust solution to incorporate both types of
features with our Synergized-LSTM (Syn-LSTM), which clearly captures how the
two types of features interact. We conduct extensive experiments on several
standard datasets across four languages. The results demonstrate that the
proposed model achieves better performance than previous approaches while
requiring fewer parameters. Our further analysis demonstrates that our model
can capture longer dependencies compared with strong baselines.
Related papers
- A Functional Extension of Semi-Structured Networks [2.482050942288848]
Semi-structured networks (SSNs) merge structures familiar from additive models with deep neural networks.
Inspired by large-scale datasets, this paper explores extending SSNs to functional data.
We propose a functional SSN method that retains the advantageous properties of classical functional regression approaches while also improving scalability.
arXiv Detail & Related papers (2024-10-07T18:50:18Z)
- Latent Semantic Consensus For Deterministic Geometric Model Fitting [109.44565542031384]
We propose an effective method called Latent Semantic Consensus (LSC).
LSC formulates the model fitting problem into two latent semantic spaces based on data points and model hypotheses.
LSC is able to provide consistent and reliable solutions within only a few milliseconds for general multi-structural model fitting.
arXiv Detail & Related papers (2024-03-11T05:35:38Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time, with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Exploring Multimodal Sentiment Analysis via CBAM Attention and Double-layer BiLSTM Architecture [3.9850392954445875]
In our model, we use BERT + BiLSTM as a new feature extractor to capture the long-distance dependencies in sentences.
To remove redundant information, CNN and CBAM attention are added after splicing text features and picture features.
The experimental results show that our model performs well, achieving results similar to those of advanced models.
arXiv Detail & Related papers (2023-03-26T12:34:01Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval [152.3504607706575]
This research aims to conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories.
We first contribute the Product1M datasets, and define two real practical instance-level retrieval tasks.
We train a more effective cross-modal model that can adaptively incorporate key concept information from the multi-modal data.
arXiv Detail & Related papers (2022-06-17T15:40:45Z)
- Simplifying approach to Node Classification in Graph Neural Networks [7.057970273958933]
We decouple the node feature aggregation step and depth of graph neural network, and empirically analyze how different aggregated features play a role in prediction performance.
We show that not all features generated via aggregation steps are useful, and often using these less informative features can be detrimental to the performance of the GNN model.
We present a simple and shallow model, Feature Selection Graph Neural Network (FSGNN), and show empirically that the proposed model achieves comparable or even higher accuracy than state-of-the-art GNN models.
arXiv Detail & Related papers (2021-11-12T14:53:22Z)
- ARM-Net: Adaptive Relation Modeling Network for Structured Data [29.94433633729326]
ARM-Net is an adaptive relation modeling network tailored for structured data, and ARMOR is a lightweight framework based on ARM-Net for relational data.
We show that ARM-Net consistently outperforms existing models and provides more interpretable predictions for datasets.
arXiv Detail & Related papers (2021-07-05T07:37:24Z)
- Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs [26.983602540576275]
Sequence-based neural networks show significant sensitivity to syntactic structure, but they still perform less well on syntactic tasks than tree-based networks.
We evaluate which of these two representational schemes more effectively introduces biases for syntactic structure.
We show that a constituency-based network generalizes more robustly than a dependency-based one, and that combining the two types of structure does not yield further improvement.
arXiv Detail & Related papers (2020-04-30T18:00:06Z)
- A Dependency Syntactic Knowledge Augmented Interactive Architecture for End-to-End Aspect-based Sentiment Analysis [73.74885246830611]
We propose a novel dependency syntactic knowledge augmented interactive architecture with multi-task learning for end-to-end ABSA.
This model is capable of fully exploiting the syntactic knowledge (dependency relations and types) by leveraging a well-designed Dependency Relation Embedded Graph Convolutional Network (DreGcn).
Extensive experimental results on three benchmark datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-04T14:59:32Z)