Research on multi-dimensional end-to-end phrase recognition algorithm
based on background knowledge
- URL: http://arxiv.org/abs/2007.03860v1
- Date: Wed, 8 Jul 2020 02:30:00 GMT
- Title: Research on multi-dimensional end-to-end phrase recognition algorithm
based on background knowledge
- Authors: Zheng Li, Gang Tu, Guang Liu, Zhi-Qiang Zhan, Yi-Jian Liu
- Abstract summary: In experiments on the CPWD dataset, introducing background knowledge improves the accuracy of the end-to-end method by more than one point.
The corresponding method was applied to the CCL 2018 competition and won first place in the Chinese humor type recognition task.
- Score: 4.020059842004492
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: At present, deep end-to-end methods based on supervised learning are
used in entity recognition and dependency analysis. These methods have two
problems: first, background knowledge cannot be introduced; second, the
multi-granularity and nested features of natural language cannot be recognized.
To solve these problems, annotation rules based on phrase windows are proposed,
and a corresponding multi-dimensional end-to-end phrase recognition algorithm is
designed. The annotation rules divide sentences into seven types of nested
phrases and indicate the dependencies between phrases. The algorithm can
introduce background knowledge, recognize all kinds of nested phrases in
sentences, and recognize the dependencies between phrases. The experimental
results show that the annotation rules are easy to use and unambiguous, and that
the matching algorithm is more consistent with the multi-granularity and
diversity characteristics of syntax than the traditional end-to-end algorithm.
In experiments on the CPWD dataset, introducing background knowledge improves
the accuracy of the end-to-end method by more than one point. The corresponding
method was applied to the CCL 2018 competition and won first place in the
Chinese humor type recognition task.
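The abstract describes annotations made of nested phrases with dependencies between them. A minimal sketch of one way to represent such annotations is given here, assuming token-offset spans. The `Phrase` class, the helper functions, and the labels `S`, `NP`, and `VP` are hypothetical placeholders for illustration; they are not the paper's seven phrase types or its algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Phrase:
    start: int                  # index of the first token (inclusive)
    end: int                    # index one past the last token (exclusive)
    label: str                  # placeholder phrase type, not the paper's label set
    head: Optional[int] = None  # position in the phrase list of the phrase this one depends on


def contains(outer: Phrase, inner: Phrase) -> bool:
    """True if inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def crosses(a: Phrase, b: Phrase) -> bool:
    """True if the spans overlap without one containing the other."""
    overlap = a.start < b.end and b.start < a.end
    return overlap and not contains(a, b) and not contains(b, a)


def is_well_nested(phrases: List[Phrase]) -> bool:
    """Nested annotations allow containment and disjointness, but no crossing spans."""
    return not any(
        crosses(a, b) for i, a in enumerate(phrases) for b in phrases[i + 1:]
    )


def nesting_depth(phrases: List[Phrase], i: int) -> int:
    """Number of other phrases that strictly contain phrase i."""
    target = phrases[i]
    return sum(
        1
        for j, p in enumerate(phrases)
        if j != i
        and contains(p, target)
        and (p.start, p.end) != (target.start, target.end)
    )


# Tokens: ["the", "old", "man", "reads", "books"]
example = [
    Phrase(0, 5, "S"),
    Phrase(0, 3, "NP", head=2),  # depends on the VP
    Phrase(3, 5, "VP", head=0),  # depends on the sentence
    Phrase(4, 5, "NP", head=2),  # nested inside the VP
]
print(is_well_nested(example), nesting_depth(example, 3))
```

Storing `head` as a position in the phrase list keeps the dependency information separate from the span boundaries, so nesting and dependency can be checked independently.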
Related papers
- Greed is All You Need: An Evaluation of Tokenizer Inference Methods [4.300681074103876]
We provide a controlled analysis of seven tokenizer inference methods across four different algorithms and three vocabulary sizes.
We show that for the most commonly used tokenizers, greedy inference performs surprisingly well; and that SaGe, a recently-introduced contextually-informed tokenizer, outperforms all others on morphological alignment.
arXiv Detail & Related papers (2024-03-02T19:01:40Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
An extensive set of experiments is conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task.
We study three unsupervised approaches that rely on a masked language model.
Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
- DEIM: An effective deep encoding and interaction model for sentence matching [0.0]
We propose a sentence matching method based on deep encoding and interaction to extract deep semantic information.
In the encoder layer, we refer to the information of the other sentence while encoding a single sentence, and later use an algorithm to fuse the information.
In the interaction layer, we use a bidirectional attention mechanism and a self-attention mechanism to obtain deep semantic information.
arXiv Detail & Related papers (2022-03-20T07:59:42Z)
- Improving End-to-End Contextual Speech Recognition with Fine-grained Contextual Knowledge Selection [21.116123328330467]
This work focuses on mitigating confusion problems with fine-grained contextual knowledge selection (FineCoS).
We first apply phrase selection to narrow the range of phrase candidates, and then conduct token attention on the tokens in the selected phrase candidates.
We re-normalize the attention weights of most relevant phrases in inference to obtain more focused phrase-level contextual representations.
arXiv Detail & Related papers (2022-01-30T13:08:16Z)
- Extracting Grammars from a Neural Network Parser for Anomaly Detection in Unknown Formats [79.6676793507792]
Reinforcement learning has recently shown promise as a technique for training an artificial neural network to parse sentences in some unknown format.
This paper presents procedures for extracting production rules from the neural network, and for using these rules to determine whether a given sentence is nominal or anomalous.
arXiv Detail & Related papers (2021-07-30T23:10:24Z)
- A Novel Word Sense Disambiguation Approach Using WordNet Knowledge Graph [0.0]
This paper presents a knowledge-based word sense disambiguation algorithm, namely Sequential Contextual Similarity Matrix Multiplication (SCSMM).
The SCSMM algorithm combines semantic similarity, knowledge, and document context to respectively exploit the merits of local context.
The proposed algorithm outperformed all other algorithms when disambiguating nouns on the combined gold standard datasets.
arXiv Detail & Related papers (2021-01-08T06:47:32Z)
- A Comparative Study on Structural and Semantic Properties of Sentence Embeddings [77.34726150561087]
We propose a set of experiments using a widely-used large-scale data set for relation extraction.
We show that different embedding spaces have different degrees of strength for the structural and semantic properties.
These results provide useful information for developing embedding-based relation extraction methods.
arXiv Detail & Related papers (2020-09-23T15:45:32Z)
- Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms [95.73230376153872]
This paper studies the relationship between generalization and privacy preservation in iterative learning algorithms by two sequential steps.
We prove that $(\varepsilon, \delta)$-differential privacy implies an on-average generalization bound for multi-Database learning algorithms.
We then investigate how the iterative nature shared by most learning algorithms influences privacy preservation and further generalization.
arXiv Detail & Related papers (2020-07-18T09:12:03Z)
- Research on Annotation Rules and Recognition Algorithm Based on Phrase Window [4.334276223622026]
We propose labeling rules based on phrase windows and design corresponding phrase recognition algorithms.
The labeling rule uses phrases as the minimum unit, divides sentences into seven nestable phrase types, and marks the grammatical dependencies between phrases.
The corresponding algorithm, drawing on the idea of identifying the target area in the image field, can find the start and end positions of various phrases in the sentence.
arXiv Detail & Related papers (2020-07-07T00:19:47Z)
- Learning Coupled Policies for Simultaneous Machine Translation using Imitation Learning [85.70547744787]
We present an approach to efficiently learn a simultaneous translation model with coupled programmer-interpreter policies.
Experiments on six language-pairs show our method outperforms strong baselines in terms of translation quality.
arXiv Detail & Related papers (2020-02-11T10:56:42Z)