Method of noun phrase detection in Ukrainian texts
- URL: http://arxiv.org/abs/2010.11548v1
- Date: Thu, 22 Oct 2020 09:20:24 GMT
- Title: Method of noun phrase detection in Ukrainian texts
- Authors: S.D. Pogorilyy, A.A. Kramov
- Abstract summary: The study of noun phrase detection in Ukrainian texts is still at an early stage.
A combined method of noun phrase detection in Ukrainian texts that uses Universal Dependencies tools and a named-entity recognition model has been proposed.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Introduction. Natural language processing deals with AI-complete
tasks that cannot be solved with traditional algorithmic approaches. Such tasks
are commonly addressed with machine learning methods and the tools of
computational linguistics. One of the text preprocessing tasks is the
detection of noun phrases. The accuracy of this task affects the
effectiveness of many other natural language processing tasks.
Despite active research in natural language processing, the study of
noun phrase detection in Ukrainian texts is still at an early stage.
Results. Different methods of noun phrase detection have been analyzed.
The expediency of representing sentences as a tree structure has been
justified. The key disadvantage of many noun phrase detection methods is
that their effectiveness depends heavily on the features of a particular
language. Taking into account the unified format of sentence processing
and the availability of a trained model for building sentence trees for
Ukrainian texts, the Universal Dependencies model has been chosen. A
combined method of noun phrase detection in Ukrainian texts that uses
Universal Dependencies tools and a named-entity recognition model has been
proposed. The effectiveness of the proposed method has been verified
experimentally on a corpus of Ukrainian news. Different accuracy metrics
of the method have been calculated.
Conclusions. The results obtained indicate that the proposed method can be
used to find noun phrases in Ukrainian texts. The accuracy of the
method can be increased by using named-entity recognition
models appropriate to the subject area.
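The idea in the abstract, reading noun phrases off a dependency tree and combining them with named-entity spans, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Token` class, the `NP_INTERNAL` relation set, the hand-annotated example sentence, and the rule of merging overlapping spans by union are all assumptions made for illustration. In practice the annotations would come from a Universal Dependencies parser and a named-entity recognition model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int     # 1-based position in the sentence (as in CoNLL-U)
    form: str    # surface form
    upos: str    # Universal POS tag
    head: int    # idx of the syntactic head, 0 for the root
    deprel: str  # Universal Dependencies relation to the head


# Assumption: relations treated as noun-phrase-internal in this sketch.
NP_INTERNAL = {"amod", "det", "nummod", "nmod", "flat", "flat:name", "compound"}
NOMINAL = {"NOUN", "PROPN"}


def collect_members(tokens, head_idx):
    """Collect the head and all dependents reachable via NP-internal relations."""
    members = {head_idx}
    stack = [head_idx]
    while stack:
        current = stack.pop()
        for tok in tokens:
            if tok.head == current and tok.deprel in NP_INTERNAL:
                members.add(tok.idx)
                stack.append(tok.idx)
    return members


def noun_phrase_spans(tokens):
    """Return (start, end) spans for nominal heads not nested in another phrase."""
    spans = []
    for tok in tokens:
        if tok.upos in NOMINAL and tok.deprel not in NP_INTERNAL:
            members = collect_members(tokens, tok.idx)
            spans.append((min(members), max(members)))
    return sorted(spans)


def merge_spans(dep_spans, ner_spans):
    """Union overlapping spans from both sources (an assumed combination rule)."""
    merged = []
    for start, end in sorted(set(dep_spans) | set(ner_spans)):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def span_text(tokens, span):
    """Join the surface forms covered by an inclusive (start, end) span."""
    start, end = span
    return " ".join(tok.form for tok in tokens if start <= tok.idx <= end)


# Hand-annotated example (hypothetical parse):
# "Kyiv National University announced a new enrollment of students."
SENTENCE = [
    Token(1, "Київський", "ADJ", 3, "amod"),
    Token(2, "національний", "ADJ", 3, "amod"),
    Token(3, "університет", "NOUN", 4, "nsubj"),
    Token(4, "оголосив", "VERB", 0, "root"),
    Token(5, "новий", "ADJ", 6, "amod"),
    Token(6, "набір", "NOUN", 4, "obj"),
    Token(7, "студентів", "NOUN", 6, "nmod"),
]

if __name__ == "__main__":
    for span in noun_phrase_spans(SENTENCE):
        print(span, span_text(SENTENCE, span))
```

Spans are inclusive token positions, so a named-entity span reported by a separate model can be passed to `merge_spans` together with the dependency-based spans. How the two sources are actually reconciled in the paper is not stated in the abstract.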
Related papers
- Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text [61.22649031769564]
We propose a novel framework, paraphrased text span detection (PTD).
PTD aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z)
- Design and Implementation of a Tool for Extracting Uzbek Syllables [0.0]
Syllabification is a versatile linguistic tool with applications in linguistic research, language technology, education, and various fields.
We present a comprehensive approach to syllabification for the Uzbek language, including rule-based techniques and machine learning algorithms.
The results of our experiments show that both approaches achieved a high level of accuracy, exceeding 99%.
arXiv Detail & Related papers (2023-12-25T17:46:58Z)
- An Adversarial Multi-Task Learning Method for Chinese Text Correction with Semantic Detection [0.0]
An adversarial multi-task learning method is proposed to enhance the modeling and detection ability of character polysemy in Chinese sentence context.
A Monte Carlo tree search strategy and a policy network are introduced to accomplish the efficient Chinese text correction task with semantic detection.
arXiv Detail & Related papers (2023-06-28T15:46:00Z)
- Textual Entailment Recognition with Semantic Features from Empirical Text Representation [60.31047947815282]
A text entails a hypothesis if and only if the truth of the hypothesis follows from the text.
In this paper, we propose a novel approach to identifying the textual entailment relationship between text and hypothesis.
We employ an element-wise Manhattan distance vector-based feature that can identify the semantic entailment relationship between the text-hypothesis pair.
arXiv Detail & Related papers (2022-10-18T10:03:51Z)
- On Decoding Strategies for Neural Text Generators [73.48162198041884]
We study the interaction between language generation tasks and decoding strategies.
We measure changes in attributes of generated text as a function of both decoding strategy and task.
Our results reveal both previously-observed and surprising findings.
arXiv Detail & Related papers (2022-03-29T16:25:30Z)
- DEIM: An effective deep encoding and interaction model for sentence matching [0.0]
We propose a sentence matching method based on deep encoding and interaction to extract deep semantic information.
In the encoder layer, we refer to the information of the other sentence while encoding a single sentence, and then use an algorithm to fuse the information.
In the interaction layer, we use a bidirectional attention mechanism and a self-attention mechanism to obtain deep semantic information.
arXiv Detail & Related papers (2022-03-20T07:59:42Z)
- Method of the coherence evaluation of Ukrainian text [0.0]
Methods for measuring text coherence for the Ukrainian language are analyzed.
Training and examination procedures are carried out on the corpus of Ukrainian texts.
The test procedure is implemented by performing two typical tasks for text coherence assessment.
arXiv Detail & Related papers (2020-10-31T16:48:55Z)
- Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale [52.663117551150954]
A few popular metrics remain as the de facto metrics to evaluate tasks such as image captioning and machine translation.
This is partly due to ease of use, and partly because researchers expect to see them and know how to interpret them.
In this paper, we urge the community to consider more carefully how they automatically evaluate their models.
arXiv Detail & Related papers (2020-10-26T13:57:20Z)
- Unsupervised Text Generation by Learning from Search [86.51619839836331]
TGLS is a novel framework for unsupervised Text Generation by Learning from Search.
We demonstrate the effectiveness of TGLS on two real-world natural language generation tasks, paraphrase generation and text formalization.
arXiv Detail & Related papers (2020-07-09T04:34:48Z)
- On Vocabulary Reliance in Scene Text Recognition [79.21737876442253]
Methods perform well on images with words within vocabulary but generalize poorly to images with words outside vocabulary.
We call this phenomenon "vocabulary reliance".
We propose a simple yet effective mutual learning strategy to allow models of two families to learn collaboratively.
arXiv Detail & Related papers (2020-05-08T11:16:58Z)