Improving Joint Layer RNN based Keyphrase Extraction by Using
Syntactical Features
- URL: http://arxiv.org/abs/2009.07119v1
- Date: Tue, 15 Sep 2020 14:20:04 GMT
- Title: Improving Joint Layer RNN based Keyphrase Extraction by Using
Syntactical Features
- Authors: Miftahul Mahfuzh, Sidik Soleman, Ayu Purwarianti
- Abstract summary: We propose to modify the input layer of JRNN to extract more than one sequence of keywords.
Our method achieved .9597 in accuracy and .7691 in F1.
- Score: 0.6724914680904501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Keyphrase extraction, the task of identifying important words or phrases in a
text, is a crucial step in identifying main topics when analyzing texts from a
social media platform. In our study, we focus on text written in the Indonesian
language and taken from Twitter. Unlike the original joint layer recurrent
neural network (JRNN), which outputs one sequence of keywords and uses only
word embeddings, we propose modifying the input layer of JRNN to extract
more than one sequence of keywords by adding syntactical
features, namely part of speech, named entity types, and dependency structures.
Since JRNN in general requires a large amount of training data
and creating those examples is expensive, we used a data augmentation method to
increase the number of training examples. Our experiments showed that our
method outperformed the baseline methods. Our method achieved .9597 in accuracy
and .7691 in F1.
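The input-layer change described in the abstract can be pictured as appending syntactic features to each token's word embedding before the sequence reaches the recurrent layers. The following is a minimal sketch under stated assumptions, not the authors' implementation: the tag inventories, the one-hot encoding, the function names, and the toy embeddings and annotations are all hypothetical, since the abstract does not specify how the features are encoded.

```python
# Hypothetical tag inventories; the paper's actual tag sets are not given in the abstract.
POS_TAGS = ["NOUN", "VERB", "ADJ", "PROPN", "OTHER"]
NE_TYPES = ["O", "PER", "LOC", "ORG"]
DEP_RELS = ["root", "nsubj", "obj", "amod", "other"]


def one_hot(value, inventory):
    """Return a one-hot list for value; values outside the inventory map to all zeros."""
    return [1.0 if value == item else 0.0 for item in inventory]


def build_input_vector(word_vec, pos, ne, dep):
    """Concatenate a word embedding with one-hot POS, NE-type, and dependency features."""
    return (
        list(word_vec)
        + one_hot(pos, POS_TAGS)
        + one_hot(ne, NE_TYPES)
        + one_hot(dep, DEP_RELS)
    )


def build_sequence(tokens, embeddings, dim):
    """Build one input vector per token; out-of-vocabulary words get a zero embedding."""
    zero = [0.0] * dim
    return [
        build_input_vector(
            embeddings.get(tok["word"], zero), tok["pos"], tok["ne"], tok["dep"]
        )
        for tok in tokens
    ]


# Toy data: 3-dimensional embeddings and annotations are made up for illustration.
toy_embeddings = {"jakarta": [0.1, 0.2, 0.3], "banjir": [0.4, 0.5, 0.6]}
toy_tokens = [
    {"word": "jakarta", "pos": "PROPN", "ne": "LOC", "dep": "nsubj"},
    {"word": "banjir", "pos": "NOUN", "ne": "O", "dep": "root"},
]
inputs = build_sequence(toy_tokens, toy_embeddings, dim=3)
print(len(inputs), len(inputs[0]))  # 2 tokens, each 3 + 5 + 4 + 5 = 17 values
```

In this sketch each token vector grows from 3 to 17 dimensions. A learned embedding per feature type would be an equally plausible encoding; the abstract does not say which one the authors used.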
Related papers
- Data Augmentation for Low-Resource Keyphrase Generation [46.52115499306222]
Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases).
Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire.
We present data augmentation strategies specifically to address keyphrase generation in purely resource-constrained domains.
arXiv Detail & Related papers (2023-05-29T09:20:34Z)
- Improving Keyphrase Extraction with Data Augmentation and Information
Filtering [67.43025048639333]
Keyphrase extraction is one of the essential tasks for document understanding in NLP.
We present a novel corpus and method for keyphrase extraction from the videos streamed on the Behance platform.
arXiv Detail & Related papers (2022-09-11T22:38:02Z)
- Learning Rich Representation of Keyphrases from Text [12.698835743464313]
We show how to learn task-specific language models aimed towards learning rich representation of keyphrases from text documents.
In the discriminative setting, we introduce a new pre-training objective - Keyphrase Boundary Infilling with Replacement (KBIR)
In the generative setting, we introduce a new pre-training setup for BART - KeyBART, that reproduces the keyphrases related to the input text in the CatSeq format.
arXiv Detail & Related papers (2021-12-16T01:09:51Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g. a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Phraseformer: Multimodal Key-phrase Extraction using Transformer and
Graph Embedding [3.7110020502717616]
We develop a multimodal Key-phrase extraction approach, namely Phraseformer, using transformer and graph embedding techniques.
In Phraseformer, each keyword candidate is represented by a vector that concatenates the text and structure learning representations.
We analyze the performance of Phraseformer on three datasets including Inspec, SemEval2010 and SemEval 2017 by F1-score.
arXiv Detail & Related papers (2021-06-09T09:32:17Z)
- Semi-supervised URL Segmentation with Recurrent Neural Networks
Pre-trained on Knowledge Graph Entities [8.855143852360328]
We show the effectiveness of a tagging model based on Recurrent Neural Networks (RNNs) using characters as input.
To compensate for the lack of training data, we propose a pre-training method on synthesised entity names in a large knowledge database.
arXiv Detail & Related papers (2020-11-05T23:31:00Z)
- Be More with Less: Hypergraph Attention Networks for Inductive Text
Classification [56.98218530073927]
Graph neural networks (GNNs) have received increasing attention in the research community and demonstrated their promising results on this canonical task.
Despite the success, their performance could be largely jeopardized in practice since they are unable to capture high-order interaction between words.
We propose a principled model -- hypergraph attention networks (HyperGAT) which can obtain more expressive power with less computational consumption for text representation learning.
arXiv Detail & Related papers (2020-11-01T00:21:59Z)
- Keyphrase Extraction with Dynamic Graph Convolutional Networks and
Diversified Inference [50.768682650658384]
Keyphrase extraction (KE) aims to summarize a set of phrases that accurately express a concept or a topic covered in a given document.
The recent Sequence-to-Sequence (Seq2Seq) based generative framework is widely used in the KE task and has obtained competitive performance on various benchmarks.
In this paper, we propose to adopt the Dynamic Graph Convolutional Networks (DGCN) to solve the above two problems simultaneously.
arXiv Detail & Related papers (2020-10-24T08:11:23Z)
- Select, Extract and Generate: Neural Keyphrase Generation with
Layer-wise Coverage Attention [75.44523978180317]
We propose SEG-Net, a neural keyphrase generation model that is composed of two major components.
The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin.
arXiv Detail & Related papers (2020-08-04T18:00:07Z)
- Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents.
There are three approaches to keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks.
In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens.
arXiv Detail & Related papers (2020-02-13T09:48:31Z)