FRAKE: Fusional Real-time Automatic Keyword Extraction
- URL: http://arxiv.org/abs/2104.04830v1
- Date: Sat, 10 Apr 2021 18:30:17 GMT
- Title: FRAKE: Fusional Real-time Automatic Keyword Extraction
- Authors: Aidin Zehtab-Salmasi, Mohammad-Reza Feizi-Derakhshi, Mohamad-Ali Balafar
- Abstract summary: Keyword extraction is the task of identifying the words or phrases that best express the main concepts of a text.
We use a combined approach that consists of two models: graph centrality features and textual features.
- Score: 1.332091725929965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Keyword extraction is the task of identifying the words or phrases
that best express the main concepts of a text. A huge amount of text is created
every day through electronic infrastructure, so it is practically impossible
for humans to study and manage this volume of documents. Nevertheless, the need
for efficient and effective access to these documents is evident for many
purposes. Weblogs, news, and technical notes are mostly long texts, while
readers want to grasp the concepts through topics or keywords before deciding
whether to read the full text. To this end, we use a combined approach that
consists of two models: graph centrality features and textual features. Graph
centralities such as degree, betweenness, eigenvector, and closeness centrality
are combined optimally to extract the best keywords among the candidate
keywords produced by the proposed method. In addition, another approach is
introduced to distinguish keywords among candidate phrases and treat them as
separate keywords. To evaluate the proposed method, seven datasets are used
(SemEval2010, SemEval2017, Inspec, fao30, Thesis100, pak2018, and WikiNews), and
the results are reported in terms of Precision, Recall, and F-measure.
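As a rough illustration of the graph-centrality idea described in the abstract, the Python sketch below builds a word co-occurrence graph, computes degree, betweenness, eigenvector, and closeness centrality, and averages the normalized scores to rank candidate words. It is not the FRAKE implementation: the abstract does not say how the centralities are combined or which textual features are used, so the co-occurrence window, the stopword list, the equal-weight average, and every function name here are assumptions made for illustration. A small helper for Precision, Recall, and F-measure is included as well.

```python
import re
from collections import defaultdict, deque

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "are", "by", "for", "in", "is", "of", "or", "the", "to"}

def build_graph(text, window=2):
    """Link every pair of distinct non-stopwords within `window` positions."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    graph = defaultdict(set)
    for i, w in enumerate(words):
        for v in words[i + 1 : i + 1 + window]:
            if v != w:
                graph[w].add(v)
                graph[v].add(w)
    return dict(graph)

def degree(graph):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(graph)
    return {v: len(nb) / max(n - 1, 1) for v, nb in graph.items()}

def closeness(graph):
    """Number of reachable nodes divided by the sum of BFS distances to them."""
    out = {}
    for s in graph:
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        out[s] = (len(dist) - 1) / total if total else 0.0
    return out

def eigenvector(graph, iterations=100):
    """Power iteration on A + I; the identity shift avoids oscillation."""
    x = dict.fromkeys(graph, 1.0)
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[u] for u in graph[v]) for v in graph}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {v: val / norm for v, val in nxt.items()}
    return x

def betweenness(graph):
    """Brandes' algorithm, unnormalized (each unordered pair is counted twice)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)
        sigma[s] = 1
        dist, queue = {s: 0}, deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

def rank_keywords(text, window=2, top_k=5):
    """Average the four max-normalized centralities (equal weights: an assumption)."""
    graph = build_graph(text, window)
    combined = dict.fromkeys(graph, 0.0)
    for measure in (degree, betweenness, eigenvector, closeness):
        scores = measure(graph)
        peak = max(scores.values(), default=0.0) or 1.0
        for v, s in scores.items():
            combined[v] += s / peak / 4
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

def precision_recall_f1(predicted, gold):
    """Exact-match Precision, Recall, and F-measure over keyword sets."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    p = hits / len(predicted) if predicted else 0.0
    r = hits / len(gold) if gold else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```

Calling `rank_keywords(text)` returns a list of `(word, score)` pairs sorted by combined score. The words from that list can be passed, together with a gold keyword list, to `precision_recall_f1`. A real system would also need candidate phrase extraction and the textual features mentioned in the abstract, which this sketch leaves out.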
Related papers
- ROUGE-K: Do Your Summaries Have Keywords? [11.393728547335217]
Keywords, that is, content-relevant words in summaries, play an important role in efficient information conveyance.
Existing evaluation metrics for extreme summarization models do not pay explicit attention to keywords in summaries.
We propose four approaches for incorporating word importance into a transformer-based model.
arXiv Detail & Related papers (2024-03-08T09:54:56Z)
- Unsupervised extraction of local and global keywords from a single text [0.0]
We propose an unsupervised, corpus-independent method to extract keywords from a single text.
It is based on the spatial distribution of words and the response of this distribution to a random permutation of words.
arXiv Detail & Related papers (2023-07-26T07:36:25Z)
- Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening [53.1711708318581]
Current image-text retrieval methods suffer from $N$-related time complexity.
This paper presents a simple and effective keyword-guided pre-screening framework for image-text retrieval.
arXiv Detail & Related papers (2023-03-14T09:36:42Z)
- DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection [118.36746273425354]
This paper presents a paralleled visual-concept pre-training method for open-world detection by resorting to knowledge enrichment from a designed concept dictionary.
By enriching the concepts with their descriptions, we explicitly build the relationships among various concepts to facilitate open-domain learning.
The proposed framework demonstrates strong zero-shot detection performance; e.g., on the LVIS dataset, our DetCLIP-T outperforms GLIP-T by 9.9% mAP and obtains a 13.5% improvement on rare categories.
arXiv Detail & Related papers (2022-09-20T02:01:01Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents [51.744527199305445]
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained end to end, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- LDKP: A Dataset for Identifying Keyphrases from Long Scientific Documents [48.84086818702328]
Identifying keyphrases (KPs) from text documents is a fundamental task in natural language processing and information retrieval.
The vast majority of benchmark datasets for this task come from the scientific domain and contain only the document title and abstract.
This presents three challenges for real-world applications: human-written summaries are unavailable for most documents, the documents are almost always long, and a high percentage of KPs are directly found beyond the limited context of title and abstract.
arXiv Detail & Related papers (2022-03-29T08:44:57Z)
- Deep Keyphrase Completion [59.0413813332449]
Keyphrases provide accurate information about document content that is highly compact, concise, and full of meaning, and they are widely used for discourse comprehension, organization, and text retrieval.
We propose keyphrase completion (KPC) to generate more keyphrases for a document (e.g., a scientific publication), taking advantage of the document content along with a very limited number of known keyphrases.
We name it deep keyphrase completion (DKPC) since it attempts to capture the deep semantic meaning of the document content together with known keyphrases via a deep learning framework.
arXiv Detail & Related papers (2021-10-29T07:15:35Z)
- Phraseformer: Multimodal Key-phrase Extraction using Transformer and Graph Embedding [3.7110020502717616]
We develop a multimodal Key-phrase extraction approach, namely Phraseformer, using transformer and graph embedding techniques.
In Phraseformer, each keyword candidate is represented by a vector that is the concatenation of the text and structure learning representations.
We analyze the performance of Phraseformer on three datasets, Inspec, SemEval2010, and SemEval2017, using F1-score.
arXiv Detail & Related papers (2021-06-09T09:32:17Z)
- A Joint Learning Approach based on Self-Distillation for Keyphrase Extraction from Scientific Documents [29.479331909227998]
Keyphrase extraction is the task of extracting a small set of phrases that best describe a document.
Most existing benchmark datasets for the task typically have limited numbers of annotated documents.
We propose a simple and efficient joint learning approach based on the idea of self-distillation.
arXiv Detail & Related papers (2020-10-22T18:36:31Z)
- Keywords lie far from the mean of all words in local vector space [5.040463208115642]
In this work, we follow a different path to detect the keywords from a text document by modeling the main distribution of the document's words using local word vector representations.
We confirm the high performance of our approach compared to strong baselines and state-of-the-art unsupervised keyword extraction methods.
arXiv Detail & Related papers (2020-08-21T14:42:33Z)
- TRIE: End-to-End Text Reading and Information Extraction for Document Understanding [56.1416883796342]
We propose a unified end-to-end text reading and information extraction network.
Multimodal visual and textual features of text reading are fused for information extraction.
Our proposed method significantly outperforms the state-of-the-art methods in both efficiency and accuracy.
arXiv Detail & Related papers (2020-05-27T01:47:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.