Keyword Extraction in Scientific Documents
- URL: http://arxiv.org/abs/2207.01888v2
- Date: Thu, 7 Jul 2022 08:39:49 GMT
- Title: Keyword Extraction in Scientific Documents
- Authors: Susie Xi Rao, Piriyakorn Piriyatamwong, Parijat Ghoshal, Sara
Nasirian, Emmanuel de Salis, Sandra Mitrović, Michael Wechner, Vanya
Brucker, Peter Egger and Ce Zhang
- Abstract summary: Understanding scientific documents is an important step in downstream tasks such as knowledge graph building, text mining, and discipline classification.
In this workshop, we provide a better understanding of keyword and keyphrase extraction from the abstract of scientific publications.
- Score: 6.88201646115184
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The scientific publication output grows exponentially. Therefore, it is
increasingly challenging to keep track of trends and changes. Understanding
scientific documents is an important step in downstream tasks such as knowledge
graph building, text mining, and discipline classification. In this workshop,
we provide a better understanding of keyword and keyphrase extraction from the
abstract of scientific publications.
Related papers
- Enhancing Scientific Figure Captioning Through Cross-modal Learning [0.0]
The volume and diversity of scientific research data have surged, leading to an increase in the number and variety of charts.
This paper presents a novel approach to scientific chart title generation, demonstrating its effectiveness in improving the clarity and accessibility of research data.
arXiv Detail & Related papers (2024-06-24T18:08:19Z)
- SciDMT: A Large-Scale Corpus for Detecting Scientific Mentions [52.35520385083425]
We present SciDMT, an enhanced and expanded corpus for scientific mention detection.
The corpus consists of two components: 1) the SciDMT main corpus, which includes 48 thousand scientific articles with over 1.8 million weakly annotated mention annotations in the format of in-text span, and 2) an evaluation set, which comprises 100 scientific articles manually annotated for evaluation purposes.
arXiv Detail & Related papers (2024-06-20T22:03:21Z)
- The Semantic Scholar Open Data Platform [79.4493235243312]
Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.
We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction.
The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.
arXiv Detail & Related papers (2023-01-24T17:13:08Z)
- Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z)
- A Computational Inflection for Scientific Discovery [48.176406062568674]
We stand at the foot of a significant inflection in the trajectory of scientific discovery.
As society continues on its fast-paced digital transformation, so does humankind's collective scientific knowledge.
Computer science is poised to ignite a revolution in the scientific process itself.
arXiv Detail & Related papers (2022-05-04T11:36:54Z)
- Change Summarization of Diachronic Scholarly Paper Collections by Semantic Evolution Analysis [10.554831859741851]
We demonstrate a novel approach to analyze the collections of research papers published over longer time periods.
Our approach is based on comparing word semantic representations over time and aims to support users in a better understanding of large domain-focused archives of scholarly publications.
arXiv Detail & Related papers (2021-12-07T11:15:19Z)
- CitationIE: Leveraging the Citation Graph for Scientific Information Extraction [89.33938657493765]
We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
arXiv Detail & Related papers (2021-06-03T03:00:12Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
arXiv Detail & Related papers (2021-04-26T20:37:13Z)
- Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications.
Within this research work, we tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools.
We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
arXiv Detail & Related papers (2020-10-28T08:31:40Z)
- Attention: to Better Stand on the Shoulders of Giants [34.5017808610466]
This paper develops an attention mechanism for the long-term scientific impact prediction.
It validates the method based on a real large-scale citation data set.
arXiv Detail & Related papers (2020-05-27T00:25:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.