CitationIE: Leveraging the Citation Graph for Scientific Information
Extraction
- URL: http://arxiv.org/abs/2106.01560v1
- Date: Thu, 3 Jun 2021 03:00:12 GMT
- Title: CitationIE: Leveraging the Citation Graph for Scientific Information
Extraction
- Authors: Vijay Viswanathan, Graham Neubig, Pengfei Liu
- Abstract summary: We use the citation graph of referential links between citing and cited papers.
We observe a sizable improvement in end-to-end information extraction over the state-of-the-art.
- Score: 89.33938657493765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatically extracting key information from scientific documents has the
potential to help scientists work more efficiently and accelerate the pace of
scientific progress. Prior work has considered extracting document-level entity
clusters and relations end-to-end from raw scientific text, which can improve
literature search and help identify methods and materials for a given problem.
Despite the importance of this task, most existing works on scientific
information extraction (SciIE) consider extraction solely based on the content
of an individual paper, without considering the paper's place in the broader
literature. In contrast to prior work, we augment our text representations by
leveraging a complementary source of document context: the citation graph of
referential links between citing and cited papers. On a test set of
English-language scientific documents, we show that simple ways of utilizing
the structure and content of the citation graph can each lead to significant
gains in different scientific information extraction tasks. When these tasks
are combined, we observe a sizable improvement in end-to-end information
extraction over the state-of-the-art, suggesting the potential for future work
along this direction. We release software tools to facilitate citation-aware
SciIE development.
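One simple way to picture "augmenting text representations with the citation graph" is sketched below. This is an illustrative assumption, not the authors' released method: the function names (`neighbors`, `mean_vector`, `augment`), the toy vectors, and the choice of averaging neighbor vectors and concatenating them are all hypothetical. Each paper's own text vector is joined with the mean vector of the papers it cites or is cited by.

```python
from typing import Dict, List

Vector = List[float]


def neighbors(paper_id: str, citations: Dict[str, List[str]]) -> List[str]:
    """Return papers linked to paper_id in either direction (cited or citing)."""
    cited = set(citations.get(paper_id, []))
    citing = {src for src, targets in citations.items() if paper_id in targets}
    return sorted((cited | citing) - {paper_id})


def mean_vector(vectors: List[Vector], dim: int) -> Vector:
    """Element-wise mean; a zero vector when there are no neighbors."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def augment(paper_id: str,
            text_vecs: Dict[str, Vector],
            citations: Dict[str, List[str]]) -> Vector:
    """Concatenate a paper's text vector with the mean of its neighbors' vectors."""
    own = text_vecs[paper_id]
    context = [text_vecs[n] for n in neighbors(paper_id, citations) if n in text_vecs]
    return own + mean_vector(context, len(own))


# Toy data (hypothetical): A cites B and C; D cites A; E is isolated.
citations = {"A": ["B", "C"], "D": ["A"]}
text_vecs = {
    "A": [1.0, 0.0],
    "B": [0.0, 2.0],
    "C": [2.0, 0.0],
    "D": [4.0, 4.0],
    "E": [1.0, 1.0],
}

print(augment("A", text_vecs, citations))  # [1.0, 0.0, 2.0, 2.0]
```

A paper with no citation links falls back to a zero context vector, so the augmented representation always has the same length and a downstream extraction model can consume it unchanged.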
Related papers
- Context-Enhanced Language Models for Generating Multi-Paper Citations [35.80247519023821]
We propose a method that leverages Large Language Models (LLMs) to generate multi-citation sentences.
Our approach involves a single source paper and a collection of target papers, culminating in a coherent paragraph containing multi-sentence citation text.
(2024-04-22T04:30:36Z)
- An approach based on Open Research Knowledge Graph for Knowledge
Acquisition from scientific papers [4.8951183832371]
Open Research Knowledge Graph (ORKG) is a computer-assisted tool for organizing key insights extracted from research papers.
It is currently used to document the "food information engineering", "Tabular data to Knowledge Graph Matching", and "Question Answering" research problems and the "Neuro-symbolic AI" domain.
(2023-08-23T20:05:42Z)
- CiteBench: A benchmark for Scientific Citation Text Generation [69.37571393032026]
CiteBench is a benchmark for citation text generation.
We make the code for CiteBench publicly available at https://github.com/UKPLab/citebench.
(2022-12-19T16:10:56Z)
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that the citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
(2022-12-08T11:53:12Z)
- Keyword Extraction in Scientific Documents [6.88201646115184]
Understanding scientific documents is an important step in downstream tasks such as knowledge graph building, text mining, and discipline classification.
In this workshop, we provide a better understanding of keyword and keyphrase extraction from the abstract of scientific publications.
(2022-07-05T08:33:47Z)
- Semantic Analysis for Automated Evaluation of the Potential Impact of
Research Articles [62.997667081978825]
This paper presents a novel method for vector representation of text meaning based on information theory.
We show how this informational semantics is used for text classification on the basis of the Leicester Scientific Corpus.
We show that an informational approach to representing the meaning of a text has offered a way to effectively predict the scientific impact of research papers.
(2021-04-26T20:37:13Z)
- Enhancing Scientific Papers Summarization with Citation Graph [78.65955304229863]
We redefine the task of scientific paper summarization by utilizing each paper's citation graph.
We construct a novel scientific paper summarization dataset, Semantic Scholar Network (SSN), which contains 141K research papers in different domains.
Our model can achieve competitive performance when compared with the pretrained models.
(2021-04-07T11:13:35Z)
- Generating Knowledge Graphs by Employing Natural Language Processing and
Machine Learning Techniques within the Scholarly Domain [1.9004296236396943]
We present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications.
In this work, we tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools.
We generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain.
(2020-10-28T08:31:40Z)
- Explaining Relationships Between Scientific Documents [55.23390424044378]
We address the task of explaining relationships between two scientific documents using natural language text.
In this paper we establish a dataset of 622K examples from 154K documents.
(2020-02-02T03:54:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.