Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset
and Transformer-Based Results
- URL: http://arxiv.org/abs/2109.10453v1
- Date: Tue, 21 Sep 2021 22:54:09 GMT
- Title: Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset
and Transformer-Based Results
- Authors: Ian H. Magnusson and Scott E. Friedman
- Abstract summary: We build SciClaim, a dataset of scientific claims drawn from Social and Behavior Science (SBS), PubMed, and CORD-19 papers.
Our novel graph annotation schema incorporates not only coarse-grained entity spans as nodes and relations as edges between them, but also fine-grained attributes that modify entities and their relations.
By including more label types and more than twice the label density of previous datasets, SciClaim captures causal, comparative, predictive, statistical, and proportional associations over experimental variables along with their qualifications, subtypes, and evidence.
- Score: 0.5710971447109948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent transformer-based approaches demonstrate promising results on
relational scientific information extraction. Existing datasets focus on
high-level description of how research is carried out. Instead we focus on the
subtleties of how experimental associations are presented by building SciClaim,
a dataset of scientific claims drawn from Social and Behavior Science (SBS),
PubMed, and CORD-19 papers. Our novel graph annotation schema incorporates not
only coarse-grained entity spans as nodes and relations as edges between them,
but also fine-grained attributes that modify entities and their relations, for
a total of 12,738 labels in the corpus. By including more label types and more
than twice the label density of previous datasets, SciClaim captures causal,
comparative, predictive, statistical, and proportional associations over
experimental variables along with their qualifications, subtypes, and evidence.
We extend work in transformer-based joint entity and relation extraction to
effectively infer our schema, showing the promise of fine-grained knowledge
graphs in scientific claims and beyond.
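The schema described in the abstract (entity spans as nodes, relations as edges, and attributes attached to both) can be pictured with a small data structure. The following is only an illustrative sketch: the class names, the label strings ("variable", "association", "argument", "causal"), and the example sentence are assumptions made up for this illustration and are not taken from the SciClaim dataset or its annotation guidelines.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """A coarse-grained entity span (a graph node). Hypothetical structure."""
    start: int                      # first token index of the span (inclusive)
    end: int                        # last token index of the span (exclusive)
    label: str                      # coarse-grained entity type
    attributes: set[str] = field(default_factory=set)  # fine-grained modifiers


@dataclass
class Relation:
    """A directed edge between two entities, addressed by entity index."""
    head: int
    tail: int
    label: str
    attributes: set[str] = field(default_factory=set)  # fine-grained modifiers


@dataclass
class ClaimGraph:
    """One annotated sentence: tokens plus its entity/relation graph."""
    tokens: list[str]
    entities: list[Entity] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)

    def span_text(self, index: int) -> str:
        """Return the surface text covered by the entity at `index`."""
        entity = self.entities[index]
        return " ".join(self.tokens[entity.start:entity.end])

    def label_count(self) -> int:
        """Count every label: entity types, relation types, and attributes."""
        attribute_total = sum(len(e.attributes) for e in self.entities)
        attribute_total += sum(len(r.attributes) for r in self.relations)
        return len(self.entities) + len(self.relations) + attribute_total


# Invented example sentence; labels below are placeholders, not SciClaim labels.
graph = ClaimGraph(tokens="Higher stress increases the risk of burnout".split())
graph.entities = [
    Entity(0, 2, "variable"),
    Entity(2, 3, "association", {"causal"}),
    Entity(3, 7, "variable"),
]
graph.relations = [
    Relation(head=1, tail=0, label="argument"),
    Relation(head=1, tail=2, label="argument"),
]

# Labels per token, one simple way to think about "label density".
density = graph.label_count() / len(graph.tokens)
```

In this sketch, attributes live on both nodes and edges, which mirrors the abstract's statement that fine-grained attributes modify entities and their relations. The labels-per-token ratio is just one plausible reading of "label density"; the paper may define it differently.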
Related papers
- SciER: An Entity and Relation Extraction Dataset for Datasets, Methods, and Tasks in Scientific Documents [49.54155332262579]
We release a new entity and relation extraction dataset for entities related to datasets, methods, and tasks in scientific articles.
Our dataset contains 106 manually annotated full-text scientific publications with over 24k entities and 12k relations.
arXiv Detail & Related papers (2024-10-28T15:56:49Z)
- The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges [101.83124435649358]
The homophily principle holds that nodes with the same labels or similar attributes are more likely to be connected.
Recent work has identified a non-trivial set of datasets where the performance of GNNs compared to that of NNs is not satisfactory.
arXiv Detail & Related papers (2024-07-12T18:04:32Z)
- Graph Relation Distillation for Efficient Biomedical Instance Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z)
- Predicting Scientific Impact Through Diffusion, Conformity, and Contribution Disentanglement [11.684776349325887]
Existing models typically rely on static graphs for citation count estimation.
We introduce a novel model, DPPDCC, which Disentangles the Potential impacts of Papers into Diffusion, Conformity, and Contribution values.
arXiv Detail & Related papers (2023-11-15T07:21:11Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- PcMSP: A Dataset for Scientific Action Graphs Extraction from Polycrystalline Materials Synthesis Procedure Text [1.9573380763700712]
This dataset simultaneously contains the synthesis sentences extracted from the experimental paragraphs, as well as the entity mentions and intra-sentence relations.
A two-step human annotation and inter-annotator agreement study guarantee the high quality of the PcMSP corpus.
We introduce four natural language processing tasks: sentence classification, named entity recognition, relation classification, and joint extraction of entities and relations.
arXiv Detail & Related papers (2022-10-22T09:43:54Z)
- Implications of Topological Imbalance for Representation Learning on Biomedical Knowledge Graphs [16.566710222582618]
We show how knowledge graph embedding models can be affected by structural imbalance.
We show how the graph topology can be perturbed to artificially alter the rank of a gene via random, biologically meaningless information.
arXiv Detail & Related papers (2021-12-13T11:20:36Z)
- Joint Biomedical Entity and Relation Extraction with Knowledge-Enhanced Collective Inference [42.255596963210564]
We present KECI, a novel framework that utilizes external knowledge for joint entity and relation extraction.
KECI takes a collective approach to link mention spans to entities by integrating global relational information into local representations.
Our experimental results show that the framework is highly effective, achieving new state-of-the-art results on two different benchmark datasets.
arXiv Detail & Related papers (2021-05-27T21:33:34Z)
- Hyperbolic Graph Embedding with Enhanced Semi-Implicit Variational Inference [48.63194907060615]
We build off of semi-implicit graph variational auto-encoders to capture higher-order statistics in a low-dimensional graph latent representation.
We incorporate hyperbolic geometry in the latent space through a Poincaré embedding to efficiently represent graphs exhibiting hierarchical structure.
arXiv Detail & Related papers (2020-10-31T05:48:34Z)
- HittER: Hierarchical Transformers for Knowledge Graph Embeddings [85.93509934018499]
We propose HittER to learn representations of entities and relations in a complex knowledge graph.
Experimental results show that HittER achieves new state-of-the-art results on multiple link prediction datasets.
We additionally propose a simple approach to integrate HittER into BERT and demonstrate its effectiveness on two Freebase factoid question answering datasets.
arXiv Detail & Related papers (2020-08-28T18:58:15Z)