Extracting Fine-Grained Knowledge Graphs of Scientific Claims: Dataset
and Transformer-Based Results
- URL: http://arxiv.org/abs/2109.10453v1
- Date: Tue, 21 Sep 2021 22:54:09 GMT
- Authors: Ian H. Magnusson and Scott E. Friedman
- Abstract summary: We build SciClaim, a dataset of scientific claims drawn from Social and Behavior Science (SBS), PubMed, and CORD-19 papers.
Our novel graph annotation schema incorporates not only coarse-grained entity spans as nodes and relations as edges between them, but also fine-grained attributes that modify entities and their relations.
By including more label types and more than twice the label density of previous datasets, SciClaim captures causal, comparative, predictive, statistical, and proportional associations over experimental variables along with their qualifications, subtypes, and evidence.
- Score: 0.5710971447109948
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent transformer-based approaches demonstrate promising results on
relational scientific information extraction. Existing datasets focus on
high-level description of how research is carried out. Instead we focus on the
subtleties of how experimental associations are presented by building SciClaim,
a dataset of scientific claims drawn from Social and Behavior Science (SBS),
PubMed, and CORD-19 papers. Our novel graph annotation schema incorporates not
only coarse-grained entity spans as nodes and relations as edges between them,
but also fine-grained attributes that modify entities and their relations, for
a total of 12,738 labels in the corpus. By including more label types and more
than twice the label density of previous datasets, SciClaim captures causal,
comparative, predictive, statistical, and proportional associations over
experimental variables along with their qualifications, subtypes, and evidence.
We extend work in transformer-based joint entity and relation extraction to
effectively infer our schema, showing the promise of fine-grained knowledge
graphs in scientific claims and beyond.
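The schema described in the abstract (entity spans as nodes, relations as edges, and attributes that modify either) can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, the label strings, the example sentence, and the per-token definition of label density are all hypothetical illustrations and are not taken from SciClaim or the authors' code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Union


@dataclass(frozen=True)
class Span:
    """A coarse-grained entity: a labeled token span that acts as a graph node."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)
    label: str  # hypothetical entity label


@dataclass(frozen=True)
class Relation:
    """A directed, labeled edge between two entity spans."""
    head: Span
    tail: Span
    label: str  # hypothetical relation label


Target = Union[Span, Relation]


@dataclass
class ClaimGraph:
    tokens: List[str]
    spans: List[Span] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)
    # Fine-grained attributes, keyed by the span or relation they modify.
    attributes: Dict[Target, Set[str]] = field(default_factory=dict)

    def add_attribute(self, target: Target, label: str) -> None:
        self.attributes.setdefault(target, set()).add(label)

    def label_count(self) -> int:
        """Total labels: one per span, one per relation, one per attribute."""
        n_attributes = sum(len(labels) for labels in self.attributes.values())
        return len(self.spans) + len(self.relations) + n_attributes

    def label_density(self) -> float:
        """Labels per token (an assumed definition of density)."""
        return self.label_count() / len(self.tokens)


# Hypothetical example sentence and labels, for illustration only.
tokens = "Higher stress predicts lower sleep quality".split()
stress = Span(1, 2, "variable")
predicts = Span(2, 3, "association")
sleep_quality = Span(4, 6, "variable")
subject_edge = Relation(predicts, stress, "arg0")
object_edge = Relation(predicts, sleep_quality, "arg1")

graph = ClaimGraph(
    tokens=tokens,
    spans=[stress, predicts, sleep_quality],
    relations=[subject_edge, object_edge],
)
graph.add_attribute(predicts, "predictive")  # attribute on an entity
graph.add_attribute(object_edge, "negative")  # attribute on a relation

print(graph.label_count(), round(graph.label_density(), 2))
```

Keeping attributes in a mapping keyed by their target, rather than as extra nodes, is one way to let the same attribute mechanism apply to both entities and relations; other encodings are equally plausible.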
Related papers
- The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges [101.83124435649358]
The homophily principle states that nodes with the same labels or similar attributes are more likely to be connected.
Recent work has identified a non-trivial set of datasets where the performance of GNNs compared to NNs is not satisfactory.
arXiv Detail & Related papers (2024-07-12T18:04:32Z)
- Extracting Protein-Protein Interactions (PPIs) from Biomedical Literature using Attention-based Relational Context Information [5.456047952635665]
This work presents a unified, multi-source PPI corpus with vetted interaction definitions augmented by binary interaction type labels.
A Transformer-based deep learning method exploits entities' relational context information for relation representation to improve relation classification performance.
The model's performance is evaluated on four widely studied biomedical relation extraction datasets.
arXiv Detail & Related papers (2024-03-08T01:43:21Z)
- Graph Relation Distillation for Efficient Biomedical Instance Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z)
- Enhancing Embedding Representations of Biomedical Data using Logic Knowledge [6.295638112781736]
In this paper, we exploit logic rules to enhance the embedding representations of knowledge graph models on the PharmKG dataset.
An R2N uses the available logic rules to build a neural architecture that reasons over KGE latent representations.
In the experiments, we show that our approach is able to significantly improve the current state-of-the-art on the PharmKG dataset.
arXiv Detail & Related papers (2023-03-23T13:38:21Z)
- Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning [82.93806087715507]
Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation.
Deep learning models have emerged as an efficient way to discover synergistic combinations.
Our framework achieves state-of-the-art results in comparison with other deep learning-based methods.
arXiv Detail & Related papers (2023-01-14T15:07:43Z)
- PcMSP: A Dataset for Scientific Action Graphs Extraction from Polycrystalline Materials Synthesis Procedure Text [1.9573380763700712]
This dataset simultaneously contains the synthesis sentences extracted from the experimental paragraphs, as well as the entity mentions and intra-sentence relations.
A two-step human annotation and inter-annotator agreement study guarantee the high quality of the PcMSP corpus.
We introduce four natural language processing tasks: sentence classification, named entity recognition, relation classification, and joint extraction of entities and relations.
arXiv Detail & Related papers (2022-10-22T09:43:54Z)
- Implications of Topological Imbalance for Representation Learning on Biomedical Knowledge Graphs [16.566710222582618]
We show how knowledge graph embedding models can be affected by structural imbalance.
We show how the graph topology can be perturbed to artificially alter the rank of a gene via random, biologically meaningless information.
arXiv Detail & Related papers (2021-12-13T11:20:36Z)
- Joint Biomedical Entity and Relation Extraction with Knowledge-Enhanced Collective Inference [42.255596963210564]
We present a novel framework that utilizes external knowledge for joint entity and relation extraction named KECI.
KECI takes a collective approach to link mention spans to entities by integrating global relational information into local representations.
Our experimental results show that the framework is highly effective, achieving new state-of-the-art results in two different benchmark datasets.
arXiv Detail & Related papers (2021-05-27T21:33:34Z)
- Hyperbolic Graph Embedding with Enhanced Semi-Implicit Variational Inference [48.63194907060615]
We build off of semi-implicit graph variational auto-encoders to capture higher-order statistics in a low-dimensional graph latent representation.
We incorporate hyperbolic geometry in the latent space through a Poincaré embedding to efficiently represent graphs exhibiting hierarchical structure.
arXiv Detail & Related papers (2020-10-31T05:48:34Z)
- Cross-Supervised Joint-Event-Extraction with Heterogeneous Information Networks [61.950353376870154]
Joint-event-extraction is a sequence-to-sequence labeling task with a tag set composed of tags of triggers and entities.
We propose a Cross-Supervised Mechanism (CSM) to alternately supervise the extraction of triggers or entities.
Our approach outperforms the state-of-the-art methods in both entity and trigger extraction.
arXiv Detail & Related papers (2020-10-13T11:51:17Z)
- HittER: Hierarchical Transformers for Knowledge Graph Embeddings [85.93509934018499]
We propose HittER to learn representations of entities and relations in a complex knowledge graph.
Experimental results show that HittER achieves new state-of-the-art results on multiple link prediction datasets.
We additionally propose a simple approach to integrate HittER into BERT and demonstrate its effectiveness on two Freebase factoid question answering datasets.
arXiv Detail & Related papers (2020-08-28T18:58:15Z)