Combining Language and Graph Models for Semi-structured Information Extraction on the Web
- URL: http://arxiv.org/abs/2402.14129v1
- Date: Wed, 21 Feb 2024 20:53:29 GMT
- Title: Combining Language and Graph Models for Semi-structured Information Extraction on the Web
- Authors: Zhi Hong, Kyle Chard and Ian Foster
- Abstract summary: We present GraphScholarBERT, an open-domain information extraction method based on a joint graph and language model structure.
Experiments show that GraphScholarBERT can improve extraction F1 scores by as much as 34.8% compared to previous work in a zero-shot domain and zero-shot website setting.
- Score: 7.44454462555094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Relation extraction is an efficient way of mining the extraordinary wealth of
human knowledge on the Web. Existing methods rely on domain-specific training
data or produce noisy outputs. We focus here on extracting targeted relations
from semi-structured web pages given only a short description of the relation.
We present GraphScholarBERT, an open-domain information extraction method based
on a joint graph and language model structure. GraphScholarBERT can generalize
to previously unseen domains without additional data or training and produces
only clean extraction results matched to the search keyword. Experiments show
that GraphScholarBERT can improve extraction F1 scores by as much as 34.8%
compared to previous work in a zero-shot domain and zero-shot website setting.
Related papers
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction [3.579132482505273]
Information extraction is an important task in Natural Language Processing (NLP).
We propose a novel approach to this task by formulating it as graph structure learning (GSL).
This formulation allows for better interaction and structure-informed decisions for entity and relation prediction.
arXiv Detail & Related papers (2024-04-18T20:09:37Z)
- Distantly Supervised Morpho-Syntactic Model for Relation Extraction [0.27195102129094995]
We present a method for the extraction and categorisation of an unrestricted set of relationships from text.
We evaluate our approach on six datasets built on Wikidata and Wikipedia.
arXiv Detail & Related papers (2024-01-18T14:17:40Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- Scientific Paper Extractive Summarization Enhanced by Citation Graphs [50.19266650000948]
We focus on leveraging citation graphs to improve scientific paper extractive summarization under different settings.
Preliminary results demonstrate that citation graph is helpful even in a simple unsupervised framework.
Motivated by this, we propose a Graph-based Supervised Summarization model (GSS) to achieve more accurate results on the task when large-scale labeled data are available.
arXiv Detail & Related papers (2022-12-08T11:53:12Z)
- A Graph-Enhanced Click Model for Web Search [67.27218481132185]
We propose a novel graph-enhanced click model (GraphCM) for web search.
We exploit both intra-session and inter-session information for the sparsity and cold-start problems.
arXiv Detail & Related papers (2022-06-17T08:32:43Z)
- MORE: A Metric Learning Based Framework for Open-domain Relation Extraction [25.149590577718996]
Open relation extraction (OpenRE) is the task of extracting relation schemes from open-domain corpora.
We propose a novel learning framework named MORE (Metric learning-based Open Relation Extraction).
arXiv Detail & Related papers (2022-06-01T07:51:20Z)
- D-REX: Dialogue Relation Extraction with Explanations [65.3862263565638]
This work focuses on extracting explanations that indicate that a relation exists while using only partially labeled data.
We propose our model-agnostic framework, D-REX, a policy-guided semi-supervised algorithm that explains and ranks relations.
We find that about 90% of the time, human annotators prefer D-REX's explanations over a strong BERT-based joint relation extraction and explanation model.
arXiv Detail & Related papers (2021-09-10T22:30:48Z)
- WebRED: Effective Pretraining And Finetuning For Relation Extraction On The Web [4.702325864333419]
WebRED is a strongly-supervised human annotated dataset for extracting relationships from text found on the World Wide Web.
We show that combining pre-training on a large weakly supervised dataset with fine-tuning on a small strongly-supervised dataset leads to better relation extraction performance.
arXiv Detail & Related papers (2021-02-18T23:56:12Z)
- ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages [66.45377533562417]
We propose a solution for "zero-shot" open-domain relation extraction from webpages with a previously unseen template.
Our model uses a graph neural network-based approach to build a rich representation of text fields on a webpage.
arXiv Detail & Related papers (2020-05-14T16:15:58Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)