OntoType: Ontology-Guided and Pre-Trained Language Model Assisted Fine-Grained Entity Typing
- URL: http://arxiv.org/abs/2305.12307v3
- Date: Tue, 11 Jun 2024 16:16:19 GMT
- Title: OntoType: Ontology-Guided and Pre-Trained Language Model Assisted Fine-Grained Entity Typing
- Authors: Tanay Komarlu, Minhao Jiang, Xuan Wang, Jiawei Han
- Abstract summary: Fine-grained entity typing (FET) assigns context-sensitive, fine-grained semantic types to entities in text.
OntoType follows a type ontology from coarse to fine and ensembles multiple PLM prompting results to generate a set of type candidates.
Our experiments on the OntoNotes, FIGER, and NYT datasets demonstrate that our method outperforms state-of-the-art zero-shot fine-grained entity typing methods.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Fine-grained entity typing (FET), which assigns context-sensitive, fine-grained semantic types to entities in text, is a basic but important task for knowledge extraction from unstructured text. FET has been studied extensively in natural language processing and typically relies on human-annotated corpora for training, which is costly and difficult to scale. Recent studies explore the utilization of pre-trained language models (PLMs) as a knowledge base to generate rich and context-aware weak supervision for FET. However, PLMs still require direction and guidance to serve as a knowledge base, as they often generate a mixture of coarse- and fine-grained types, or tokens unsuitable for typing. In this study, we envision that an ontology provides a semantics-rich, hierarchical structure that helps select the best results generated by multiple PLM models and head words. Specifically, we propose a novel annotation-free, ontology-guided FET method, OntoType, which follows a type ontology from coarse to fine, ensembles multiple PLM prompting results to generate a set of type candidates, and refines its type resolution under the local context with a natural language inference model. Our experiments on the OntoNotes, FIGER, and NYT datasets, using their associated ontological structures, demonstrate that our method outperforms state-of-the-art zero-shot fine-grained entity typing methods as well as a typical LLM method, ChatGPT. Our error analysis shows that refining the existing ontology structures will further improve fine-grained entity typing.
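To make the pipeline concrete, the following is a minimal sketch of the coarse-to-fine idea described above, under stated assumptions rather than the authors' implementation: the toy ontology, the two prompt templates, and the model checkpoints (bert-base-uncased for cloze prompting, facebook/bart-large-mnli for NLI-based resolution) are all illustrative placeholders.

```python
# A minimal sketch of an OntoType-style coarse-to-fine typing loop.
# Assumptions (not from the paper): the ontology, prompt templates, and
# model checkpoints below are illustrative placeholders.
from transformers import pipeline

# Toy type ontology: parent type -> candidate child types.
ONTOLOGY = {
    "ROOT": ["person", "organization", "location"],
    "person": ["athlete", "politician", "artist"],
    "organization": ["company", "government agency", "sports team"],
    "location": ["city", "country"],
}

# Cloze-style prompting: ask a masked LM what the mention is.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
# NLI-based resolution: rank candidate types against the local context.
nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def candidate_words(sentence: str, mention: str, top_k: int = 10) -> set:
    """Ensemble two prompt templates and pool the predicted tokens."""
    templates = [
        f"{sentence} In this sentence, {mention} is a [MASK].",
        f"{sentence} {mention} refers to a [MASK].",
    ]
    words = set()
    for template in templates:
        for pred in fill_mask(template, top_k=top_k):
            words.add(pred["token_str"].strip().lower())
    return words

def type_mention(sentence: str, mention: str) -> list:
    """Walk the ontology from coarse to fine, keeping the best child."""
    words = candidate_words(sentence, mention)
    path, node = [], "ROOT"
    while node in ONTOLOGY:
        children = ONTOLOGY[node]
        # Prefer children the masked LM itself suggested; otherwise fall
        # back to all children, and let the NLI model rank them in context.
        preferred = [c for c in children if c in words] or children
        result = nli(sentence, candidate_labels=preferred,
                     hypothesis_template=f"{mention} is a {{}}.")
        node = result["labels"][0]
        path.append(node)
    return path

print(type_mention("Lionel Messi scored twice for Inter Miami.", "Lionel Messi"))
# A plausible output path: ['person', 'athlete']
```

The design point mirrored here is that the ontology constrains the search: at each level the models choose only among children of the current node, so noisy vocabulary-level guesses are resolved into a clean coarse-to-fine type path.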
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough?
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
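As a toy illustration of the prompting setup this paper studies, here is a minimal sketch of building an instruction-plus-few-shot-examples prompt for classification; the instruction wording, label set, and examples are invented for illustration, and the resulting string could be sent to any instruction-tuned LLM.

```python
# Minimal sketch of an in-context-learning prompt for text classification.
# The instruction, labels, and examples are invented placeholders.
FEW_SHOT_EXAMPLES = [
    ("The screen flickers when I open the camera app.", "software issue"),
    ("I was charged twice for my subscription.", "billing"),
]

def build_prompt(text: str, labels: list) -> str:
    """Assemble an instruction, labeled examples, and the new instance."""
    lines = [f"Classify each message into one of: {', '.join(labels)}.", ""]
    for message, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Message: {message}\nLabel: {label}\n")
    lines.append(f"Message: {text}\nLabel:")
    return "\n".join(lines)

print(build_prompt("My package never arrived.",
                   ["billing", "hardware issue", "software issue", "shipping"]))
```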
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains
We study the task of seed-guided fine-grained entity typing in science and engineering domains.
We propose SEType, which first enriches the weak supervision by finding more entities for each seen type from an unlabeled corpus.
It then matches the enriched entities to unlabeled text to get pseudo-labeled samples and trains a textual entailment model that can make inferences for both seen and unseen types.
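A minimal sketch of the pseudo-labeling step described above, with toy type dictionaries and a toy corpus standing in for SEType's enriched entities and unlabeled text; the real method first expands each seen type's entity set from a corpus before matching.

```python
# Minimal sketch of pseudo-label creation by dictionary matching. The
# type dictionaries and corpus are toy placeholders.
ENRICHED_ENTITIES = {
    "algorithm": ["quicksort", "gradient descent", "beam search"],
    "material": ["graphene", "titanium alloy"],
}

unlabeled_corpus = [
    "Gradient descent converges slowly on ill-conditioned problems.",
    "The hull is built from a titanium alloy to resist corrosion.",
]

# Match enriched entities against unlabeled text to mint training samples.
pseudo_labeled = []
for sentence in unlabeled_corpus:
    lowered = sentence.lower()
    for entity_type, entities in ENRICHED_ENTITIES.items():
        for entity in entities:
            if entity in lowered:
                pseudo_labeled.append((sentence, entity, entity_type))

for sample in pseudo_labeled:
    print(sample)
# These (sentence, mention, type) triples then train an entailment model
# that can also make inferences for unseen types.
```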
arXiv Detail & Related papers (2024-01-23T22:36:03Z)
- Ontology Enrichment for Effective Fine-grained Entity Typing
Fine-grained entity typing (FET) is the task of identifying specific entity types at a fine-grained level for entity mentions based on their contextual information.
Conventional methods for FET require extensive human annotation, which is time-consuming and costly.
We develop a coarse-to-fine typing algorithm that exploits the enriched ontology information by training an entailment model with contrasting topics and instance-based augmented training samples.
arXiv Detail & Related papers (2023-10-11T18:30:37Z)
- Physics of Language Models: Part 1, Learning Hierarchical Language Structures
Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge.
We introduce a family of synthetic context-free grammars (CFGs) with hierarchical rules, capable of generating lengthy sentences.
We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
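For intuition, here is a minimal sketch of sampling sentences from a synthetic context-free grammar; the toy grammar below is an invented example, not one of the paper's CFGs, which are deeper and harder.

```python
# Minimal sketch of sampling from a synthetic context-free grammar.
# The toy grammar is an invented example, not one of the paper's CFGs.
import random

GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"], ["a"]],
    "Adj": [["small"], ["hierarchical"]],
    "N": [["model"], ["sentence"]],
    "V": [["generates"], ["parses"]],
}

def sample(symbol: str = "S") -> list:
    """Recursively expand a symbol; anything not in GRAMMAR is terminal."""
    if symbol not in GRAMMAR:
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])
    return [word for part in expansion for word in sample(part)]

for _ in range(3):
    print(" ".join(sample()))
# e.g. "the small model parses a sentence"
```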
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
- Autoregressive Structured Prediction with Language Models
We describe an approach that models structures as sequences of actions, generated autoregressively with PLMs.
Our approach achieves new state-of-the-art results on all the structured prediction tasks we evaluated.
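A minimal sketch of the linearization idea, assuming a toy action inventory (typed open brackets and a close bracket) that is illustrative rather than the paper's exact formulation; an autoregressive PLM would be trained to emit such sequences token by token.

```python
# Minimal sketch of casting a structured output (typed spans) as a flat
# action sequence that an autoregressive LM could emit token by token.
# The bracket-style action inventory is an illustrative assumption.
tokens = ["Barack", "Obama", "visited", "Berlin"]
spans = [(0, 2, "person"), (3, 4, "location")]  # (start, end, type)

def linearize(tokens, spans):
    """Interleave tokens with open/close bracket actions for each span."""
    actions = []
    for i, token in enumerate(tokens):
        for start, _, span_type in spans:
            if i == start:
                actions.append(f"[{span_type}")
        actions.append(token)
        for _, end, _ in spans:
            if i == end - 1:
                actions.append("]")
    return actions

print(" ".join(linearize(tokens, spans)))
# -> "[person Barack Obama ] visited [location Berlin ]"
```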
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Generative Entity Typing with Curriculum Learning
We propose a novel generative entity typing (GET) paradigm.
Given a text with an entity mention, a pre-trained language model generates the multiple types for the role the entity plays in that text.
Our experiments demonstrate the superiority of our GET model over state-of-the-art entity typing models.
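As a rough illustration of the generative paradigm, the sketch below prompts an off-the-shelf seq2seq model to produce type words for a mention in context; the checkpoint (google/flan-t5-base) and prompt wording are assumptions, and the paper's curriculum-learning training is omitted entirely.

```python
# Rough sketch of generative entity typing: prompt a seq2seq LM to emit
# type words for a mention in context. The checkpoint and prompt wording
# are illustrative assumptions; the paper's curriculum learning over
# coarse-to-fine training data is omitted here.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-base")

sentence = "Marie Curie won the Nobel Prize in Physics in 1903."
mention = "Marie Curie"
prompt = (
    f"{sentence}\n"
    f"What types of entity is {mention} in this sentence? "
    f"List a few, from general to specific."
)

print(generator(prompt, max_new_tokens=32)[0]["generated_text"])
# e.g. "person, scientist, physicist" (output depends on the checkpoint)
```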
arXiv Detail & Related papers (2022-10-06T13:32:50Z)
- Ultra-fine Entity Typing with Indirect Supervision from Natural Language Inference
This work presents LITE, a new approach that formulates entity typing as a natural language inference (NLI) problem.
Experiments show that, with limited training data, LITE obtains state-of-the-art performance on the UFET task.
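A minimal sketch of the typing-as-NLI formulation: the mention's context serves as the premise and "<mention> is a <type>" as the hypothesis. The checkpoint (roberta-large-mnli) and the tiny type inventory are illustrative assumptions, not LITE's trained model or the UFET task's large type vocabulary.

```python
# Minimal sketch of entity typing cast as natural language inference.
# The checkpoint and candidate type inventory are illustrative assumptions.
from transformers import pipeline

nli = pipeline("zero-shot-classification", model="roberta-large-mnli")

premise = "Apple unveiled its new headset at the developer conference."
mention = "Apple"
candidate_types = ["company", "fruit", "person", "technology company"]

# Each hypothesis reads "<mention> is a <type>."; the entailment score
# becomes the typing score for that candidate.
result = nli(premise, candidate_labels=candidate_types,
             hypothesis_template=f"{mention} is a {{}}.")
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.3f}")
```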
arXiv Detail & Related papers (2022-02-12T23:56:26Z)
- Interpretable Entity Representations through Large-Scale Typing
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
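A minimal sketch of such a representation, assuming an off-the-shelf NLI model as a stand-in for the paper's trained typing model and a toy type inventory: each coordinate of the vector is the posterior probability of one fine-grained type, so the embedding is directly human-readable.

```python
# Minimal sketch of an interpretable entity embedding: a vector whose
# coordinates are posterior probabilities over a fixed type inventory.
# An off-the-shelf NLI model stands in for the paper's trained typing
# model, and the tiny TYPE_SET is an illustrative placeholder.
from transformers import pipeline

TYPE_SET = ["person", "athlete", "musician", "organization", "city"]
scorer = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

def entity_vector(context: str, mention: str) -> list:
    """Return one independent probability per type, in fixed order."""
    out = scorer(context, candidate_labels=TYPE_SET, multi_label=True,
                 hypothesis_template=f"{mention} is a {{}}.")
    score_by_label = dict(zip(out["labels"], out["scores"]))
    return [score_by_label[t] for t in TYPE_SET]

vec = entity_vector("Serena Williams won her 23rd Grand Slam title.",
                    "Serena Williams")
print({t: round(v, 3) for t, v in zip(TYPE_SET, vec)})
```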
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.