Informed Named Entity Recognition Decoding for Generative Language
Models
- URL: http://arxiv.org/abs/2308.07791v1
- Date: Tue, 15 Aug 2023 14:16:29 GMT
- Title: Informed Named Entity Recognition Decoding for Generative Language
Models
- Authors: Tobias Deußer, Lars Hillebrand, Christian Bauckhage, Rafet Sifa
- Abstract summary: We propose Informed Named Entity Recognition Decoding (iNERD), which treats named entity recognition as a generative process.
We coarse-tune our model on a merged named entity corpus to strengthen its performance, evaluate five generative language models on eight named entity recognition datasets, and achieve remarkable results.
- Score: 3.5323691899538128
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ever-larger language models with ever-increasing capabilities are by now
well-established text processing tools. Alas, information extraction tasks such
as named entity recognition are still largely unaffected by this progress as
they are primarily based on the previous generation of encoder-only transformer
models. Here, we propose a simple yet effective approach, Informed Named Entity
Recognition Decoding (iNERD), which treats named entity recognition as a
generative process. It leverages the language understanding capabilities of
recent generative models in a future-proof manner and employs an informed
decoding scheme incorporating the restricted nature of information extraction
into open-ended text generation, improving performance and eliminating any risk
of hallucinations. We coarse-tune our model on a merged named entity corpus to
strengthen its performance, evaluate five generative language models on eight
named entity recognition datasets, and achieve remarkable results, especially
in an environment with an unknown entity class set, demonstrating the
adaptability of the approach.
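The informed decoding scheme itself is not detailed in this summary. As a rough illustration of the general idea, the hypothetical sketch below uses Hugging Face's `prefix_allowed_tokens_fn` hook to restrict a sequence-to-sequence model to tokens that occur in the input sentence plus a small set of entity-tag markers. The tag tokens and output format are assumptions for illustration, not the paper's exact scheme, and the model would need fine-tuning before its output became meaningful.

```python
# Hypothetical sketch of informed (constrained) NER decoding: at every
# step the decoder may only emit tokens that appear in the input
# sentence, a fixed set of tag markers, or EOS, so it can copy and
# label spans but never hallucinate new strings.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Tag markers are illustrative assumptions, not the paper's format.
tags = ["<person>", "</person>", "<location>", "</location>"]
tokenizer.add_tokens(tags)
model.resize_token_embeddings(len(tokenizer))

sentence = "Tim Cook visited Berlin in May."
inputs = tokenizer(sentence, return_tensors="pt")

# Closed candidate set: input tokens, tag tokens, and end-of-sequence.
allowed = set(inputs.input_ids[0].tolist())
allowed |= set(tokenizer.convert_tokens_to_ids(tags))
allowed.add(tokenizer.eos_token_id)
allowed = sorted(allowed)

def restrict(batch_id, generated_so_far):
    # A finer-grained scheme could condition on what was generated so
    # far; this sketch always offers the same closed set.
    return allowed

output_ids = model.generate(**inputs,
                            prefix_allowed_tokens_fn=restrict,
                            max_new_tokens=40)
print(tokenizer.decode(output_ids[0]))
```

Because every candidate token must come from the input or the tag set, the decoder cannot emit strings absent from the source sentence, which is the sense in which hallucination is ruled out.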
Related papers
- Retrieval-Enhanced Named Entity Recognition [1.2187048691454239]
RENER is a named entity recognition technique for autoregressive language models that combines In-Context Learning with information retrieval.
Experimental results show that the proposed technique achieves state-of-the-art performance on the CrossNER collection; a minimal sketch of the general recipe follows this entry.
arXiv Detail & Related papers (2024-10-17T01:12:48Z)
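As a minimal sketch of that general recipe (retrieval plus In-Context Learning), not RENER's actual implementation, the hypothetical snippet below retrieves the annotated training sentences most similar to the input and prepends them to the prompt as demonstrations; the toy string-similarity retriever stands in for a real dense retriever.

```python
# Minimal sketch of retrieval-augmented in-context learning for NER:
# retrieve similar annotated sentences and use them as demonstrations.
from difflib import SequenceMatcher  # stand-in for a real dense retriever

train_store = [
    ("Angela Merkel spoke in Paris.", "Angela Merkel -> PER; Paris -> LOC"),
    ("Apple acquired a startup in 2021.", "Apple -> ORG"),
]

def retrieve(query, k=1):
    # Rank stored examples by crude string similarity; a real system
    # would use embedding search (e.g. FAISS over sentence vectors).
    scored = sorted(train_store,
                    key=lambda ex: SequenceMatcher(None, query, ex[0]).ratio(),
                    reverse=True)
    return scored[:k]

def build_prompt(query):
    demos = "\n".join(f"Sentence: {s}\nEntities: {e}" for s, e in retrieve(query))
    return f"{demos}\nSentence: {query}\nEntities:"

# The resulting prompt is sent to an autoregressive LM for completion.
print(build_prompt("Barack Obama visited Berlin."))
```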
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
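The loop described above is simple enough to sketch. In the hypothetical snippet below, `search` and `llm` are placeholder callables for a retriever and a generative model, and the prompt format is an assumption; the structure (retrieve with the previous answer appended, then regenerate) follows the iterative synergy the summary describes.

```python
# Schematic of an iterative retrieval-generation loop: each iteration
# retrieves with the previous model output appended to the query, then
# regenerates with the newly retrieved passages in context.

def iter_retgen(question, search, llm, iterations=3):
    answer = ""
    for _ in range(iterations):
        # Use the previous answer to inform what to retrieve next.
        passages = search(question + " " + answer)
        context = "\n".join(passages)
        answer = llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    return answer
```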
- GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator [114.8954615026781]
We propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator.
GanLM is trained with two pre-training objectives: replaced token detection and replaced token denoising.
Experiments on language generation benchmarks show that GanLM, with its strong language understanding capability, outperforms various strong pre-trained language models.
arXiv Detail & Related papers (2022-12-20T12:51:11Z)
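The two objectives can be sketched as losses over the encoder states of a corrupted input: a binary head detects which tokens the auxiliary generator replaced, and a vocabulary-sized head denoises them back to the originals. Module names, shapes, and the unweighted sum below are illustrative assumptions, not GanLM's actual code.

```python
# Illustrative sketch of replaced token detection + replaced token
# denoising over encoder states of a generator-corrupted input.
import torch
import torch.nn.functional as F

def gan_lm_style_losses(hidden, detect_head, denoise_head,
                        replaced_mask, original_ids):
    # hidden:        (batch, seq, dim) encoder states of corrupted input
    # detect_head:   module mapping dim -> 1 (was this token replaced?)
    # denoise_head:  module mapping dim -> vocab (recover the original)
    # replaced_mask: (batch, seq) 1 where the generator swapped the token
    # original_ids:  (batch, seq) uncorrupted token ids
    detect_logits = detect_head(hidden).squeeze(-1)        # (batch, seq)
    loss_detect = F.binary_cross_entropy_with_logits(
        detect_logits, replaced_mask.float())
    denoise_logits = denoise_head(hidden)                  # (batch, seq, vocab)
    loss_denoise = F.cross_entropy(
        denoise_logits.transpose(1, 2), original_ids)      # predict originals
    return loss_detect + loss_denoise
```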
- Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
- End-to-End Entity Detection with Proposer and Regressor [6.25916397918329]
Nested entity recognition receives extensive attention due to the widespread existence of nesting scenarios.
This paper presents an end-to-end entity detection approach with a proposer and a regressor to tackle these issues.
Our model achieves advanced performance in flat and nested NER, achieving a new state-of-the-art F1 score of 80.74 on the GENIA dataset and 72.38 on the WeiboNER dataset.
arXiv Detail & Related papers (2022-10-19T02:42:46Z)
- Efficient and Interpretable Neural Models for Entity Tracking [3.1985066117432934]
This thesis focuses on two key problems in relation to facilitating the use of entity tracking models.
We argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations.
We also argue for integrating entity tracking into language models, as this would allow wider application given the current ubiquitous use of pretrained language models in NLP.
arXiv Detail & Related papers (2022-08-30T13:25:27Z)
- Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training [66.80558875393565]
We study the problem of training named entity recognition (NER) models using only distantly-labeled data.
We propose a noise-robust learning scheme comprised of a new loss function and a noisy label removal step.
Our method achieves superior performance, outperforming existing distantly-supervised NER models by significant margins.
arXiv Detail & Related papers (2021-09-10T17:19:56Z)
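The summary above does not specify the new loss function. One standard noise-robust choice in this literature, shown below purely as an illustration (it may differ from the paper's actual loss), is the generalized cross-entropy loss, which interpolates between ordinary cross-entropy as q approaches 0 and the noise-tolerant mean-absolute-error loss at q = 1.

```python
# Generic noise-robust classification loss (generalized cross-entropy):
# L_q = (1 - p_true^q) / q, which down-weights low-confidence labels
# that are likely to be distant-supervision noise.
import torch

def gce_loss(logits, labels, q=0.7):
    # logits: (batch, num_tags), labels: (batch,)
    probs = torch.softmax(logits, dim=-1)
    p_true = probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)
    return ((1.0 - p_true.pow(q)) / q).mean()
```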
- Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
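A minimal sketch of such a representation, under assumptions: each dimension of the vector is the probability that the mention holds one fine-grained type, so every dimension is directly human readable. The typing model, type inventory, and the use of independent sigmoids here are hypothetical; the summary describes the values only as posterior probabilities over types.

```python
# Sketch of an interpretable entity representation: a vector of
# per-type probabilities from a (placeholder) fine-grained typing model.
import torch

TYPES = ["person", "politician", "city", "company", "sports_team"]

def entity_representation(mention_logits):
    # mention_logits: (num_types,) scores for one entity mention.
    # Independent sigmoids, since a mention can hold several types.
    return torch.sigmoid(mention_logits)

vec = entity_representation(torch.tensor([4.0, 2.5, -3.0, -4.0, -5.0]))
for t, p in zip(TYPES, vec.tolist()):
    print(f"{t}: {p:.2f}")  # e.g. person: 0.98, politician: 0.92, ...
```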
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
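An entity masking scheme is easy to illustrate: rather than masking random subwords, whole entity mentions are masked as units so the model must recover them from context. The sketch below is a generic version under that assumption and need not match the paper's exact scheme; span offsets would come from an upstream entity linker or knowledge-graph alignment.

```python
# Generic entity-level masking for masked-language-model pre-training:
# mask whole entity mentions instead of independent random subwords.
import random

def mask_entities(tokens, entity_spans, mask_token="[MASK]", p=0.5):
    masked = list(tokens)
    for start, end in entity_spans:          # half-open [start, end)
        if random.random() < p:
            for i in range(start, end):
                masked[i] = mask_token
    return masked

tokens = ["Tim", "Cook", "runs", "Apple", "."]
spans = [(0, 2), (3, 4)]                     # "Tim Cook", "Apple"
print(mask_entities(tokens, spans, p=1.0))
# ['[MASK]', '[MASK]', 'runs', '[MASK]', '.']
```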