Related papers: GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer

URL: http://arxiv.org/abs/2311.08526v1
Date: Tue, 14 Nov 2023 20:39:12 GMT
Title: GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer
Authors: Urchade Zaratiana, Nadi Tomeh, Pierre Holat, Thierry Charnois
Abstract summary: Named Entity Recognition (NER) is essential in various Natural Language Processing (NLP) applications. In this paper, we introduce a compact NER model trained to identify any type of entity. Our model, GLiNER, facilitates parallel entity extraction, an advantage over the slow sequential token generation of Large Language Models (LLMs).
Score: 4.194768796374315
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Named Entity Recognition (NER) is essential in various Natural Language Processing (NLP) applications. Traditional NER models are effective but limited to a set of predefined entity types. In contrast, Large Language Models (LLMs) can extract arbitrary entities through natural language instructions, offering greater flexibility. However, their size and cost, particularly for those accessed via APIs like ChatGPT, make them impractical in resource-limited scenarios. In this paper, we introduce a compact NER model trained to identify any type of entity. Leveraging a bidirectional transformer encoder, our model, GLiNER, facilitates parallel entity extraction, an advantage over the slow sequential token generation of LLMs. Through comprehensive testing, GLiNER demonstrates strong performance, outperforming both ChatGPT and fine-tuned LLMs in zero-shot evaluations on various NER benchmarks.

Related papers

GLiNER2: An Efficient Multi-Task Information Extraction System with Schema-Driven Interface [0.873811641236639]
We present GLiNER2, a unified framework that enhances the original GLiNER architecture to support named entity recognition, text classification, and hierarchical structured data extraction. Our experiments demonstrate competitive performance across extraction and classification tasks with substantial improvements in deployment accessibility.
arXiv Detail & Related papers (2025-07-24T16:11:14Z)
GLiNER-biomed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition [0.06554326244334868]
We introduce GLiNER-biomed, a domain-adapted suite of Generalist and Lightweight Model for NER (GLiNER) models specifically tailored for biomedical NER. In contrast to conventional approaches, GLiNER uses natural language descriptions to infer arbitrary entity types, enabling zero-shot recognition. Evaluations on several biomedical datasets demonstrate that GLiNER-biomed outperforms state-of-the-art GLiNER models in both zero- and few-shot scenarios.
arXiv Detail & Related papers (2025-04-01T11:40:50Z)
Scalable Language Models with Posterior Inference of Latent Thought Vectors [52.63299874322121]
Latent-Thought Language Models (LTMs) incorporate explicit latent thought vectors that follow an explicit prior model in latent space. LTMs possess additional scaling dimensions beyond traditional LLMs, yielding a structured design space. LTMs significantly outperform conventional autoregressive models and discrete diffusion models in validation perplexity and zero-shot language modeling.
arXiv Detail & Related papers (2025-02-03T17:50:34Z)
ReverseNER: A Self-Generated Example-Driven Framework for Zero-Shot Named Entity Recognition with Large Language Models [0.0]
We present ReverseNER, a framework aimed at overcoming the limitations of large language models (LLMs) in zero-shot Named Entity Recognition tasks. Rather than beginning with sentences, this method uses an LLM to generate entities based on their definitions and then expands them into full sentences. This results in well-annotated sentences with clearly labeled entities, while preserving semantic and structural similarity to the task sentences.
arXiv Detail & Related papers (2024-11-01T12:08:08Z)
GEIC: Universal and Multilingual Named Entity Recognition with Large Language Models [7.714969840571947]
We introduce the task of generation-based extraction and in-context classification (GEIC). We then propose CascadeNER, a universal and multilingual GEIC framework for few-shot and zero-shot NER. We also introduce AnythingNER, the first NER dataset specifically designed for Large Language Models (LLMs).
arXiv Detail & Related papers (2024-09-17T09:32:12Z)
llmNER: (Zero|Few)-Shot Named Entity Recognition, Exploiting the Power of Large Language Models [1.1196013962698619]
This paper presents llmNER, a Python library for implementing zero-shot and few-shot NER with large language models (LLMs). llmNER can compose prompts, query the model, and parse the completion returned by the LLM. We validated our software on two NER tasks to show the library's flexibility.
arXiv Detail & Related papers (2024-06-06T22:01:59Z)
NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data [41.94295877935867]
We show how to create NuNER, a compact language representation model specialized in the Named Entity Recognition task. NuNER can be fine-tuned to solve downstream NER problems in a data-efficient way. We find that the size and entity-type diversity of the pre-training dataset are key to achieving good performance.
arXiv Detail & Related papers (2024-02-23T14:23:51Z)
In-Context Learning for Few-Shot Nested Named Entity Recognition [53.55310639969833]
We introduce an effective and innovative ICL framework for the setting of few-shot nested NER. We improve the ICL prompt by devising a novel example demonstration selection mechanism, EnDe retriever. In EnDe retriever, we employ contrastive learning to perform three types of representation learning, in terms of semantic similarity, boundary similarity, and label similarity.
arXiv Detail & Related papers (2024-02-02T06:57:53Z)
NERetrieve: Dataset for Next Generation Named Entity Recognition and Retrieval [49.827932299460514]
We argue that capabilities provided by large language models are not the end of NER research, but rather an exciting beginning. We present three variants of the NER task, together with a dataset to support them. We provide a large, silver-annotated corpus of 4 million paragraphs covering 500 entity types.
arXiv Detail & Related papers (2023-10-22T12:23:00Z)
MProto: Multi-Prototype Network with Denoised Optimal Transport for Distantly Supervised Named Entity Recognition [75.87566793111066]
We propose a noise-robust prototype network named MProto for the DS-NER task. MProto represents each entity type with multiple prototypes to characterize the intra-class variance. To mitigate the noise from incomplete labeling, we propose a novel denoised optimal transport (DOT) algorithm.
arXiv Detail & Related papers (2023-10-12T13:02:34Z)
UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition [48.977866466971655]
We show how ChatGPT can be distilled into much smaller UniversalNER models for open NER. We assemble the largest NER benchmark to date, comprising 43 datasets across 9 diverse domains. With a tiny fraction of parameters, UniversalNER not only acquires ChatGPT's capability in recognizing arbitrary entity types, but also outperforms its NER accuracy by 7-9 absolute F1 points on average.
arXiv Detail & Related papers (2023-08-07T03:39:52Z)
GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models. We find that GPT-NER exhibits a greater ability in the low-resource and few-shot setups, when the amount of training data is extremely scarce. This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z)
Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types. We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.