BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition
- URL: http://arxiv.org/abs/2412.02228v1
- Date: Tue, 03 Dec 2024 07:51:14 GMT
- Title: BANER: Boundary-Aware LLMs for Few-Shot Named Entity Recognition
- Authors: Quanjiang Guo, Yihong Dong, Ling Tian, Zhao Kang, Yu Zhang, Sijie Wang
- Abstract summary: We propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition. We introduce a boundary-aware contrastive learning strategy to enhance the LLM's ability to perceive entity boundaries for generalized entity spans. We utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent success of two-stage prototypical networks in few-shot named entity recognition (NER), challenges such as over/under-detected false spans in the span detection stage and unaligned entity prototypes in the type classification stage persist. Additionally, LLMs have not proven to be effective few-shot information extractors in general. In this paper, we propose an approach called Boundary-Aware LLMs for Few-Shot Named Entity Recognition to address these issues. We introduce a boundary-aware contrastive learning strategy to enhance the LLM's ability to perceive entity boundaries for generalized entity spans. Additionally, we utilize LoRAHub to align information from the target domain to the source domain, thereby enhancing adaptive cross-domain classification capabilities. Extensive experiments across various benchmarks demonstrate that our framework outperforms prior methods, validating its effectiveness. In particular, the proposed strategies demonstrate effectiveness across a range of LLM architectures. The code and data are released on https://github.com/UESTC-GQJ/BANER.
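To make the boundary-aware contrastive idea concrete, here is a minimal PyTorch sketch of a span-level InfoNCE loss: tokens inside the same gold entity span are treated as positives, and all other tokens (including those across a span boundary) act as negatives. The inputs `reps` and `span_ids`, the temperature, and the pairing scheme are assumptions for illustration, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def boundary_contrastive_loss(reps: torch.Tensor,
                              span_ids: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """reps: (seq_len, hidden) token states from the LLM.
    span_ids: (seq_len,) entity id per token, -1 outside any entity."""
    reps = F.normalize(reps, dim=-1)                   # cosine-similarity space
    sim = reps @ reps.T / temperature                  # (seq_len, seq_len)
    n = reps.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=reps.device)
    # Positives: distinct token pairs that belong to the same entity span.
    pos = (span_ids.unsqueeze(0) == span_ids.unsqueeze(1)) \
          & (span_ids >= 0).unsqueeze(0) & ~self_mask
    if not pos.any():
        return reps.new_zeros(())
    # InfoNCE over each row: in-span pairs vs. every other token, so
    # boundary-crossing pairs are pushed apart.
    logits = sim.masked_fill(self_mask, float("-inf"))
    log_prob = logits - torch.logsumexp(logits, dim=-1, keepdim=True)
    return -log_prob[pos].mean()
```

The LoRAHub alignment step is orthogonal to this loss: LoRAHub composes task-specific LoRA modules with learned combination weights, which the abstract uses to bring target-domain information in line with the source domain.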
Related papers
- Shh, don't say that! Domain Certification in LLMs (arXiv, 2025-02-26)
Large language models (LLMs) are often deployed to perform constrained tasks in narrow domains.
We introduce domain certification, a guarantee that accurately characterizes the out-of-domain behavior of language models.
We then propose a simple yet effective approach, called VALID, that provides adversarial bounds as a certificate.
- Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference (arXiv, 2025-02-25)
We propose a method to detect the knowledge boundary of Vision Large Language Models (VLLMs).
We show that our method successfully depicts a VLLM's knowledge boundary, based on which we can reduce indiscriminate retrieval while maintaining or improving performance.
- LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization (arXiv, 2025-02-15)
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.
LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.
Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
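As a rough illustration of the weighted-penalty idea (my own sketch, not the LLM-Lasso code): per-feature penalty factors can be folded into an ordinary Lasso by rescaling columns, since penalizing w_j * |b_j| is equivalent to running a standard Lasso on X_j / w_j.

```python
import numpy as np
from sklearn.linear_model import Lasso

def weighted_lasso(X, y, penalty_factors, alpha=0.1):
    """Lasso with per-feature penalty factors w_j (e.g. produced by an LLM)."""
    w = np.asarray(penalty_factors, dtype=float)
    model = Lasso(alpha=alpha).fit(X / w, y)   # rescale: column j divided by w_j
    return model.coef_ / w, model.intercept_   # map back to the original scale

# Hypothetical usage: a lower factor means a weaker penalty, so a feature the
# LLM deems relevant is more likely to survive selection.
X = np.random.randn(100, 3)
y = 2.0 * X[:, 0] + 0.1 * np.random.randn(100)
coef, b0 = weighted_lasso(X, y, penalty_factors=[0.5, 1.0, 2.0])
```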
- Collaborative Feature-Logits Contrastive Learning for Open-Set Semi-Supervised Object Detection (arXiv, 2024-11-20)
In open-set scenarios, the unlabeled dataset contains both in-distribution (ID) classes and out-of-distribution (OOD) classes.
Applying semi-supervised detectors in such settings can lead to misclassifying OOD classes as ID classes.
We propose a simple yet effective method, termed Collaborative Feature-Logits Detector (CFL-Detector).
- CLLMFS: A Contrastive Learning enhanced Large Language Model Framework for Few-Shot Named Entity Recognition (arXiv, 2024-08-23)
CLLMFS is a Contrastive Learning enhanced Large Language Model framework for Few-Shot Named Entity Recognition.
It integrates Low-Rank Adaptation (LoRA) and contrastive learning mechanisms specifically tailored for few-shot NER.
It achieves state-of-the-art results, with F1-score improvements ranging from 2.58% to 97.74% over the best existing methods.
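For reference, a generic sketch of the LoRA mechanism such methods build on (the rank, scaling, and initialization below are common defaults I'm assuming, not CLLMFS's exact configuration):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # keep pretrained weights frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + scale * B A x: only A and B receive gradients.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```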
- An Empirical Study of Automated Vulnerability Localization with Large Language Models (arXiv, 2024-03-30)
Large Language Models (LLMs) have shown potential in various domains, yet their effectiveness in vulnerability localization remains underexplored.
Our investigation encompasses 10+ leading LLMs suitable for code analysis, including ChatGPT and various open-source models.
We explore the efficacy of these LLMs using 4 distinct paradigms: zero-shot learning, one-shot learning, discriminative fine-tuning, and generative fine-tuning.
- RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition (arXiv, 2024-03-20)
Multimodal Large Language Models (MLLMs) excel at classifying fine-grained categories.
This paper introduces a Retrieving And Ranking augmented method for MLLMs.
Our proposed approach not only addresses the inherent limitations in fine-grained recognition but also preserves the model's comprehensive knowledge base.
- LLM Inference Unveiled: Survey and Roofline Model Insights (arXiv, 2024-02-26)
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
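The roofline model itself reduces to one comparison: attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity (FLOPs per byte moved). A toy check with assumed hardware numbers, not figures from the survey:

```python
def attainable_flops(intensity_flops_per_byte, peak_flops, peak_bandwidth):
    """Roofline: min(compute roof, bandwidth roof * arithmetic intensity)."""
    return min(peak_flops, peak_bandwidth * intensity_flops_per_byte)

peak_flops = 312e12   # assumed accelerator peak, ~312 TFLOP/s
peak_bw = 2.0e12      # assumed memory bandwidth, ~2 TB/s
# Batch-1 LLM decoding is dominated by GEMV: roughly 2 FLOPs per fp16 weight
# (2 bytes), i.e. arithmetic intensity near 1 FLOP/byte -> bandwidth-bound.
print(attainable_flops(1.0, peak_flops, peak_bw))  # 2e12, far below peak compute
```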
- One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems (arXiv, 2023-10-22)
This paper introduces a framework that utilizes pre-trained large language models (LLMs) for domain-agnostic recommendation.
Specifically, we mix a user's behaviors from multiple domains and item titles into a single sentence, then use LLMs to generate user and item representations.
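A sketch of that recipe (my paraphrase; the encoder, pooling, and prompt format are placeholders, not the authors' setup):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # placeholder encoder
enc = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = enc(**inputs).last_hidden_state     # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)             # mean-pooled representation

# Behaviors from several domains flattened into one sentence:
user = embed("Watched: Inception; Bought: running shoes; Read: Dune")
item = embed("Movie: Interstellar")
score = torch.dot(user, item)                        # dot-product relevance
```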
- Exploring Few-Shot Adaptation for Activity Recognition on Diverse Domains (arXiv, 2023-05-15)
Domain adaptation is essential for activity recognition to ensure accurate and robust performance across diverse environments.
In this work, we focus on Few-Shot Domain Adaptation for Activity Recognition (FSDA-AR), which leverages a very small amount of labeled target videos.
We construct a new FSDA-AR benchmark using five established datasets, considering adaptation to more diverse and challenging domains.
- SpanProto: A Two-stage Span-based Prototypical Network for Few-shot Named Entity Recognition (arXiv, 2022-10-17)
Few-shot Named Entity Recognition (NER) aims to identify named entities with very little annotated data.
We propose a seminal span-based prototypical network (SpanProto) that tackles few-shot NER via a two-stage approach.
In the span extraction stage, we transform the sequential tags into a global boundary matrix, enabling the model to focus on the explicit boundary information.
For mention classification, we leverage prototypical learning to capture the semantic representations for each labeled span and make the model better adapt to novel-class entities.
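A minimal sketch of the boundary-matrix construction (the matrix convention is my assumption: cell (i, j) = 1 iff tokens i..j form one gold entity span):

```python
import numpy as np

def bio_to_boundary_matrix(tags):
    """Convert BIO tags to a global span boundary matrix."""
    n = len(tags)
    M = np.zeros((n, n), dtype=np.int64)
    start = None
    for i, tag in enumerate(tags + ["O"]):       # sentinel closes a final span
        if start is not None and (tag == "O" or tag.startswith("B")):
            M[start][i - 1] = 1                  # close the open span
            start = None
        if i < n and tag.startswith("B"):
            start = i
    return M

print(bio_to_boundary_matrix(["B-PER", "I-PER", "O", "B-LOC"]))
# marks spans (0, 1) and (3, 3)
```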
- Decomposed Meta-Learning for Few-Shot Named Entity Recognition (arXiv, 2022-04-12)
Few-shot named entity recognition (NER) systems aim at recognizing novel-class named entities based on only a few labeled examples.
We present a decomposed meta-learning approach that sequentially tackles few-shot span detection and few-shot entity typing.