Related papers: Strategies for Span Labeling with Large Language Models

URL: http://arxiv.org/abs/2601.16946v1
Date: Fri, 23 Jan 2026 18:03:10 GMT
Title: Strategies for Span Labeling with Large Language Models
Authors: Danil Semin, Ondřej Dušek, Zdeněk Kasner
Abstract summary: Large language models (LLMs) are increasingly used for text analysis tasks, such as named entity recognition or error detection. Unlike encoder-based models, generative architectures lack an explicit mechanism to refer to specific parts of their input. In this paper, we categorize these strategies into three families: tagging the input text, indexing numerical positions of spans, and matching span content. To address the limitations of content matching, we introduce LogitMatch, a new constrained decoding method that forces the model's output to align with valid input spans.
Score: 0.19116784879310025
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) are increasingly used for text analysis tasks, such as named entity recognition or error detection. Unlike encoder-based models, however, generative architectures lack an explicit mechanism to refer to specific parts of their input. This leads to a variety of ad-hoc prompting strategies for span labeling, often with inconsistent results. In this paper, we categorize these strategies into three families: tagging the input text, indexing numerical positions of spans, and matching span content. To address the limitations of content matching, we introduce LogitMatch, a new constrained decoding method that forces the model's output to align with valid input spans. We evaluate all methods across four diverse tasks. We find that while tagging remains a robust baseline, LogitMatch improves upon competitive matching-based methods by eliminating span matching issues and outperforms other strategies in some setups.
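
The three families named in the abstract can be illustrated on a toy sentence. The sketch below is not the paper's implementation and does not cover LogitMatch; the inline tag format, the function names (`to_tagged`, `from_tagged`, `find_all`) and the example sentence are all assumptions made for illustration. It shows one plausible way content matching can go wrong: when the span text occurs more than once in the input, the content alone does not identify the position.

```python
import re

def to_tagged(text, spans):
    """Tagging: wrap each span in inline tags (hypothetical <LABEL>...</LABEL> format)."""
    out, prev = [], 0
    for start, end, label in sorted(spans):
        out.append(text[prev:start])
        out.append(f"<{label}>{text[start:end]}</{label}>")
        prev = end
    out.append(text[prev:])
    return "".join(out)

def from_tagged(tagged):
    """Recover (start, end, label) character offsets from a tagged string."""
    spans, offset = [], 0
    for m in re.finditer(r"<(\w+)>(.*?)</\1>", tagged):
        start = m.start() - offset           # position in the untagged text
        end = start + len(m.group(2))
        spans.append((start, end, m.group(1)))
        offset += len(m.group(0)) - len(m.group(2))  # characters used by the tags
    return spans

def find_all(text, content):
    """Matching: return every character range where the span content occurs."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(content), text)]

text = "Paris Hilton visited Paris."
spans = [(0, 12, "PER"), (21, 26, "LOC")]  # assumed gold annotation

# Family 1: tagging the input text.
tagged = to_tagged(text, spans)
print(tagged)               # <PER>Paris Hilton</PER> visited <LOC>Paris</LOC>.
print(from_tagged(tagged))  # [(0, 12, 'PER'), (21, 26, 'LOC')]

# Family 2: indexing numerical positions; the offsets are the output itself.
print(text[21:26])          # Paris

# Family 3: matching span content; "Paris" alone maps to two candidate ranges.
print(find_all(text, "Paris"))  # [(0, 5), (21, 26)]
```

In this toy setup, tagging and indexing both round-trip to exact offsets, while the matched string "Paris" needs extra disambiguation before it can be mapped back to a single span.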

Related papers

MultiMatch: Multihead Consistency Regularization Matching for Semi-Supervised Text Classification [41.135013117834795]
We introduce MultiMatch, a novel semi-supervised learning (SSL) algorithm combining the paradigms of co-training and consistency regularization with pseudo-labeling. At its core, MultiMatch features a pseudo-label weighting module designed for selecting and filtering pseudo-labels based on head agreement and model confidence.
arXiv Detail & Related papers (2025-06-09T14:27:47Z)
Leveraging Annotator Disagreement for Text Classification [3.6625157427847963]
It is common practice in text classification to only use one majority label for model training even if a dataset has been annotated by multiple annotators. This paper proposes three strategies to leverage annotator disagreement for text classification: a probability-based multi-label method, an ensemble system, and instruction tuning.
arXiv Detail & Related papers (2024-09-26T06:46:53Z)
A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose to resolve the text into multiple concepts for multilingual semantic matching to liberate the model from the reliance on NER models. We conduct comprehensive experiments on the English datasets QQP and MRPC, and the Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
Towards Unsupervised Recognition of Token-level Semantic Differences in Related Documents [61.63208012250885]
We formulate recognizing semantic differences as a token-level regression task. We study three unsupervised approaches that rely on a masked language model. Our results show that an approach based on word alignment and sentence-level contrastive learning has a robust correlation to gold labels.
arXiv Detail & Related papers (2023-05-22T17:58:04Z)
A Structured Span Selector [100.0808682810258]
We propose a novel grammar-based structured span selection model. We evaluate our model on two popular span prediction tasks: coreference resolution and semantic role labeling.
arXiv Detail & Related papers (2022-05-08T23:58:40Z)
Decomposed Meta-Learning for Few-Shot Named Entity Recognition [32.515795881027074]
Few-shot named entity recognition (NER) systems aim at recognizing novel-class named entities based on only a few labeled examples. We present a meta-learning approach which tackles few-shot span detection and few-shot entity typing using meta-learning.
arXiv Detail & Related papers (2022-04-12T12:46:23Z)
Contextualizing Meta-Learning via Learning to Decompose [125.76658595408607]
We propose Learning to Decompose Network (LeadNet) to contextualize the meta-learned "support-to-target" strategy. LeadNet learns to automatically select the strategy associated with the right via incorporating the change of comparison across contexts with polysemous embeddings.
arXiv Detail & Related papers (2021-06-15T13:10:56Z)
Few-shot Intent Classification and Slot Filling with Retrieved Examples [30.45269507626138]
We propose a span-level retrieval method that learns similar contextualized representations for spans with the same label via a novel batch-softmax objective. Our method outperforms previous systems in various few-shot settings on the CLINC and SNIPS benchmarks.
arXiv Detail & Related papers (2021-04-12T18:50:34Z)
Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains [51.91456788949489]
We propose a novel matching method tailored for text matching in asymmetrical domains, called WD-Match. In WD-Match, a Wasserstein distance-based regularizer is defined to regularize the feature vectors projected from different domains. The training process of WD-Match amounts to a game that minimizes the matching loss regularized by the Wasserstein distance.
arXiv Detail & Related papers (2020-10-15T12:52:09Z)
Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection [69.2370349274216]
Few-shot Intent Detection is challenging due to the scarcity of available annotated utterances. Semantic components are distilled from utterances via multi-head self-attention. Our method provides a comprehensive matching measure to enhance representations of both labeled and unlabeled instances.
arXiv Detail & Related papers (2020-10-06T05:16:38Z)
MultiGBS: A multi-layer graph approach to biomedical summarization [6.11737116137921]
We propose a domain-specific method that models a document as a multi-layer graph to enable multiple features of the text to be processed at the same time. The unsupervised method selects sentences from the multi-layer graph based on the MultiRank algorithm and the number of concepts. The proposed MultiGBS algorithm employs UMLS and extracts the concepts and relationships using different tools such as SemRep, MetaMap, and OGER.
arXiv Detail & Related papers (2020-08-27T04:22:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.