iobes: A Library for Span-Level Processing
- URL: http://arxiv.org/abs/2010.04373v1
- Date: Fri, 9 Oct 2020 05:03:48 GMT
- Title: iobes: A Library for Span-Level Processing
- Authors: Brian Lester
- Abstract summary: iobes is used for parsing, converting, and processing spans represented as token-level decisions.
In this paper, we introduce our open-source library, iobes.
- Score: 11.112281331309939
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many tasks in natural language processing, such as named entity recognition
and slot-filling, involve identifying and labeling specific spans of text. In
order to leverage common models, these tasks are often recast as sequence
labeling tasks. Each token is given a label and these labels are prefixed with
special tokens such as B- or I-. After a model assigns labels to each token,
these prefixes are used to group the tokens into spans.
Properly parsing these annotations is critical for producing fair and
comparable metrics; however, despite its importance, there is not an
easy-to-use, standardized, programmatically integratable library to help work
with span labeling. To remedy this, we introduce our open-source library,
iobes. iobes is used for parsing, converting, and processing spans represented
as token-level decisions.
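The abstract describes the core problem the library addresses: grouping prefixed token labels (B-, I-, etc.) back into typed spans, and converting between tagging schemes. As a minimal sketch of that task, the following plain-Python functions parse BIO tags into spans and relabel them as IOBES. This illustrates the kind of processing iobes automates; it is not the library's actual API, and BIO edge-case conventions (e.g. an I- tag with no preceding B-) vary between implementations.

```python
def parse_bio_spans(tags):
    """Group BIO tags into (type, start, end) spans; end is exclusive."""
    spans = []
    start, label = None, None
    for i, tag in enumerate(tags):
        # Close the current span on O, on a new B-, or on an I- of a
        # different type.
        if tag == "O" or tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label
        ):
            if label is not None:
                spans.append((label, start, i))
                start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and label is None:
            # One common convention: tolerate I- without B- by opening a span.
            start, label = i, tag[2:]
    if label is not None:
        spans.append((label, start, len(tags)))
    return spans


def to_iobes(tags):
    """Convert BIO tags to IOBES: single-token spans become S-,
    span-final tokens become E-."""
    out = ["O"] * len(tags)
    for label, start, end in parse_bio_spans(tags):
        if end - start == 1:
            out[start] = f"S-{label}"
        else:
            out[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                out[i] = f"I-{label}"
            out[end - 1] = f"E-{label}"
    return out
```

For example, `parse_bio_spans(["B-PER", "I-PER", "O", "B-LOC"])` yields `[("PER", 0, 2), ("LOC", 3, 4)]`, and `to_iobes` on the same input yields `["B-PER", "E-PER", "O", "S-LOC"]`. As the abstract notes, getting these conventions right matters: a parser that disagrees on span boundaries produces incomparable span-level F1 scores.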
Related papers
- Exploiting Conjugate Label Information for Multi-Instance Partial-Label Learning [61.00359941983515]
Multi-instance partial-label learning (MIPL) addresses scenarios where each training sample is represented as a multi-instance bag associated with a candidate label set containing one true label and several false positives.
ELIMIPL exploits the conjugate label information to improve the disambiguation performance.
arXiv Detail & Related papers (2024-08-26T15:49:31Z)
- Don't Waste a Single Annotation: Improving Single-Label Classifiers Through Soft Labels [7.396461226948109]
We address the limitations of the common data annotation and training methods for objective single-label classification tasks.
Our findings indicate that additional annotator information, such as confidence, secondary label and disagreement, can be used to effectively generate soft labels.
arXiv Detail & Related papers (2023-11-09T10:47:39Z)
- Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification [19.592985329023733]
Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text.
We study the MLTC problem in annotation-free and scarce-annotation settings, in which the amount of available supervision scales linearly with the number of labels.
Our method follows three steps, (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph by label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph.
arXiv Detail & Related papers (2023-09-24T04:12:52Z)
- Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a framework that unifies learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z)
- Label2Label: A Language Modeling Framework for Multi-Attribute Learning [93.68058298766739]
Label2Label is the first attempt at multi-attribute prediction from the perspective of language modeling.
Inspired by the success of pre-training language models in NLP, Label2Label introduces an image-conditioned masked language model.
Our intuition is that the instance-wise attribute relations are well grasped if the neural net can infer the missing attributes based on the context and the remaining attribute hints.
arXiv Detail & Related papers (2022-07-18T15:12:33Z)
- Towards Few-shot Entity Recognition in Document Images: A Label-aware Sequence-to-Sequence Framework [28.898240725099782]
We build an entity recognition model requiring only a few shots of annotated document images.
We develop a novel label-aware seq2seq framework, LASER.
Experiments on two benchmark datasets demonstrate the superiority of LASER under the few-shot setting.
arXiv Detail & Related papers (2022-03-30T18:30:42Z)
- Label Semantics for Few Shot Named Entity Recognition [68.01364012546402]
We study the problem of few shot learning for named entity recognition.
We leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors.
Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder.
arXiv Detail & Related papers (2022-03-16T23:21:05Z)
- Structured Semantic Transfer for Multi-Label Recognition with Partial Labels [85.6967666661044]
We propose a structured semantic transfer (SST) framework that enables training multi-label recognition models with partial labels.
The framework consists of two complementary transfer modules that explore within-image and cross-image semantic correlations.
Experiments on the Microsoft COCO, Visual Genome and Pascal VOC datasets show that the proposed SST framework obtains superior performance over current state-of-the-art algorithms.
arXiv Detail & Related papers (2021-12-21T02:15:01Z)
- Label-Wise Document Pre-Training for Multi-Label Text Classification [14.439051753832032]
This paper develops Label-Wise Pre-Training (LW-PT) method to get a document representation with label-aware information.
The basic idea is that a multi-label document can be represented as a combination of multiple label-wise representations, and that correlated labels always co-occur in the same or similar documents.
arXiv Detail & Related papers (2020-08-15T10:34:27Z)
- Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network [61.94394163309688]
We propose a Label-enhanced Task-Adaptive Projection Network (L-TapNet) based on the state-of-the-art few-shot classification model -- TapNet.
Experimental results show that our model significantly outperforms the strongest few-shot learning baseline by 14.64 F1 points in the one-shot setting.
arXiv Detail & Related papers (2020-06-10T07:50:44Z)
- Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition [41.60109880213463]
We propose to integrate label component information as embeddings into models.
We demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.
arXiv Detail & Related papers (2020-06-02T03:47:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.