CLaC at SemEval-2023 Task 2: Comparing Span-Prediction and
Sequence-Labeling approaches for NER
- URL: http://arxiv.org/abs/2305.03845v1
- Date: Fri, 5 May 2023 20:49:40 GMT
- Authors: Harsh Verma, Sabine Bergler
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper summarizes the CLaC submission for the MultiCoNER 2 task which
concerns the recognition of complex, fine-grained named entities. We compare
two popular approaches for NER, namely Sequence Labeling and Span Prediction.
We find that our best Span Prediction system performs slightly better than our
best Sequence Labeling system on test data. Moreover, we find that using the
larger version of XLM RoBERTa significantly improves performance.
Post-competition experiments show that Span Prediction and Sequence Labeling
approaches improve when they use special input tokens (<s> and </s>) of
XLM-RoBERTa. The code for training all models, preprocessing, and
post-processing is available at
https://github.com/harshshredding/semeval2023-multiconer-paper.
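The comparison above hinges on how the two approaches represent entities: a sequence labeler assigns one BIO tag per token, while a span predictor scores candidate (start, end) spans directly. The following is a minimal, illustrative sketch of the two output representations; the helpers are hypothetical and not code from the linked repository.

```python
def bio_to_spans(tags):
    """Convert token-level BIO tags into (start, end, label) spans, end exclusive."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        inside = tag.startswith("I-") and label == tag[2:]
        if not inside and start is not None:  # close the currently open span
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):  # open a new span
            start, label = i, tag[2:]
    if start is not None:  # span running to the end of the sentence
        spans.append((start, len(tags), label))
    return spans


def enumerate_spans(n_tokens, max_width):
    """All candidate (start, end) spans a span-prediction model would score."""
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i + 1, min(i + max_width, n_tokens) + 1)]


# A sequence labeler tags each token; its output is decoded into spans:
print(bio_to_spans(["B-PER", "I-PER", "O", "B-LOC"]))  # [(0, 2, 'PER'), (3, 4, 'LOC')]
# A span predictor instead classifies every candidate span up to max_width:
print(len(enumerate_spans(4, 2)))  # 7 candidates
```

The special-token finding reported above (wrapping inputs with XLM-RoBERTa's <s> and </s>) is, in typical Hugging Face pipelines, handled by the tokenizer's `add_special_tokens` option rather than by code like this sketch.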
Related papers
- A Debiased Nearest Neighbors Framework for Multi-Label Text Classification [13.30576550077694]
We introduce a DEbiased Nearest Neighbors (DENN) framework for Multi-Label Text Classification (MLTC)
To address embedding alignment bias, we propose a debiased contrastive learning strategy, enhancing neighbor consistency on label co-occurrence.
For confidence estimation bias, we present a debiased confidence estimation strategy, improving the adaptive combination of predictions from $k$NN and inductive binary classifications.
arXiv Detail & Related papers (2024-08-06T14:00:23Z)
- OTSeq2Set: An Optimal Transport Enhanced Sequence-to-Set Model for Extreme Multi-label Text Classification [9.990725102725916]
Extreme multi-label text classification (XMTC) is the task of finding the most relevant subset labels from a large-scale label collection.
We propose an autoregressive sequence-to-set model for XMTC tasks named OTSeq2Set.
Our model generates predictions in a student-forcing scheme and is trained with a loss function based on bipartite matching.
arXiv Detail & Related papers (2022-10-26T07:25:18Z)
- Progressive End-to-End Object Detection in Crowded Scenes [96.92416613336096]
Previous query-based detectors suffer from two drawbacks: first, multiple predictions will be inferred for a single object, typically in crowded scenes; second, the performance saturates as the depth of the decoding stage increases.
We propose a progressive predicting method to address the above issues. Specifically, we first select accepted queries to generate true positive predictions, then refine the rest noisy queries according to the previously accepted predictions.
Experiments show that our method can significantly boost the performance of query-based detectors in crowded scenes.
arXiv Detail & Related papers (2022-03-15T06:12:00Z)
- Gated recurrent units and temporal convolutional network for multilabel classification [122.84638446560663]
This work proposes a new ensemble method for multilabel classification.
The core of the proposed approach combines a set of gated recurrent units and temporal convolutional neural networks trained with variants of the Adam optimization algorithm.
arXiv Detail & Related papers (2021-10-09T00:00:16Z)
- Post-OCR Document Correction with large Ensembles of Character Sequence Models [0.3359875577705537]
We propose a novel method to correct documents already processed with Optical Character Recognition (OCR) systems.
The main contribution of this paper is a set of strategies to accurately process strings much longer than the ones used to train the sequence model.
We test our method on nine languages of the ICDAR 2019 competition on post-OCR text correction and achieve a new state-of-the-art performance in five of them.
arXiv Detail & Related papers (2021-09-13T19:05:02Z)
- Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning [6.1538971100140145]
We introduce a novel approach with multi-task learning to enhance label correlation feedback.
We propose two auxiliary label co-occurrence prediction tasks to enhance label correlation learning.
arXiv Detail & Related papers (2021-06-06T12:26:14Z)
- SpanNer: Named Entity Re-/Recognition as Span Prediction [62.66148736099347]
A span prediction model is used for named entity recognition.
We experimentally implement 154 systems on 11 datasets, covering three languages.
Our model has been deployed into the ExplainaBoard platform.
arXiv Detail & Related papers (2021-06-01T17:11:42Z)
- Accelerating BERT Inference for Sequence Labeling via Early-Exit [65.7292767360083]
We extend the recent successful early-exit mechanism to accelerate the inference of PTMs for sequence labeling tasks.
We also propose a token-level early-exit mechanism that allows partial tokens to exit early at different layers.
Our approach can save up to 66%-75% inference cost with minimal performance degradation.
arXiv Detail & Related papers (2021-05-28T14:39:26Z)
- Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)
- NLRG at SemEval-2021 Task 5: Toxic Spans Detection Leveraging BERT-based Token Classification and Span Prediction Techniques [0.6850683267295249]
In this paper, we explore simple versions of Token Classification or Span Prediction approaches.
We use BERT-based models -- BERT, RoBERTa, and SpanBERT for both approaches.
To this end, we investigate results on four hybrid approaches -- Multi-Span, Span+Token, LSTM-CRF, and a combination of predicted offsets using union/intersection.
arXiv Detail & Related papers (2021-02-24T12:30:09Z)
- BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision [49.42215511723874]
We propose a new computational framework -- BOND -- to improve the prediction performance of NER models.
Specifically, we propose a two-stage training algorithm: In the first stage, we adapt the pre-trained language model to the NER tasks using the distant labels.
In the second stage, we drop the distant labels, and propose a self-training approach to further improve the model performance.
arXiv Detail & Related papers (2020-06-28T04:55:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.