ACLM: A Selective-Denoising based Generative Data Augmentation Approach
for Low-Resource Complex NER
- URL: http://arxiv.org/abs/2306.00928v1
- Date: Thu, 1 Jun 2023 17:33:04 GMT
- Title: ACLM: A Selective-Denoising based Generative Data Augmentation Approach
for Low-Resource Complex NER
- Authors: Sreyan Ghosh and Utkarsh Tyagi and Manan Suri and Sonal Kumar and S
Ramaneswaran and Dinesh Manocha
- Abstract summary: We present ACLM (Attention-map aware keyword selection for Conditional Language Model fine-tuning).
ACLM alleviates the context-entity mismatch issue, a problem existing NER data augmentation techniques suffer from.
We demonstrate the effectiveness of ACLM both qualitatively and quantitatively on monolingual, cross-lingual, and multilingual complex NER.
- Score: 47.32935969127478
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex Named Entity Recognition (NER) is the task of detecting
linguistically complex named entities in low-context text. In this paper, we
present ACLM (Attention-map aware keyword selection for Conditional Language
Model fine-tuning), a novel data augmentation approach based on conditional
generation to address the data scarcity problem in low-resource complex NER.
ACLM alleviates the context-entity mismatch issue, a problem from which
existing NER data augmentation techniques suffer: they often generate
incoherent augmentations by placing complex named entities in the wrong
context. ACLM
builds on BART and is optimized on a novel text reconstruction or denoising
task - we use selective masking (aided by attention maps) to retain the named
entities and certain keywords in the input sentence that provide contextually
relevant additional knowledge or hints about the named entities. Compared with
other data augmentation strategies, ACLM can generate more diverse and coherent
augmentations preserving the true word sense of complex entities in the
sentence. We demonstrate the effectiveness of ACLM both qualitatively and
quantitatively on monolingual, cross-lingual, and multilingual complex NER
across various low-resource settings. ACLM outperforms all our neural baselines
by a significant margin (1%-36%). In addition, we demonstrate the application
of ACLM to other domains that suffer from data scarcity (e.g., biomedical). In
practice, ACLM generates more effective and factual augmentations for these
domains than prior methods. Code: https://github.com/Sreyan88/ACLM
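The selective-masking step described in the abstract can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: the function name, the toy attention scores, the `keep_ratio` threshold, and the collapsing of consecutive dropped tokens into a single `<mask>` (BART-style infilling) are all assumptions made for the sketch. Named entities are always retained, and the highest-attention non-entity tokens are kept as contextual keyword hints; everything else is masked out for the model to reconstruct.

```python
def selective_mask(tokens, attention, is_entity, keep_ratio=0.3):
    """Mask low-attention, non-entity tokens, keeping entities and keyword hints.

    tokens     : list of word strings
    attention  : per-token importance scores (e.g. averaged attention-map
                 weights); here just a list of floats
    is_entity  : booleans marking named-entity tokens (always kept)
    keep_ratio : fraction of non-entity tokens to keep as context keywords
    """
    non_entity = [i for i, e in enumerate(is_entity) if not e]
    # Keep the highest-attention non-entity tokens as "keyword" hints.
    n_keep = max(1, int(len(non_entity) * keep_ratio))
    keywords = set(
        sorted(non_entity, key=lambda i: attention[i], reverse=True)[:n_keep]
    )

    out, prev_masked = [], False
    for i, tok in enumerate(tokens):
        if is_entity[i] or i in keywords:
            out.append(tok)
            prev_masked = False
        elif not prev_masked:
            # Collapse a run of dropped tokens into a single mask token.
            out.append("<mask>")
            prev_masked = True
    return out


tokens = ["The", "striker", "joined", "Borussia", "Dortmund", "last", "summer"]
attention = [0.1, 0.9, 0.4, 0.95, 0.95, 0.2, 0.3]
is_entity = [False, False, False, True, True, False, False]
print(selective_mask(tokens, attention, is_entity))
# → ['<mask>', 'striker', '<mask>', 'Borussia', 'Dortmund', '<mask>']
```

A model such as BART, fine-tuned to reconstruct the original sentence from this masked template, can then generate diverse augmentations that keep the entity and its most informative context words intact.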
Related papers
- DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph [70.79413606968814]
We introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity.
Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data.
Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.
arXiv Detail & Related papers (2024-06-25T04:27:53Z)
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
- Large Language Models are Efficient Learners of Noise-Robust Speech Recognition [65.95847272465124]
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR).
In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER.
Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
- MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition [36.868805760086886]
This dataset aims to tackle the following practical challenges in NER: (i) effective handling of fine-grained classes that include complex entities like movie titles, and (ii) performance degradation due to noise generated from typing mistakes or OCR errors.
The dataset is compiled from open resources like Wikipedia and Wikidata, and is publicly available.
arXiv Detail & Related papers (2023-10-20T01:14:46Z)
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models.
We find that GPT-NER exhibits a greater ability in the low-resource and few-shot setups, when the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z)
- Dynamic Named Entity Recognition [5.9401550252715865]
We introduce a new task: Dynamic Named Entity Recognition (DNER).
DNER provides a framework to better evaluate the ability of algorithms to extract entities by exploiting the context.
We evaluate baseline models and present experiments reflecting issues and research axes related to this novel task.
arXiv Detail & Related papers (2023-02-16T15:50:02Z)
- UM6P-CS at SemEval-2022 Task 11: Enhancing Multilingual and Code-Mixed Complex Named Entity Recognition via Pseudo Labels using Multilingual Transformer [7.270980742378389]
We introduce our submitted system to the Multilingual Complex Named Entity Recognition (MultiCoNER) shared task.
We approach the complex NER for multilingual and code-mixed queries, by relying on the contextualized representation provided by the multilingual Transformer XLM-RoBERTa.
Our proposed system is ranked 6th and 8th in the multilingual and code-mixed MultiCoNER's tracks respectively.
arXiv Detail & Related papers (2022-04-28T14:07:06Z)
- "What's The Context?": Long Context NLM Adaptation for ASR Rescoring in Conversational Agents [13.586996848831543]
We investigate various techniques to incorporate turn based context history into both recurrent (LSTM) and Transformer-XL based NLMs.
For recurrent based NLMs, we explore context carry over mechanism and feature based augmentation.
We adapt our contextual NLM towards user provided on-the-fly speech patterns by leveraging encodings from a large pre-trained masked language model.
arXiv Detail & Related papers (2021-04-21T00:15:21Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.