ACLM: A Selective-Denoising based Generative Data Augmentation Approach
for Low-Resource Complex NER
- URL: http://arxiv.org/abs/2306.00928v1
- Date: Thu, 1 Jun 2023 17:33:04 GMT
- Title: ACLM: A Selective-Denoising based Generative Data Augmentation Approach
for Low-Resource Complex NER
- Authors: Sreyan Ghosh and Utkarsh Tyagi and Manan Suri and Sonal Kumar and S
Ramaneswaran and Dinesh Manocha
- Abstract summary: We present ACLM (Attention-map aware keyword selection for Conditional Language Model fine-tuning), a conditional-generation data augmentation approach for low-resource complex NER.
ACLM alleviates the context-entity mismatch issue, a problem existing NER data augmentation techniques suffer from.
We demonstrate the effectiveness of ACLM both qualitatively and quantitatively on monolingual, cross-lingual, and multilingual complex NER.
- Score: 47.32935969127478
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex Named Entity Recognition (NER) is the task of detecting
linguistically complex named entities in low-context text. In this paper, we
present ACLM (Attention-map aware keyword selection for Conditional Language
Model fine-tuning), a novel data augmentation approach based on conditional
generation, to address the data scarcity problem in low-resource complex NER.
ACLM alleviates the context-entity mismatch issue, a problem existing NER data
augmentation techniques suffer from, as they often generate incoherent
augmentations by placing complex named entities in the wrong context. ACLM
builds on BART and is optimized on a novel text reconstruction or denoising
task - we use selective masking (aided by attention maps) to retain the named
entities and certain keywords in the input sentence that provide contextually
relevant additional knowledge or hints about the named entities. Compared with
other data augmentation strategies, ACLM can generate more diverse and coherent
augmentations while preserving the true word sense of complex entities in the
sentence. We demonstrate the effectiveness of ACLM both qualitatively and
quantitatively on monolingual, cross-lingual, and multilingual complex NER
across various low-resource settings. ACLM outperforms all our neural baselines
by a significant margin (1%-36%). In addition, we demonstrate the application
of ACLM to other domains that suffer from data scarcity (e.g., biomedical). In
practice, ACLM generates more effective and factual augmentations for these
domains than prior methods. Code: https://github.com/Sreyan88/ACLM
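The denoising objective at the heart of the method can be pictured with a short sketch: entities and the most-attended context words survive masking, while everything else is replaced with a mask token. The function name, the attention-ranking heuristic, and the keep ratio below are our own illustration of the abstract's description, not the authors' implementation (see the linked repository for that).

```python
# Minimal sketch of ACLM-style selective masking. The keyword-selection
# heuristic and keep_ratio are illustrative, not the paper's exact recipe.

def selective_mask(tokens, bio_tags, attn_scores, keep_ratio=0.3, mask_token="<mask>"):
    """Keep named entities plus the most-attended context words; mask the
    rest, so a BART-style model can be fine-tuned to reconstruct the
    original sentence from the corrupted one."""
    is_entity = [tag != "O" for tag in bio_tags]
    # Rank non-entity positions by attention and keep the top fraction
    # as contextual keyword "hints" about the entities.
    candidates = sorted(
        (i for i in range(len(tokens)) if not is_entity[i]),
        key=lambda i: attn_scores[i],
        reverse=True,
    )
    keywords = set(candidates[: int(keep_ratio * len(candidates))])
    return [
        tok if is_entity[i] or i in keywords else mask_token
        for i, tok in enumerate(tokens)
    ]

tokens = ["The", "nanodrug", "paclitaxel", "was", "tested", "in", "clinical", "trials"]
tags = ["O", "O", "B-Drug", "O", "O", "O", "O", "O"]
attn = [0.02, 0.30, 0.90, 0.05, 0.20, 0.03, 0.25, 0.15]
print(selective_mask(tokens, tags, attn))
# ['<mask>', 'nanodrug', 'paclitaxel', '<mask>', '<mask>', '<mask>', 'clinical', '<mask>']
```

Fine-tuning BART on such (corrupted, original) pairs and then sampling reconstructions for newly masked inputs would yield augmentations whose entities stay in a compatible context.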
Related papers
- Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation [73.9145653659403]
We show that Generative Error Correction models struggle to generalize beyond the specific types of errors encountered during training.
We propose DARAG, a novel approach designed to improve GEC for ASR in both in-domain (ID) and out-of-domain (OOD) scenarios.
Our approach is simple, scalable, and both domain- and language-agnostic.
arXiv Detail & Related papers (2024-10-17T04:00:29Z)
- Contextualization of ASR with LLM using phonetic retrieval-based augmentation [8.823596907304944]
We propose a retrieval-based solution to contextualize large language models (LLMs) for ASR.
We first let the LLM detect named entities in speech without any context, then use these entities as queries to retrieve phonetically similar named entities from a personal database.
In a voice assistant task, our solution achieved up to 30.2% relative word error rate reduction and 73.6% relative named entity error rate reduction.
arXiv Detail & Related papers (2024-09-11T18:32:38Z)
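A rough sketch of the retrieval step the entry above describes: encode the (possibly misrecognized) entity phonetically and look up matches. The Soundex code and the in-memory contact list are stand-ins we chose; the paper's actual phonetic representation and personal database may differ.

```python
# Sketch of phonetic retrieval for ASR contextualization. Soundex and the
# in-memory contact list are illustrative stand-ins for whatever phonetic
# representation and personal database the actual system uses.

def soundex(name):
    """Classic Soundex: first letter plus up to three consonant-class digits."""
    codes = {c: d for d, letters in enumerate(
        ["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1) for c in letters}
    name = "".join(c for c in name.upper() if c.isalpha())
    if not name:
        return "0000"
    out, prev = name[0], codes.get(name[0])
    for c in name[1:]:
        d = codes.get(c)
        if d and d != prev:
            out += str(d)
        if c not in "HW":          # H and W do not separate consonant runs
            prev = d
    return (out + "000")[:4]

def retrieve_similar(detected_entity, database):
    """Return database entries phonetically matching the entity the LLM
    detected in the (possibly misrecognized) ASR transcript."""
    key = soundex(detected_entity)
    return [e for e in database if soundex(e) == key]

contacts = ["Kristen Meyer", "Christian Mayer", "John Smith"]
print(retrieve_similar("Cristian Maier", contacts))   # ['Christian Mayer']
```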
- DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph [70.79413606968814]
We introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity.
Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data.
Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.
arXiv Detail & Related papers (2024-06-25T04:27:53Z)
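A toy illustration of the graph-perturbation idea in the entry above: represent a reasoning problem as a small DAG, then add a node to raise its complexity while recomputing the gold answer. The dict encoding and the specific perturbation rule are ours, not the paper's machinery.

```python
# Toy illustration of DARG-style graph perturbation: a reasoning problem
# as a small DAG whose complexity can be raised by adding nodes.
import operator

OPS = {"+": operator.add, "*": operator.mul}

def evaluate(graph, node):
    """Evaluate a node: leaves are numbers, internal nodes are (op, l, r)."""
    spec = graph[node]
    if isinstance(spec, (int, float)):
        return spec
    op, left, right = spec
    return OPS[op](evaluate(graph, left), evaluate(graph, right))

# Original benchmark item: "What is (3 + 4) * 2?"
graph = {"a": 3, "b": 4, "c": 2, "sum": ("+", "a", "b"), "ans": ("*", "sum", "c")}
print(evaluate(graph, "ans"))      # 14

# Perturbation: one extra reasoning step yields a novel, harder test item
# whose gold answer is recomputed from the graph rather than hand-written.
graph["d"] = 5
graph["harder"] = ("+", "ans", "d")
print(evaluate(graph, "harder"))   # 19
```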
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
- Large Language Models are Efficient Learners of Noise-Robust Speech Recognition [65.95847272465124]
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR).
In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER.
Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
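A sketch of how the GER setup above might frame the task: the ASR N-best list and a noise hint are folded into a single correction prompt for the LLM. The prompt wording and the `llm` stub are hypothetical, not taken from the paper.

```python
# Sketch of noise-aware generative error correction (GER): fold the ASR
# N-best list and a noise hint into one correction prompt.

def build_ger_prompt(nbest, noise_condition):
    hyps = "\n".join(f"{i + 1}. {h}" for i, h in enumerate(nbest))
    return (
        f"These ASR hypotheses were produced under {noise_condition} noise:\n"
        f"{hyps}\n"
        "Infer the most likely true transcription and output it alone:"
    )

def llm(prompt):
    raise NotImplementedError("stand-in for any instruction-tuned LLM")

nbest = [
    "turn of the living room lights",
    "turn off the living room lights",
    "turn off the living groom lights",
]
print(build_ger_prompt(nbest, "babble"))
# transcription = llm(build_ger_prompt(nbest, "babble"))
```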
- MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition [36.868805760086886]
This dataset aims to tackle the following practical challenges in NER: (i) effective handling of fine-grained classes that include complex entities like movie titles, and (ii) performance degradation due to noise generated from typing mistakes or OCR errors.
The dataset is compiled from open resources like Wikipedia and Wikidata, and is publicly available.
arXiv Detail & Related papers (2023-10-20T01:14:46Z)
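To make the kind of noise the dataset above targets concrete, here is an illustrative corruptor for typing and OCR errors; the confusion table and edit rates are ours, not the dataset's.

```python
# Illustrative generator for typing/OCR noise of the kind MultiCoNER v2
# targets; the confusion table and edit rates are our own choices.
import random

OCR_CONFUSIONS = {"l": "1", "O": "0", "rn": "m"}

def corrupt(text: str, p_typo: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    # OCR-style substitutions applied to whole patterns.
    for src, dst in OCR_CONFUSIONS.items():
        if rng.random() < 0.5:
            text = text.replace(src, dst)
    # Typing mistakes: random adjacent swaps and character drops.
    chars, out, i = list(text), [], 0
    while i < len(chars):
        r = rng.random()
        if r < p_typo and i + 1 < len(chars):   # swap adjacent characters
            out += [chars[i + 1], chars[i]]
            i += 2
        elif r < 2 * p_typo:                    # drop a character
            i += 1
        else:
            out.append(chars[i])
            i += 1
    return "".join(out)

print(corrupt("The Lord of the Rings: The Return of the King"))
```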
- GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models.
We find that GPT-NER exhibits a greater ability in low-resource and few-shot setups, where the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z)
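As a concrete picture of the reformulation above: the model rewrites the input with entities wrapped in markers rather than emitting token-level tags, and the output is parsed back into spans. The @@...## convention and prompt wording below follow our reading of the paper's setup, which queries one entity type per prompt.

```python
# Sketch of a GPT-NER-style reformulation: the model rewrites the input
# with entities wrapped in markers instead of emitting token-level tags.
import re

PROMPT = (
    "Mark every {etype} entity in the sentence by surrounding it "
    "with @@ and ##.\nSentence: {sentence}\nOutput:"
)

def parse_marked(generated: str, entity_type: str):
    """Recover (surface form, type) pairs from a marked-up generation."""
    return [(m.group(1), entity_type) for m in re.finditer(r"@@(.+?)##", generated)]

print(PROMPT.format(etype="PERSON", sentence="I saw John Smith in Paris."))
# A model completion might be: "I saw @@John Smith## in Paris."
print(parse_marked("I saw @@John Smith## in Paris.", "PERSON"))
# [('John Smith', 'PERSON')]
```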
- Dynamic Named Entity Recognition [5.9401550252715865]
We introduce a new task: Dynamic Named Entity Recognition (DNER).
DNER provides a framework to better evaluate the ability of algorithms to extract entities by exploiting the context.
We evaluate baseline models and present experiments reflecting issues and research axes related to this novel task.
arXiv Detail & Related papers (2023-02-16T15:50:02Z)
- UM6P-CS at SemEval-2022 Task 11: Enhancing Multilingual and Code-Mixed Complex Named Entity Recognition via Pseudo Labels using Multilingual Transformer [7.270980742378389]
We introduce our submitted system to the Multilingual Complex Named Entity Recognition (MultiCoNER) shared task.
We approach the complex NER for multilingual and code-mixed queries, by relying on the contextualized representation provided by the multilingual Transformer XLM-RoBERTa.
Our proposed system ranked 6th and 8th on the multilingual and code-mixed MultiCoNER tracks, respectively.
arXiv Detail & Related papers (2022-04-28T14:07:06Z)
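For reference, a minimal version of the backbone the system above implies: XLM-RoBERTa with a token-classification head, loaded via Hugging Face Transformers. The label set below is an illustrative subset, the head is randomly initialized, and the actual system adds pseudo-labeling on top.

```python
# Minimal XLM-RoBERTa token-classification setup (Hugging Face Transformers).
# The label set is an illustrative subset; the classification head below is
# randomly initialized and would still need fine-tuning on NER data.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["O", "B-PER", "I-PER", "B-CW", "I-CW"]
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(labels)
)

batch = tokenizer("empire of the sun released a new single", return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits                   # (1, seq_len, num_labels)
preds = [labels[i] for i in logits.argmax(-1)[0].tolist()]
print(list(zip(tokenizer.convert_ids_to_tokens(batch["input_ids"][0]), preds)))
```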
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
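In the same spirit as the entry above, a generic probing setup freezes the representations and trains a light classifier to predict a linguistic property. Random vectors stand in for real encoder outputs here, and the "property" is synthetic.

```python
# Generic probing-task sketch: train a light classifier on frozen sentence
# representations to test whether a linguistic property is linearly decodable.
# Random vectors stand in for real encoder outputs; the label is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 768))              # stand-in pooled representations
y = (X[:, :10].sum(axis=1) > 0).astype(int)   # stand-in linguistic property

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
# Accuracy well above chance means the property is recoverable from the
# representation; chance-level accuracy means it is not (linearly) encoded.
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```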
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.