ACLM: A Selective-Denoising based Generative Data Augmentation Approach
for Low-Resource Complex NER
- URL: http://arxiv.org/abs/2306.00928v1
- Date: Thu, 1 Jun 2023 17:33:04 GMT
- Title: ACLM: A Selective-Denoising based Generative Data Augmentation Approach
for Low-Resource Complex NER
- Authors: Sreyan Ghosh and Utkarsh Tyagi and Manan Suri and Sonal Kumar and S
Ramaneswaran and Dinesh Manocha
- Abstract summary: We present ACLM (Attention-map aware keyword selection for Conditional Language Model fine-tuning).
ACLM alleviates the context-entity mismatch issue, a problem existing NER data augmentation techniques suffer from.
We demonstrate the effectiveness of ACLM both qualitatively and quantitatively on monolingual, cross-lingual, and multilingual complex NER.
- Score: 47.32935969127478
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex Named Entity Recognition (NER) is the task of detecting
linguistically complex named entities in low-context text. In this paper, we
present ACLM (Attention-map aware keyword selection for Conditional Language
Model fine-tuning), a novel data augmentation approach based on conditional
generation to address the data scarcity problem in low-resource complex NER.
ACLM alleviates the context-entity mismatch issue, a problem from which
existing NER data augmentation techniques suffer: they often generate
incoherent augmentations by placing complex named entities in the wrong
context. ACLM
builds on BART and is optimized on a novel text reconstruction or denoising
task - we use selective masking (aided by attention maps) to retain the named
entities and certain keywords in the input sentence that provide contextually
relevant additional knowledge or hints about the named entities. Compared with
other data augmentation strategies, ACLM can generate more diverse and coherent
augmentations preserving the true word sense of complex entities in the
sentence. We demonstrate the effectiveness of ACLM both qualitatively and
quantitatively on monolingual, cross-lingual, and multilingual complex NER
across various low-resource settings. ACLM outperforms all our neural baselines
by a significant margin (1%-36%). In addition, we demonstrate the application
of ACLM to other domains that suffer from data scarcity (e.g., biomedical). In
practice, ACLM generates more effective and factual augmentations for these
domains than prior methods. Code: https://github.com/Sreyan88/ACLM
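The selective-masking step described in the abstract can be sketched as follows. This is a simplified illustration, not the paper's exact procedure: the function name, the toy attention scores, the `keep_ratio` threshold, and the collapsing of consecutive dropped tokens into a single `<mask>` (BART-style infilling) are all assumptions made for the sketch. Named entities are always retained, and the highest-attention non-entity tokens are kept as contextual keyword hints; everything else is masked out for the model to reconstruct.

```python
def selective_mask(tokens, attention, is_entity, keep_ratio=0.3):
    """Mask low-attention, non-entity tokens, keeping entities and keyword hints.

    tokens     : list of word strings
    attention  : per-token importance scores (e.g. averaged attention-map
                 weights); here just a list of floats
    is_entity  : booleans marking named-entity tokens (always kept)
    keep_ratio : fraction of non-entity tokens to keep as context keywords
    """
    non_entity = [i for i, e in enumerate(is_entity) if not e]
    # Keep the highest-attention non-entity tokens as "keyword" hints.
    n_keep = max(1, int(len(non_entity) * keep_ratio))
    keywords = set(
        sorted(non_entity, key=lambda i: attention[i], reverse=True)[:n_keep]
    )

    out, prev_masked = [], False
    for i, tok in enumerate(tokens):
        if is_entity[i] or i in keywords:
            out.append(tok)
            prev_masked = False
        elif not prev_masked:
            # Collapse a run of dropped tokens into a single mask token.
            out.append("<mask>")
            prev_masked = True
    return out


tokens = ["The", "striker", "joined", "Borussia", "Dortmund", "last", "summer"]
attention = [0.1, 0.9, 0.4, 0.95, 0.95, 0.2, 0.3]
is_entity = [False, False, False, True, True, False, False]
print(selective_mask(tokens, attention, is_entity))
# → ['<mask>', 'striker', '<mask>', 'Borussia', 'Dortmund', '<mask>']
```

A model such as BART, fine-tuned to reconstruct the original sentence from this masked template, can then generate diverse augmentations that keep the entity and its most informative context words intact.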
Related papers
- DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph [70.79413606968814]
We introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity.
Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data.
Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.
arXiv Detail & Related papers (2024-06-25T04:27:53Z)
- ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models [25.68491572293656]
Large Language Models fall short in structured knowledge extraction tasks such as named entity recognition.
This paper explores an innovative, cost-efficient strategy to harness LLMs with modest NER capabilities for producing superior NER datasets.
arXiv Detail & Related papers (2024-03-17T06:12:43Z)
- Large Language Models are Efficient Learners of Noise-Robust Speech Recognition [65.95847272465124]
Recent advances in large language models (LLMs) have promoted generative error correction (GER) for automatic speech recognition (ASR).
In this work, we extend the benchmark to noisy conditions and investigate if we can teach LLMs to perform denoising for GER.
Experiments on various latest LLMs demonstrate our approach achieves a new breakthrough with up to 53.9% correction improvement in terms of word error rate.
arXiv Detail & Related papers (2024-01-19T01:29:27Z)
- MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition [36.868805760086886]
This dataset aims to tackle the following practical challenges in NER: (i) effective handling of fine-grained classes that include complex entities like movie titles, and (ii) performance degradation due to noise generated from typing mistakes or OCR errors.
The dataset is compiled from open resources like Wikipedia and Wikidata, and is publicly available.
arXiv Detail & Related papers (2023-10-20T01:14:46Z)
- Named Entity Recognition via Machine Reading Comprehension: A Multi-Task Learning Approach [50.12455129619845]
Named Entity Recognition (NER) aims to extract and classify entity mentions in the text into pre-defined types.
We propose to incorporate the label dependencies among entity types into a multi-task learning framework for better MRC-based NER.
arXiv Detail & Related papers (2023-09-20T03:15:05Z)
- GPT-NER: Named Entity Recognition via Large Language Models [58.609582116612934]
GPT-NER transforms the sequence labeling task to a generation task that can be easily adapted by Language Models.
We find that GPT-NER exhibits a greater ability in the low-resource and few-shot setups, when the amount of training data is extremely scarce.
This demonstrates the capabilities of GPT-NER in real-world NER applications where the number of labeled examples is limited.
arXiv Detail & Related papers (2023-04-20T16:17:26Z)
- Dynamic Named Entity Recognition [5.9401550252715865]
We introduce a new task: Dynamic Named Entity Recognition (DNER).
DNER provides a framework to better evaluate the ability of algorithms to extract entities by exploiting the context.
We evaluate baseline models and present experiments reflecting issues and research axes related to this novel task.
arXiv Detail & Related papers (2023-02-16T15:50:02Z)
- UM6P-CS at SemEval-2022 Task 11: Enhancing Multilingual and Code-Mixed Complex Named Entity Recognition via Pseudo Labels using Multilingual Transformer [7.270980742378389]
We introduce our submitted system to the Multilingual Complex Named Entity Recognition (MultiCoNER) shared task.
We approach the complex NER for multilingual and code-mixed queries, by relying on the contextualized representation provided by the multilingual Transformer XLM-RoBERTa.
Our proposed system is ranked 6th and 8th in the multilingual and code-mixed MultiCoNER's tracks respectively.
arXiv Detail & Related papers (2022-04-28T14:07:06Z)
- "What's The Context?": Long Context NLM Adaptation for ASR Rescoring in Conversational Agents [13.586996848831543]
We investigate various techniques to incorporate turn based context history into both recurrent (LSTM) and Transformer-XL based NLMs.
For recurrent based NLMs, we explore context carry over mechanism and feature based augmentation.
We adapt our contextual NLM towards user provided on-the-fly speech patterns by leveraging encodings from a large pre-trained masked language model.
arXiv Detail & Related papers (2021-04-21T00:15:21Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 different combinations of encoder architectures and linguistic features trained on two datasets.
We find that the bias induced by the architecture and the inclusion of linguistic features are clearly expressed in the probing task performance.
arXiv Detail & Related papers (2020-04-17T09:17:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.