Robust Natural Language Processing: Recent Advances, Challenges, and
Future Directions
- URL: http://arxiv.org/abs/2201.00768v1
- Date: Mon, 3 Jan 2022 17:17:11 GMT
- Title: Robust Natural Language Processing: Recent Advances, Challenges, and
Future Directions
- Authors: Marwan Omar, Soohyeon Choi, DaeHun Nyang, and David Mohaisen
- Abstract summary: We present a structured overview of NLP robustness research by summarizing the literature in a systematic way across various dimensions.
We then take a deep dive into those dimensions of robustness: techniques, metrics, embeddings, and benchmarks.
- Score: 4.409836695738517
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent natural language processing (NLP) techniques have accomplished high
performance on benchmark datasets, primarily due to the significant improvement
in the performance of deep learning. The advances in the research community
have led to great enhancements in state-of-the-art production systems for NLP
tasks, such as virtual assistants, speech recognition, and sentiment analysis.
However, such NLP systems still often fail when tested with adversarial
attacks. The initial lack of robustness exposed troubling gaps in current
models' language understanding capabilities, creating problems when NLP systems
are deployed in real life. In this paper, we present a structured overview of
NLP robustness research by summarizing the literature in a systematic way across
various dimensions. We then take a deep dive into those dimensions of
robustness: techniques, metrics, embeddings, and benchmarks. Finally, we
argue that robustness should be multi-dimensional, provide insights into
current research, identify gaps in the literature, and suggest directions worth
pursuing to address them.
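To make "adversarial attacks" on NLP systems concrete, here is a minimal sketch of a greedy synonym-substitution attack, one common attack family in the robustness literature. The classifier `predict_proba` and the synonym table are hypothetical stand-ins, not artifacts of this paper; real attacks add semantic-similarity and grammaticality constraints.

```python
# Minimal sketch of a greedy synonym-substitution adversarial attack.
# SYNONYMS and predict_proba are toy stand-ins for a real synonym source
# (e.g., counter-fitted embeddings) and a real classifier.

SYNONYMS = {
    "good": ["great", "decent"],
    "movie": ["film", "picture"],
}

def predict_proba(tokens):
    """Hypothetical brittle classifier: keys on the literal word 'good'."""
    return 0.9 if "good" in tokens else 0.4

def greedy_attack(tokens, target_drop=0.5):
    """Swap one word at a time, keeping the swap that hurts confidence most."""
    orig = predict_proba(tokens)
    tokens = list(tokens)
    for i, word in enumerate(tokens):
        best_word, best_score = word, predict_proba(tokens)
        for syn in SYNONYMS.get(word, []):
            tokens[i] = syn
            score = predict_proba(tokens)
            if score < best_score:
                best_word, best_score = syn, score
        tokens[i] = best_word
        if orig - best_score >= target_drop:
            break  # confidence dropped enough; stop perturbing
    return tokens

# A meaning-preserving swap ("good" -> "great") flips the brittle model.
print(greedy_attack("a good movie".split()))  # ['a', 'great', 'movie']
```

The greedy, query-based search shown here is the core loop; stronger attacks differ mainly in how candidates are generated and filtered.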
Related papers
- Deep Learning Approaches for Improving Question Answering Systems in
Hepatocellular Carcinoma Research [0.0]
In recent years, advancements in natural language processing (NLP) have been fueled by deep learning techniques.
BERT and GPT-3, trained on vast amounts of data, have revolutionized language understanding and generation.
This paper delves into the current landscape and future prospects of large-scale model-based NLP.
arXiv Detail & Related papers (2024-02-25T09:32:17Z)
- Natural Language Processing for Dialects of a Language: A Survey [56.93337350526933]
State-of-the-art natural language processing (NLP) models are trained on massive training corpora, and report superlative performance on evaluation datasets.
This survey delves into an important attribute of these datasets: the dialect of a language.
Motivated by the performance degradation of NLP models on dialectal datasets and its implications for the equity of language technologies, we survey past research in NLP for dialects in terms of datasets and approaches.
arXiv Detail & Related papers (2024-01-11T03:04:38Z)
- Beyond Turing: A Comparative Analysis of Approaches for Detecting Machine-Generated Text [1.919654267936118]
Traditional shallow learning, Language Model (LM) fine-tuning, and Multilingual Model fine-tuning are evaluated; a minimal shallow-learning baseline is sketched after this entry.
Results reveal considerable differences in performance across methods.
This study paves the way for future research aimed at creating robust and highly discriminative models.
arXiv Detail & Related papers (2023-11-21T06:23:38Z)
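As an illustration of the first family ("traditional shallow learning"), here is a hedged sketch of a TF-IDF plus logistic-regression detector built with scikit-learn; the four training strings and their labels are toy placeholders, not data from the paper.

```python
# A minimal shallow-learning baseline for machine-generated text
# detection: TF-IDF character n-grams fed to logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "The results demonstrate a statistically significant improvement.",
    "honestly the movie kinda dragged but the ending slapped",
    "In conclusion, the proposed framework exhibits robust performance.",
    "lol no way that actually happened, wild",
]
labels = [1, 0, 1, 0]  # 1 = machine-generated, 0 = human (toy labels)

detector = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(),
)
detector.fit(texts, labels)
print(detector.predict(["The framework demonstrates significant robustness."]))
```

Character n-grams are a common choice for this baseline because they capture stylistic regularities without any pretrained model.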
- Meta Learning for Natural Language Processing: A Survey [88.58260839196019]
Deep learning has been the mainstream technique in the natural language processing (NLP) area.
However, deep learning requires large amounts of labeled data and is less generalizable across domains.
Meta-learning is an emerging field of machine learning that studies approaches to learning better learning algorithms; a toy sketch of one such approach follows this entry.
arXiv Detail & Related papers (2022-05-03T13:58:38Z)
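As a toy illustration of "learning better learning algorithms", here is a minimal sketch of one well-known meta-learning instantiation, a first-order MAML-style bi-level loop on synthetic 1-D regression tasks; the task distribution and step sizes are illustrative assumptions, not from the survey.

```python
# First-order MAML-style meta-learning on toy 1-D regression tasks:
# the outer loop learns an initialization that adapts well after one
# inner-loop gradient step on each new task.
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """A task is 'fit y = a * x' for a randomly drawn slope a."""
    a = rng.uniform(0.5, 2.0)
    x = rng.uniform(-1.0, 1.0, size=10)
    return x, a * x

def loss_and_grad(w, x, y):
    """Mean squared error of the 1-parameter model y_hat = w * x."""
    err = w * x - y
    return np.mean(err ** 2), 2.0 * np.mean(err * x)

w = 0.0                          # the meta-initialization being learned
inner_lr, outer_lr = 0.1, 0.01
for _ in range(2000):
    x, y = sample_task()
    _, g = loss_and_grad(w, x, y)
    w_adapted = w - inner_lr * g           # inner loop: one adaptation step
    _, g_adapted = loss_and_grad(w_adapted, x, y)
    w -= outer_lr * g_adapted              # outer loop: first-order meta-update
print("meta-learned initialization:", round(w, 3))  # settles near the typical slope
```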
- Faithfulness in Natural Language Generation: A Systematic Survey of Analysis, Evaluation and Optimization Methods [48.47413103662829]
Natural Language Generation (NLG) has made great progress in recent years due to the development of deep learning techniques such as pre-trained language models.
However, faithfulness, i.e., the problem that generated text often contains unfaithful or non-factual information, has become the biggest challenge.
arXiv Detail & Related papers (2022-03-10T08:28:32Z)
- Measure and Improve Robustness in NLP Models: A Survey [23.515869499536237]
Robustness has been separately explored in applications like vision and NLP, with various definitions, evaluation and mitigation strategies in multiple lines of research.
We first connect multiple definitions of robustness, then unify various lines of work on identifying robustness failures and evaluating models' robustness.
We present mitigation strategies that are data-driven, model-driven, and inductive-prior-based, with a more systematic view of how to effectively improve robustness in NLP models; a minimal data-driven example is sketched after this entry.
arXiv Detail & Related papers (2021-12-15T18:02:04Z)
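As a toy illustration of the data-driven mitigation family, here is a minimal sketch of synonym-substitution data augmentation, which exposes a model to paraphrastic noise at training time; the synonym table, swap probability, and seed are illustrative assumptions.

```python
# Data-driven robustness mitigation sketch: generate synonym-substituted
# training variants so the model learns invariance to word choice.
import random

SYNONYMS = {"good": ["great", "decent"], "film": ["movie"], "bad": ["poor", "awful"]}

def augment(sentence, n_variants=2, swap_prob=0.5, seed=0):
    """Return paraphrastic variants by randomly swapping known synonyms."""
    rng = random.Random(seed)
    tokens = sentence.split()
    variants = []
    for _ in range(n_variants):
        new = [
            rng.choice(SYNONYMS[t]) if t in SYNONYMS and rng.random() < swap_prob else t
            for t in tokens
        ]
        variants.append(" ".join(new))
    return variants

print(augment("a good film with a bad ending"))
```

Training on the original sentence plus such variants is the simplest version of the idea; adversarial training replaces random swaps with attack-found ones.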
- Learning to Selectively Learn for Weakly-supervised Paraphrase Generation [81.65399115750054]
We propose a novel approach to generate high-quality paraphrases with weak supervision data.
Specifically, we tackle the weakly-supervised paraphrase generation problem by obtaining abundant weakly-labeled parallel sentences via retrieval-based pseudo paraphrase expansion; a toy version of this retrieval step is sketched after this entry.
We demonstrate that our approach achieves significant improvements over existing unsupervised approaches, and is even comparable in performance with supervised state-of-the-art methods.
arXiv Detail & Related papers (2021-09-25T23:31:13Z)
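Here is a minimal sketch of what a retrieval-based pseudo-paraphrase expansion step could look like: a query's nearest neighbors in a sentence-embedding space are treated as weakly-labeled paraphrases. The tiny corpus and the bag-of-words embedding are toy stand-ins for the real corpus and sentence encoder.

```python
# Retrieval-based pseudo-paraphrase expansion sketch: nearest neighbors
# by cosine similarity become weakly-labeled parallel sentences.
import numpy as np

corpus = [
    "how do i reset my password",
    "what is the capital of france",
    "i forgot my password, how can i change it",
    "steps to recover a lost account password",
]

def embed(sentences, vocab):
    """Toy bag-of-words embedding; a real system would use a neural encoder."""
    mat = np.zeros((len(sentences), len(vocab)))
    for i, s in enumerate(sentences):
        for w in s.split():
            if w in vocab:
                mat[i, vocab[w]] += 1.0
    norms = np.linalg.norm(mat, axis=1, keepdims=True)
    return mat / np.maximum(norms, 1e-9)

vocab = {w: i for i, w in enumerate(sorted({w for s in corpus for w in s.split()}))}
emb = embed(corpus, vocab)

def pseudo_paraphrases(query, k=2):
    """Top-k corpus neighbors of the query, with similarity scores."""
    q = embed([query], vocab)[0]
    sims = emb @ q                  # cosine similarity (rows are unit-norm)
    top = np.argsort(-sims)[:k]
    return [(corpus[i], float(sims[i])) for i in top]

print(pseudo_paraphrases("how to reset a forgotten password"))
```

The retrieved pairs are noisy by construction, which is why the paper pairs this expansion with a mechanism for learning selectively from the weak labels.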
- Artificial Text Detection via Examining the Topology of Attention Maps [58.46367297712477]
We propose three novel types of interpretable topological features for this task based on Topological Data Analysis (TDA); one such feature is sketched after this entry.
We empirically show that the features derived from the BERT model outperform count- and neural-based baselines by up to 10% on three common datasets.
The probing analysis of the features reveals their sensitivity to surface and syntactic properties.
arXiv Detail & Related papers (2021-09-10T12:13:45Z)
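Here is a hedged sketch of the simplest kind of topological feature such a pipeline might compute: the number of connected components (the 0-dimensional Betti number) of an attention map thresholded into a graph. The random matrix stands in for a real BERT attention head, and this illustrates the idea rather than the paper's exact feature set.

```python
# Toy TDA-style feature from an attention map: count connected
# components of the graph whose edges are attention weights above a
# threshold, using union-find.
import numpy as np

def betti_0(attention, threshold):
    """Connected components of the graph with edge (i,j) iff attn >= threshold."""
    n = attention.shape[0]
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    sym = np.maximum(attention, attention.T)  # symmetrize the attention map
    for i in range(n):
        for j in range(i + 1, n):
            if sym[i, j] >= threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})

rng = np.random.default_rng(0)
attn = rng.random((8, 8))  # placeholder for one head's token-token attention
# Sweeping the threshold shows how components merge: a crude 0-dim barcode.
print([betti_0(attn, t) for t in (0.9, 0.7, 0.5, 0.3)])
```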
- The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures [0.0]
Natural Language Processing models have achieved phenomenal success in linguistic and semantic tasks.
Recent NLP architectures have utilized concepts of transfer learning, pruning, quantization, and knowledge distillation to achieve moderate model sizes; a minimal distillation objective is sketched after this entry.
Knowledge Retrievers have been built to extract explicit data documents from a large corpus of databases with greater efficiency and accuracy.
arXiv Detail & Related papers (2021-03-23T22:38:20Z)
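As an illustration of one of those compression ingredients, here is a minimal sketch of the classic knowledge-distillation objective (Hinton-style soft-target matching); the logits and hyperparameters are toy values, not from the paper.

```python
# Knowledge-distillation loss sketch: the student matches temperature-
# softened teacher probabilities (KL term, scaled by T^2) plus the usual
# hard-label cross-entropy.
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """alpha * soft KL term + (1 - alpha) * hard cross-entropy."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)))
    hard = -np.log(softmax(student_logits)[label])
    return alpha * (T ** 2) * kl + (1 - alpha) * hard

teacher = np.array([4.0, 1.0, 0.5])   # confident but informative teacher logits
student = np.array([2.0, 1.5, 0.2])   # smaller model's current logits
print(distillation_loss(student, teacher, label=0))
```

The T² factor keeps the soft and hard terms on comparable gradient scales as the temperature changes.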
- Continual Learning for Natural Language Generation in Task-oriented Dialog Systems [72.92029584113676]
Natural language generation (NLG) is an essential component of task-oriented dialog systems.
We study NLG in a "continual learning" setting to expand its knowledge to new domains or functionalities incrementally.
The major challenge towards this goal is catastrophic forgetting, meaning that a continually trained model tends to forget the knowledge it has learned before; a minimal replay-based mitigation is sketched after this entry.
arXiv Detail & Related papers (2020-10-02T10:32:29Z)
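Here is a minimal sketch of one standard mitigation for catastrophic forgetting, experience replay with reservoir sampling, where a small buffer of earlier-domain examples is mixed into each new-domain batch; buffer size, mixing ratio, and the dialog-domain strings are illustrative assumptions, and the paper's own method may differ.

```python
# Experience-replay sketch for continual learning: a reservoir-sampled
# buffer of old-domain examples is mixed into each new-domain batch so
# the model keeps rehearsing earlier knowledge.
import random

class ReplayBuffer:
    def __init__(self, capacity=100, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Reservoir sampling keeps a uniform sample over everything seen."""
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def mixed_batch(self, new_examples, replay_ratio=0.5):
        """Pad a new-domain batch with replayed old-domain examples."""
        k = min(len(self.buffer), int(len(new_examples) * replay_ratio))
        return new_examples + self.rng.sample(self.buffer, k)

buf = ReplayBuffer(capacity=4)
for i in range(10):
    buf.add(f"hotel-domain example {i}")   # earlier domain
print(buf.mixed_batch(["restaurant-domain example 0", "restaurant-domain example 1"]))
```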