Out-of-Distribution Generalization in Text Classification: Past,
Present, and Future
- URL: http://arxiv.org/abs/2305.14104v1
- Date: Tue, 23 May 2023 14:26:11 GMT
- Title: Out-of-Distribution Generalization in Text Classification: Past,
Present, and Future
- Authors: Linyi Yang, Yaoxiao Song, Xuan Ren, Chenyang Lyu, Yidong Wang,
Lingqiao Liu, Jindong Wang, Jennifer Foster, Yue Zhang
- Abstract summary: Machine learning (ML) systems in natural language processing (NLP) face significant challenges in generalizing to out-of-distribution (OOD) data.
This poses important questions about the robustness of NLP models and their high accuracy, which may be artificially inflated due to their underlying sensitivity to systematic biases.
This paper presents the first comprehensive review of recent progress, methods, and evaluations on this topic.
- Score: 30.581612475530974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) systems in natural language processing (NLP) face
significant challenges in generalizing to out-of-distribution (OOD) data, where
the test distribution differs from the training data distribution. This poses
important questions about the robustness of NLP models and their high accuracy,
which may be artificially inflated due to their underlying sensitivity to
systematic biases. Despite these challenges, there is a lack of comprehensive
surveys on the generalization challenge from an OOD perspective in text
classification. Therefore, this paper aims to fill this gap by presenting the
first comprehensive review of recent progress, methods, and evaluations on this
topic. We further discuss the challenges involved and potential future research
directions. By providing quick access to existing work, we hope this survey
will encourage future research in this area.
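The abstract frames OOD generalization as a gap between accuracy on data drawn from the training distribution and accuracy under a shifted test distribution. The following minimal sketch (not from the paper; the domain-labelled data and the TF-IDF plus logistic-regression model are illustrative assumptions) shows how such a gap is commonly measured for a text classifier.

```python
# Minimal sketch of the ID/OOD accuracy gap described in the abstract.
# The data-loading step and the TF-IDF + logistic-regression model are
# illustrative assumptions, not the survey's setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline


def ood_accuracy_gap(train_texts, train_labels,
                     id_texts, id_labels,
                     ood_texts, ood_labels):
    """Train on one domain, then compare accuracy on an in-distribution
    test split (same domain) and an out-of-distribution split (shifted domain)."""
    clf = make_pipeline(TfidfVectorizer(min_df=2),
                        LogisticRegression(max_iter=1000))
    clf.fit(train_texts, train_labels)
    id_acc = accuracy_score(id_labels, clf.predict(id_texts))
    ood_acc = accuracy_score(ood_labels, clf.predict(ood_texts))
    # A large positive gap suggests reliance on domain-specific, potentially
    # spurious features rather than generalizable ones.
    return id_acc, ood_acc, id_acc - ood_acc
```

A large ID-OOD gap of this kind is the symptom of the "artificially inflated" accuracy the abstract refers to; OOD splits are typically constructed via domain, temporal, or adversarially perturbed shifts.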
Related papers
- Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations [49.908708778200115]
We are the first to specialize large language models (LLMs) for simulating survey response distributions.
As a testbed, we use country-level results from two global cultural surveys.
We devise a fine-tuning method based on first-token probabilities to minimize divergence between predicted and actual response distributions (a rough sketch of this idea appears after the list below).
arXiv Detail & Related papers (2025-02-10T21:59:27Z) - Misspellings in Natural Language Processing: A survey [52.419589623702336]
Misspellings have become ubiquitous in digital communication.
We reconstruct a history of misspellings as a scientific problem.
We discuss the latest advancements to address the challenge of misspellings in NLP.
arXiv Detail & Related papers (2025-01-28T10:26:04Z) - Advancements and Challenges in Bangla Question Answering Models: A Comprehensive Review [0.0]
This paper presents a comprehensive review of seven research articles that contribute to the progress in this domain.
The papers introduce innovative methods like using LSTM-based models with attention mechanisms, context-based QA systems, and deep learning techniques based on prior knowledge.
Despite the progress made, several challenges remain, including the lack of well-annotated data, the absence of high-quality reading comprehension datasets, and difficulties in understanding the meaning of words in context.
arXiv Detail & Related papers (2024-12-16T14:42:26Z) - A Comprehensive Survey of Bias in LLMs: Current Landscape and Future Directions [0.0]
Large Language Models (LLMs) have revolutionized various applications in natural language processing (NLP) by providing unprecedented text generation, translation, and comprehension capabilities.
Their widespread deployment has brought to light significant concerns regarding biases embedded within these models.
This paper presents a comprehensive survey of biases in LLMs, aiming to provide an extensive review of the types, sources, impacts, and mitigation strategies related to these biases.
arXiv Detail & Related papers (2024-09-24T19:50:38Z) - Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z) - How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study [59.13867562744973]
This work systematically assesses LMs' capabilities for out-of-distribution (OOD) scenarios.
We find that the efficacy of such learning paradigms varies with the type of OOD.
Specifically, while in-context learning (ICL) excels at domain shifts, prompt-based fine-tuning surpasses it for topic shifts.
arXiv Detail & Related papers (2023-09-15T11:15:47Z) - Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies: two for bias evaluation and one for mitigation.
arXiv Detail & Related papers (2023-09-02T00:32:55Z) - Robust Visual Question Answering: Datasets, Methods, and Future
Challenges [23.59923999144776]
Visual question answering requires a system to provide an accurate natural language answer given an image and a natural language question.
Previous generic VQA methods often exhibit a tendency to memorize biases present in the training data rather than learning proper behaviors, such as grounding images before predicting answers.
Various datasets and debiasing methods have been proposed to evaluate and enhance the VQA robustness, respectively.
arXiv Detail & Related papers (2023-07-21T10:12:09Z) - A Comprehensive Review of Trends, Applications and Challenges In
Out-of-Distribution Detection [0.76146285961466]
A field of study has emerged that focuses on detecting out-of-distribution data subsets and enabling more comprehensive generalization.
As many deep learning based models have achieved near-perfect results on benchmark datasets, the need to evaluate these models' reliability and trustworthiness is felt more strongly than ever.
This paper presents a survey that, in addition to reviewing more than 70 papers in this field, presents challenges and directions for future works and offers a unifying look into various types of data shifts and solutions for better generalization.
arXiv Detail & Related papers (2022-09-26T18:13:14Z) - Recent Few-Shot Object Detection Algorithms: A Survey with Performance
Comparison [54.357707168883024]
Few-Shot Object Detection (FSOD) mimics the human ability of learning to learn.
FSOD intelligently transfers learned generic object knowledge from the common heavy-tailed classes to the novel long-tailed object classes.
We give an overview of FSOD, including the problem definition, common datasets, and evaluation protocols.
arXiv Detail & Related papers (2022-03-27T04:11:28Z) - Deep Learning meets Liveness Detection: Recent Advancements and
Challenges [3.2011056280404637]
We present a comprehensive survey of the literature on deep-feature-based face anti-spoofing (FAS) methods since 2017.
We cover predominant public datasets for FAS in chronological order, their evolutional progress, and the evaluation criteria.
arXiv Detail & Related papers (2021-12-29T19:24:58Z)