Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual
Document Understanding Models
- URL: http://arxiv.org/abs/2306.02623v1
- Date: Mon, 5 Jun 2023 06:50:42 GMT
- Title: Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual
Document Understanding Models
- Authors: Jiabang He, Yi Hu, Lei Wang, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen
- Abstract summary: We develop an out-of-distribution (OOD) benchmark termed Do-GOOD for fine-grained analysis of document image-related tasks.
We then evaluate the robustness and perform a fine-grained analysis of 5 recent VDU pre-trained models and 2 typical OOD generalization algorithms.
- Score: 68.12229916000584
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Numerous pre-training techniques for visual document understanding (VDU) have
recently shown substantial improvements in performance across a wide range of
document tasks. However, these pre-trained VDU models cannot guarantee
continued success when the distribution of test data differs from the
distribution of training data. In this paper, to investigate how robust
existing pre-trained VDU models are to various distribution shifts, we first
develop an out-of-distribution (OOD) benchmark, termed Do-GOOD, specifically
for fine-grained analysis of document image-related tasks. The Do-GOOD
benchmark defines the underlying mechanisms that result in different
distribution shifts and contains 9 OOD datasets covering 3 VDU-related tasks:
document information extraction, classification, and question answering.
We then evaluate the robustness and perform a fine-grained analysis of 5 recent
VDU pre-trained models and 2 typical OOD generalization algorithms on these OOD
datasets. Results from the experiments demonstrate that there is a significant
performance gap between the in-distribution (ID) and OOD settings for document
images, and that fine-grained analysis of distribution shifts can reveal the
brittle nature of existing pre-trained VDU models and OOD generalization
algorithms. The code and datasets for our Do-GOOD benchmark can be found at
https://github.com/MAEHCM/Do-GOOD.
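As a rough sketch of the ID-vs-OOD gap measurement the abstract describes, the helper below compares a classifier's accuracy on an in-distribution split against an OOD split. `model`, `id_loader`, and `ood_loader` are hypothetical stand-ins; the benchmark's actual datasets and evaluation code live in the repository linked above.

```python
# Minimal sketch of the ID-vs-OOD performance gap measurement; not the
# benchmark's official evaluation code.
import torch

@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    """Top-1 accuracy of a classifier over a dataloader of (image, label) pairs."""
    model.eval()
    correct = total = 0
    for images, labels in loader:
        logits = model(images.to(device))
        correct += (logits.argmax(dim=-1).cpu() == labels).sum().item()
        total += labels.numel()
    return correct / total

def id_ood_gap(model, id_loader, ood_loader):
    """The quantity the paper reports: ID performance minus OOD performance."""
    id_acc = accuracy(model, id_loader)
    ood_acc = accuracy(model, ood_loader)
    return id_acc, ood_acc, id_acc - ood_acc
```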
Related papers
- WeiPer: OOD Detection using Weight Perturbations of Class Projections [11.130659240045544]
We introduce perturbations of the class projections in the final fully connected layer, which create a richer representation of the input.
We achieve state-of-the-art OOD detection results across multiple benchmarks of the OpenOOD framework.
arXiv Detail & Related papers (2024-05-27T13:38:28Z)
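A minimal sketch of the WeiPer idea as the summary above states it: perturb the weights of the final class-projection layer to obtain multiple views of each input. The mean max-softmax score over perturbed logits is a placeholder of my choosing, not the paper's actual scoring function.

```python
# Sketch under stated assumptions: perturbed class-projection logits.
import torch
import torch.nn.functional as F

def perturbed_logits(features, fc_weight, fc_bias, n_perturb=10, noise_scale=0.1):
    """features: (B, D) penultimate activations; fc_weight: (C, D); fc_bias: (C,)."""
    views = [
        F.linear(features, fc_weight + noise_scale * torch.randn_like(fc_weight), fc_bias)
        for _ in range(n_perturb)
    ]
    return torch.stack(views)  # (n_perturb, B, C)

def ood_score(features, fc_weight, fc_bias):
    logits = perturbed_logits(features, fc_weight, fc_bias)
    # Placeholder score: mean max-softmax across perturbations (higher = more ID-like).
    return F.softmax(logits, dim=-1).amax(dim=-1).mean(dim=0)
```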
- Gradient-Regularized Out-of-Distribution Detection [28.542499196417214]
One of the challenges for neural networks in real-life applications is the overconfident errors these models make when the data is not from the original training distribution.
We propose the idea of leveraging the information embedded in the gradient of the loss function during training to enable the network to learn a desired OOD score for each sample.
We also develop a novel energy-based sampling method to allow the network to be exposed to more informative OOD samples during the training phase.
arXiv Detail & Related papers (2024-04-18T17:50:23Z)
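The entry above mentions an energy-based sampling method; the abstract does not spell out how it works. As background, the standard energy score from the OOD literature is E(x) = -T * logsumexp(f(x)/T), with lower energy meaning more ID-like:

```python
# Standard energy score (background reference, not the paper's sampling method).
import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Per-sample energy given classifier logits of shape (B, C)."""
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)
```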
- EAT: Towards Long-Tailed Out-of-Distribution Detection [55.380390767978554]
This paper addresses the challenging task of long-tailed OOD detection.
The main difficulty lies in distinguishing OOD data from samples belonging to the tail classes.
We propose two simple ideas: (1) expanding the in-distribution class space by introducing multiple abstention classes, and (2) augmenting the context-limited tail classes by overlaying images onto the context-rich OOD data.
arXiv Detail & Related papers (2023-12-14T13:47:13Z)
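A hedged sketch of EAT's two ideas as summarized above: a classifier head widened with abstention classes reserved for OOD data, and a tail-class image overlaid onto a context-rich OOD image. Patch size and placement are illustrative guesses, not the paper's recipe.

```python
# Sketch under stated assumptions: abstention head + overlay augmentation.
# Both images are assumed to share a spatial size.
import torch
import torch.nn as nn

def widen_head(in_features: int, num_classes: int, num_abstain: int) -> nn.Linear:
    """Head whose last `num_abstain` outputs act as abstention classes."""
    return nn.Linear(in_features, num_classes + num_abstain)

def overlay(tail_img: torch.Tensor, ood_img: torch.Tensor, frac: float = 0.5):
    """Paste the top-left crop of a tail-class image onto an OOD background (CHW)."""
    _, h, w = ood_img.shape
    ph, pw = int(h * frac), int(w * frac)
    out = ood_img.clone()
    out[:, :ph, :pw] = tail_img[:, :ph, :pw]
    return out  # the label stays the tail class; the context comes from the OOD image
```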
- ExCeL: Combined Extreme and Collective Logit Information for Enhancing Out-of-Distribution Detection [9.689089164964484]
ExCeL combines extreme and collective information within the output layer for enhanced accuracy in OOD detection.
We show that ExCeL is consistently among the five top-performing methods out of twenty-one existing post-hoc baselines.
arXiv Detail & Related papers (2023-11-23T14:16:03Z)
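The ExCeL summary above gives no formula for combining "extreme" and "collective" logit information, so the combination below is purely illustrative: the max logit as the extreme term and negative softmax entropy as a collective term.

```python
# Purely illustrative combination; not the paper's actual score.
import torch
import torch.nn.functional as F

def excel_like_score(logits: torch.Tensor) -> torch.Tensor:
    """Higher = more ID-like. logits: (B, C)."""
    extreme = logits.amax(dim=-1)                       # extreme information
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
    return extreme - entropy                            # collective term penalizes uncertainty
```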
- Revisiting Out-of-distribution Robustness in NLP: Benchmark, Analysis, and LLMs Evaluations [111.88727295707454]
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP.
We propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts.
We conduct experiments on pre-trained language models for analysis and evaluation of OOD robustness.
arXiv Detail & Related papers (2023-06-07T17:47:03Z)
- Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need [52.88953913542445]
We find, surprisingly, that simply using reconstruction-based methods can significantly boost OOD detection performance.
We take Masked Image Modeling as a pretext task for our OOD detection framework (MOOD).
arXiv Detail & Related papers (2023-02-06T08:24:41Z)
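A sketch of reconstruction-based OOD detection in the spirit of the MOOD summary above: mask patches, reconstruct with a masked-image-modeling network, and treat high reconstruction error as evidence of OOD. `mim_model` and its call signature are assumptions, not the paper's actual API.

```python
# Sketch under stated assumptions: reconstruction error as OOD score.
import torch
import torch.nn.functional as F

@torch.no_grad()
def reconstruction_ood_score(mim_model, images: torch.Tensor, mask_ratio: float = 0.75):
    """images: (B, C, H, W); returns per-sample error (higher = more likely OOD)."""
    recon = mim_model(images, mask_ratio=mask_ratio)  # assumed MAE-style interface
    return F.mse_loss(recon, images, reduction="none").mean(dim=(1, 2, 3))
```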
- Towards Realistic Out-of-Distribution Detection: A Novel Evaluation Framework for Improving Generalization in OOD Detection [14.541761912174799]
This paper presents a novel evaluation framework for Out-of-Distribution (OOD) detection.
It aims to assess the performance of machine learning models in more realistic settings.
arXiv Detail & Related papers (2022-11-20T07:30:15Z)
- SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA).
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
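For reference on the LoRA method the SimSCOOD entry compares against full fine-tuning: a minimal LoRA linear layer adapts a frozen pretrained weight W as Wx + (alpha/r) * BAx, training only the low-rank factors A and B.

```python
# Minimal LoRA linear layer (standard formulation, for reference).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```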
- An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation [91.62129090006745]
This paper studies the distribution shift problem from the perspective of pre-training and data augmentation.
We provide the first comprehensive empirical study focusing on pre-training and data augmentation.
arXiv Detail & Related papers (2022-05-25T13:04:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.