Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification
- URL: http://arxiv.org/abs/2406.18067v1
- Date: Wed, 26 Jun 2024 04:52:48 GMT
- Title: Exploring Energy-Based Models for Out-of-Distribution Detection in Dialect Identification
- Authors: Yaqian Hao, Chenguang Hu, Yingying Gao, Shilei Zhang, Junlan Feng
- Abstract summary: This study introduces a novel margin-enhanced joint energy model (MEJEM) tailored specifically for OOD detection in dialects.
By integrating a generative model and the energy margin loss, our approach aims to enhance the robustness of dialect identification systems.
- Score: 11.437264025117083
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The diverse nature of dialects presents challenges for models trained on specific linguistic patterns, rendering them susceptible to errors when confronted with unseen or out-of-distribution (OOD) data. This study introduces a novel margin-enhanced joint energy model (MEJEM) tailored specifically for OOD detection in dialects. By integrating a generative model and the energy margin loss, our approach aims to enhance the robustness of dialect identification systems. Furthermore, we explore two OOD scores for OOD dialect detection, and our findings conclusively demonstrate that the energy score outperforms the softmax score. Leveraging Sharpness-Aware Minimization to optimize the training process of the joint model, we enhance model generalization by minimizing both loss and sharpness. Experiments conducted on dialect identification tasks validate the efficacy of Energy-Based Models and provide valuable insights into their performance.
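The abstract reports that the energy score outperforms the softmax score for OOD dialect detection but does not restate the definitions. As a minimal sketch, assuming the standard formulations (energy score E(x) = -T * logsumexp(f(x)/T) over the classifier logits, and the maximum softmax probability as the baseline), the two scores can be computed as follows; the logit values are illustrative only:

```python
import math

def energy_score(logits, T=1.0):
    # Standard energy score: E(x) = -T * logsumexp(f(x)/T).
    # Lower energy suggests in-distribution; higher energy suggests OOD.
    z = [l / T for l in logits]
    m = max(z)  # subtract the max for numerical stability
    return -T * (m + math.log(sum(math.exp(v - m) for v in z)))

def softmax_score(logits):
    # Maximum softmax probability (MSP) baseline:
    # higher values suggest in-distribution.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    return max(exps) / sum(exps)

# A peaked logit vector (confident prediction) vs. a nearly flat one:
confident = [8.0, 0.5, 0.2]   # behaves like an in-distribution input
flat = [1.0, 0.9, 1.1]        # behaves like an OOD input
print(energy_score(confident), energy_score(flat))    # confident is lower
print(softmax_score(confident), softmax_score(flat))  # confident is higher
```

The intuition behind the comparison in the paper is that the energy score aggregates all logits rather than only the largest softmax probability, which tends to make it less overconfident on unseen inputs.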
Related papers
- Self-Calibrated Tuning of Vision-Language Models for Out-of-Distribution Detection [24.557227100200215]
Out-of-distribution (OOD) detection is crucial for deploying reliable machine learning models in open-world applications.
Recent advances in CLIP-based OOD detection have shown promising results via regularizing prompt tuning with OOD features extracted from ID data.
We propose a novel framework, namely, Self-Calibrated Tuning (SCT), to mitigate this problem for effective OOD detection with only the given few-shot ID data.
arXiv Detail & Related papers (2024-11-05T02:29:16Z)
- Can OOD Object Detectors Learn from Foundation Models? [56.03404530594071]
Out-of-distribution (OOD) object detection is a challenging task due to the absence of open-set OOD data.
Inspired by recent advancements in text-to-image generative models, we study the potential of generative models trained on large-scale open-set data to synthesize OOD samples.
We introduce SyncOOD, a simple data curation method that capitalizes on the capabilities of large foundation models.
arXiv Detail & Related papers (2024-09-08T17:28:22Z)
- Deep Metric Learning-Based Out-of-Distribution Detection with Synthetic Outlier Exposure [0.0]
We propose a label-mixup approach to generate synthetic OOD data using Denoising Diffusion Probabilistic Models (DDPMs).
In the experiments, we found that metric learning-based loss functions perform better than the softmax loss.
Our approach outperforms strong baselines in conventional OOD detection metrics.
arXiv Detail & Related papers (2024-05-01T16:58:22Z)
- Evaluating Large Language Models Using Contrast Sets: An Experimental Approach [0.0]
We introduce an innovative technique for generating a contrast set for the Stanford Natural Language Inference dataset.
Our strategy involves the automated substitution of verbs, adverbs, and adjectives with their synonyms to preserve the original meaning of sentences.
This method aims to assess whether a model's performance is based on genuine language comprehension or simply on pattern recognition.
arXiv Detail & Related papers (2024-04-02T02:03:28Z)
- DPP-Based Adversarial Prompt Searching for Language Models [56.73828162194457]
Auto-regressive Selective Replacement Ascent (ASRA) is a discrete optimization algorithm that selects prompts based on both quality and similarity with a determinantal point process (DPP).
Experimental results on six different pre-trained language models demonstrate the efficacy of ASRA for eliciting toxic content.
arXiv Detail & Related papers (2024-03-01T05:28:06Z)
- Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization [11.140366256534474]
Existing vision-language models exhibit strong generalization on a variety of visual domains and tasks.
We propose a novel approach OGEN to improve the OOD GENeralization of finetuned models.
Specifically, a class-conditional feature generator is introduced to synthesize OOD features using just the class name of any unknown class.
arXiv Detail & Related papers (2024-01-29T06:57:48Z)
- Unleashing Mask: Explore the Intrinsic Out-of-Distribution Detection Capability [70.72426887518517]
Out-of-distribution (OOD) detection is an indispensable aspect of secure AI when deploying machine learning models in real-world applications.
We propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
Our method utilizes a mask to identify the memorized atypical samples, and then finetunes the model, or prunes it with the introduced mask, to forget them.
arXiv Detail & Related papers (2023-06-06T14:23:34Z)
- Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
Slice detection models (SDMs) automatically identify underperforming groups of datapoints.
This paper proposes a benchmark named "Discover, Explain, Improve (DEIM)" for classification NLP tasks.
Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z) - NoiER: An Approach for Training more Reliable Fine-TunedDownstream Task
Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models and additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z) - Joint Energy-based Model Training for Better Calibrated Natural Language
Understanding Models [61.768082640087]
We explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders for natural language understanding tasks.
Experiments show that EBM training can help the model reach a better calibration that is competitive to strong baselines.
arXiv Detail & Related papers (2021-01-18T01:41:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.