Generalization Measures for Zero-Shot Cross-Lingual Transfer
- URL: http://arxiv.org/abs/2404.15928v2
- Date: Sun, 8 Sep 2024 00:54:58 GMT
- Title: Generalization Measures for Zero-Shot Cross-Lingual Transfer
- Authors: Saksham Bassi, Duygu Ataman, Kyunghyun Cho
- Abstract summary: A model's capacity to generalize its knowledge is crucial to building robust and reliable machine learning systems.
Language model evaluation tasks lack metrics that capture model generalization.
We propose a novel and stable algorithm to reliably compute the sharpness of a model optimum, which correlates with generalization.
- Score: 40.35113593153817
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A model's capacity to generalize its knowledge to interpret unseen inputs with different characteristics is crucial to building robust and reliable machine learning systems. Language model evaluation tasks lack metrics that capture model generalization, and applicability to a new setting is instead measured through task- and language-specific downstream performance, which is unavailable for many languages and tasks. In this paper, we explore a set of efficient and reliable measures that could aid in computing more information about the generalization capability of language models in cross-lingual zero-shot settings. In addition to traditional measures such as variance in parameters after training and distance from initialization, we also measure the effectiveness of loss-landscape sharpness in capturing the success of cross-lingual transfer, and we propose a novel and stable algorithm to reliably compute the sharpness of a model optimum, which correlates with generalization.
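The abstract names its measures but not their computation, so the following is only a minimal PyTorch sketch of the traditional measures it mentions (parameter variance after training, distance from initialization) plus a naive random-perturbation probe of loss-landscape sharpness; it is not the paper's algorithm. The `loss_fn(model, batch)` callable and the `init_state` snapshot are assumptions supplied by the caller.

```python
# Minimal sketch (not the paper's algorithm) of generalization measures:
# parameter variance, distance from initialization, and a crude sharpness probe.
import copy
import torch

def distance_from_init(model, init_state):
    """L2 distance between current parameters and their initial values."""
    sq = 0.0
    for name, p in model.named_parameters():
        sq += (p.detach() - init_state[name]).pow(2).sum().item()
    return sq ** 0.5

def parameter_variance(model):
    """Variance of all trained parameters, flattened into one vector."""
    flat = torch.cat([p.detach().flatten() for p in model.parameters()])
    return flat.var().item()

def sharpness_probe(model, loss_fn, batch, radius=1e-3, n_samples=8):
    """Mean loss increase under random parameter perturbations of a fixed
    per-tensor radius; a crude stand-in for loss-landscape sharpness."""
    base = loss_fn(model, batch).item()
    increases = []
    for _ in range(n_samples):
        perturbed = copy.deepcopy(model)
        with torch.no_grad():
            for p in perturbed.parameters():
                noise = torch.randn_like(p)
                p.add_(radius * noise / (noise.norm() + 1e-12))
        increases.append(loss_fn(perturbed, batch).item() - base)
    return sum(increases) / n_samples
```

To use the distance measure, snapshot `init_state = {n: p.detach().clone() for n, p in model.named_parameters()}` before training starts.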
Related papers
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z)
- Enhancing Traffic Incident Management with Large Language Models: A Hybrid Machine Learning Approach for Severity Classification [3.674863913115431]
This research showcases the innovative integration of Large Language Models into machine learning for traffic incident management.
By leveraging features generated by modern language models alongside conventional data extracted from incident reports, our research demonstrates improvements in the accuracy of severity classification.
arXiv Detail & Related papers (2024-03-20T12:33:51Z)
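The hybrid recipe described in the entry above is straightforward to prototype. Here is a hedged sketch, not the paper's implementation, in which a hypothetical `embed(report_text)` helper (any sentence-embedding model) supplies LLM-derived features that are concatenated with conventional numeric features before fitting a standard classifier.

```python
# Hedged sketch of a hybrid severity classifier: LLM-derived text embeddings
# concatenated with conventional tabular features. `embed` is a hypothetical
# stand-in for any text-embedding model, not the paper's pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression

def build_features(reports, numeric_features, embed):
    """Concatenate text embeddings with numeric incident features."""
    text_vecs = np.stack([embed(r) for r in reports])            # (n, d)
    return np.hstack([text_vecs, np.asarray(numeric_features)])  # (n, d+k)

def train_severity_classifier(reports, numeric_features, labels, embed):
    """Fit a multinomial classifier on the hybrid feature matrix."""
    X = build_features(reports, numeric_features, embed)
    return LogisticRegression(max_iter=1000).fit(X, labels)
```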
- LaMPP: Language Models as Probabilistic Priors for Perception and Action [38.07277869107474]
We show how to leverage language models for non-linguistic perception and control tasks.
Our approach casts labeling and decision-making as inference in probabilistic graphical models.
arXiv Detail & Related papers (2023-02-03T15:14:04Z)
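In the simplest such factorization, the language model supplies a prior over labels and a perception model supplies the likelihood. The sketch below illustrates that combination with hypothetical log-probability inputs; it is not LaMPP's actual interface.

```python
# Minimal sketch of an LM as a probabilistic prior: posterior over labels
# proportional to perception likelihood times LM prior, computed in log space.
import numpy as np

def posterior_over_labels(obs_log_lik, lm_log_prior):
    """obs_log_lik[i]: log p(observation | label_i) from a perception model.
    lm_log_prior[i]: log p(label_i) scored by a language model."""
    logits = np.asarray(obs_log_lik) + np.asarray(lm_log_prior)
    logits = logits - logits.max()   # stabilize before exponentiating
    probs = np.exp(logits)
    return probs / probs.sum()       # normalized posterior
```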
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- A Unified Neural Network Model for Readability Assessment with Feature Projection and Length-Balanced Loss [17.213602354715956]
We propose a BERT-based model with feature projection and length-balanced loss for readability assessment.
Our model achieves state-of-the-art performance on two English benchmark datasets and one dataset of Chinese textbooks.
arXiv Detail & Related papers (2022-10-19T05:33:27Z)
- A global analysis of metrics used for measuring performance in natural language processing [9.433496814327086]
We provide the first large-scale cross-sectional analysis of metrics used for measuring performance in natural language processing.
Results suggest that the large majority of natural language processing metrics currently in use have properties that may result in an inadequate reflection of a model's performance.
arXiv Detail & Related papers (2022-04-25T11:41:50Z)
- Conditional Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation [66.23055784400475]
Token-level adaptive training approaches can alleviate the token imbalance problem.
We propose a target-context-aware metric, named conditional bilingual mutual information (CBMI).
CBMI can be efficiently calculated during model training without requiring any pre-computed statistics.
arXiv Detail & Related papers (2022-03-06T12:34:10Z)
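CBMI is a log-ratio between the translation model's probability for a target token and a target-side language model's probability for the same token, which is why it falls out of quantities already computed during training. The sketch below uses illustrative names, not the authors' code, and the weighting scheme in the second function is one simple normalization choice, not necessarily the paper's exact one.

```python
# Hedged sketch of token-level CBMI: log ratio of the NMT model's token
# probability to a target-side LM's probability for the same gold token.
import torch

def cbmi(nmt_log_probs: torch.Tensor, lm_log_probs: torch.Tensor) -> torch.Tensor:
    """CBMI(y_j) = log p_NMT(y_j | x, y_<j) - log p_LM(y_j | y_<j).
    Both inputs hold gold-token log-probabilities, shape (batch, tgt_len)."""
    return nmt_log_probs - lm_log_probs

def cbmi_token_weights(nmt_log_probs, lm_log_probs, scale=0.1):
    """One simple way to turn CBMI into adaptive token weights."""
    c = cbmi(nmt_log_probs, lm_log_probs)
    mean, std = c.mean(), c.std().clamp_min(1e-6)
    return 1.0 + scale * (c - mean) / std   # weights centered at 1.0
```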
- Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data [66.11139091362078]
We provide the first model selection results on large pretrained Transformers from Huggingface using generalization metrics.
Despite their niche status, we find that metrics derived from the heavy-tail (HT) perspective are particularly useful in NLP tasks.
arXiv Detail & Related papers (2022-02-06T20:07:35Z)
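A representative example of such data-free, heavy-tail metrics fits a power-law exponent to the eigenvalue spectrum of each weight matrix, in the spirit of the WeightWatcher line of work. The sketch below uses a simple Hill estimator and illustrates the idea rather than reproducing the paper's protocol; smaller exponents are typically read as heavier tails.

```python
# Rough sketch of a data-free heavy-tail (HT) metric: per-layer power-law
# tail exponents of weight-matrix eigenvalue spectra, from weights alone.
import torch

def hill_alpha(eigs: torch.Tensor, k: int = 10) -> float:
    """Hill estimator of the tail exponent from the top-(k+1) eigenvalues."""
    top = torch.sort(eigs, descending=True).values[: k + 1]
    return 1.0 + k / torch.log(top[:k] / top[k]).sum().item()

def heavy_tail_metrics(model, k: int = 10):
    """Tail exponent for every 2D weight matrix; no train/test data needed."""
    alphas = {}
    for name, p in model.named_parameters():
        if p.dim() == 2 and min(p.shape) > k + 1:
            # eigenvalues of W^T W are the squared singular values of W
            eigs = torch.linalg.svdvals(p.detach()) ** 2
            alphas[name] = hill_alpha(eigs, k)
    return alphas
```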
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.