SOI Matters: Analyzing Multi-Setting Training Dynamics in Pretrained Language Models via Subsets of Interest
- URL: http://arxiv.org/abs/2507.15236v1
- Date: Mon, 21 Jul 2025 04:43:21 GMT
- Title: SOI Matters: Analyzing Multi-Setting Training Dynamics in Pretrained Language Models via Subsets of Interest
- Authors: Shayan Vassef, Amirhossein Dabiriaghdam, Mohammadreza Bakhtiari, Yadollah Yaghoobzadeh
- Abstract summary: This work investigates the impact of multi-task, multi-lingual, and multi-source learning approaches on the robustness and performance of pretrained language models. Subsets of Interest (SOI) identifies six distinct learning behavior patterns during training, including forgettable examples, unlearned examples, and always correct examples. Our results demonstrate that multi-source learning consistently improves out-of-distribution performance by up to 7%, while multi-task learning shows mixed results with notable gains in similar task combinations.
- Score: 5.882817862856554
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: This work investigates the impact of multi-task, multi-lingual, and multi-source learning approaches on the robustness and performance of pretrained language models. To enhance this analysis, we introduce Subsets of Interest (SOI), a novel categorization framework that identifies six distinct learning behavior patterns during training, including forgettable examples, unlearned examples, and always correct examples. Through SOI transition heatmaps and dataset cartography visualization, we analyze how examples shift between these categories when transitioning from single-setting to multi-setting configurations. We perform comprehensive experiments across three parallel comparisons: multi-task vs. single-task learning using English tasks (entailment, paraphrase, sentiment), multi-source vs. single-source learning using sentiment analysis datasets, and multi-lingual vs. single-lingual learning using intent classification in French, English, and Persian. Our results demonstrate that multi-source learning consistently improves out-of-distribution performance by up to 7%, while multi-task learning shows mixed results with notable gains in similar task combinations. We further introduce a two-stage fine-tuning approach where the second stage leverages SOI-based subset selection to achieve additional performance improvements. These findings provide new insights into training dynamics and offer practical approaches for optimizing multi-setting language model performance.
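The abstract names three of the six SOI patterns (always correct, forgettable, and unlearned examples), all defined over per-epoch training dynamics. Below is a minimal sketch of how such a categorization might be computed from per-epoch correctness records; the `correctness` matrix, the `categorize_examples` helper, and the membership rules are illustrative assumptions rather than the paper's exact definitions, which cover six categories in total.

```python
# Sketch of an SOI-style categorization from per-epoch prediction records.
# Only the three categories named in the abstract are illustrated here;
# the paper's full taxonomy defines six. `correctness` is a hypothetical
# boolean matrix of shape (num_examples, num_epochs) where entry [i, e]
# is True if example i was classified correctly after epoch e.

import numpy as np

def categorize_examples(correctness: np.ndarray) -> dict:
    """Assign training examples to coarse learning-behavior patterns."""
    always_correct = correctness.all(axis=1)      # correct at every epoch
    never_correct = ~correctness.any(axis=1)      # never predicted correctly
    # "Forgettable" here means: correct at some epoch, then incorrect later.
    was_correct_so_far = np.logical_or.accumulate(correctness, axis=1)
    forgettable = (was_correct_so_far & ~correctness).any(axis=1)

    return {
        "always_correct": np.flatnonzero(always_correct),
        "unlearned": np.flatnonzero(never_correct),
        "forgettable": np.flatnonzero(forgettable),
    }

# Example: 4 training examples tracked over 3 epochs.
correctness = np.array([
    [True,  True,  True ],   # always correct
    [False, True,  False],   # learned, then forgotten
    [False, False, False],   # never learned
    [False, False, True ],   # learned late (another SOI category in the paper)
], dtype=bool)

print({k: v.tolist() for k, v in categorize_examples(correctness).items()})
```

Comparing these per-example assignments between single-setting and multi-setting runs is what the SOI transition heatmaps in the paper visualize, and a similar per-subset view could drive the second-stage subset selection described above.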
Related papers
- Multi-Scale and Multi-Objective Optimization for Cross-Lingual Aspect-Based Sentiment Analysis [0.808899919316203]
We propose a novel framework, Multi-Scale and Multi-Objective optimization (MSMO), for cross-lingual ABSA. We achieve cross-lingual sentence-level and aspect-level alignment, aligning features of aspect terms in different contextual environments. Results show that MSMO significantly enhances cross-lingual ABSA by achieving state-of-the-art performance across multiple languages and models.
arXiv Detail & Related papers (2025-02-19T13:43:33Z)
- Optimizing Multi-Task Learning for Enhanced Performance in Large Language Models [5.930799903736776]
The proposed multi-task learning model outperforms the comparison models in terms of text classification accuracy and ROUGE score for summary generation. The framework based on multi-task learning is expected to play a greater role in practical applications across fields.
arXiv Detail & Related papers (2024-12-09T06:47:42Z)
- P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs [84.24644520272835]
We introduce P-MMEval, a large-scale benchmark covering effective fundamental and capability-specialized datasets. P-MMEval delivers consistent language coverage across various datasets and provides parallel samples. We conduct extensive experiments on representative multilingual model series to compare performances across models and tasks.
arXiv Detail & Related papers (2024-11-14T01:29:36Z)
- CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment [38.35458193262633]
English-centric models are usually suboptimal in other languages. We propose a novel approach called CrossIn, which utilizes a mixed composition of cross-lingual instruction tuning data.
arXiv Detail & Related papers (2024-04-18T06:20:50Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Specializing Multilingual Language Models: An Empirical Study [50.7526245872855]
Contextualized word representations from pretrained multilingual language models have become the de facto standard for addressing natural language tasks.
For languages rarely or never seen by these models, directly using such models often results in suboptimal representation or use of data.
arXiv Detail & Related papers (2021-06-16T18:13:55Z)
- Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-07-15T16:58:01Z)
- XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization [128.37244072182506]
XTREME (Cross-lingual TRansfer Evaluation of Multilingual Encoders) is a benchmark for evaluating the cross-lingual generalization capabilities of multilingual representations across 40 languages and 9 tasks.
We demonstrate that while models tested on English reach human performance on many tasks, there is still a sizable gap in the performance of cross-lingually transferred models.
arXiv Detail & Related papers (2020-03-24T19:09:37Z)