Related papers: Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning

URL: http://arxiv.org/abs/2511.07198v2
Date: Wed, 12 Nov 2025 01:58:32 GMT
Title: Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning
Authors: Hua Ye, Siyuan Chen, Haoliang Zhang, Weihao Luo, Yanbin Li, Xuan Zhang
Abstract summary: Large language models (LLMs) demonstrate impressive generalization abilities, yet adapting them effectively across multiple heterogeneous domains remains challenging. We propose a partition-based multi-stage fine-tuning framework designed to exploit inter-domain synergies while minimizing negative transfer. Our approach strategically partitions domains into subsets (stages) by balancing domain discrepancy, synergy, and model capacity constraints.
Score: 9.97195966127976
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) demonstrate impressive generalization abilities, yet adapting them effectively across multiple heterogeneous domains remains challenging due to inter-domain interference. To overcome this challenge, we propose a partition-based multi-stage fine-tuning framework designed to exploit inter-domain synergies while minimizing negative transfer. Our approach strategically partitions domains into subsets (stages) by balancing domain discrepancy, synergy, and model capacity constraints. We theoretically analyze the proposed framework and derive novel generalization bounds that justify our partitioning strategy. Extensive empirical evaluations on various language understanding tasks show that our method consistently outperforms state-of-the-art baselines.

Related papers

Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models [67.45032003041399]
We propose a novel Multi-Paradigm Collaborative Attack (MPCAttack) framework to boost the transferability of adversarial examples against MLLMs. MPCO adaptively balances the importance of different paradigm representations and guides the global optimisation. Our solution consistently outperforms state-of-the-art methods in both targeted and untargeted attacks on open-source and closed-source MLLMs.
arXiv Detail & Related papers (2026-03-05T06:01:26Z)
Reasoning-Driven Multimodal LLM for Domain Generalization [72.00754603114187]
We study the role of reasoning in domain generalization using the DomainBed-Reasoning dataset. We propose RD-MLDG, a framework with two components: MTCT (Multi-Task Cross-Training) and SARR (Self-Aligned Reasoning Regularization). Experiments on standard DomainBed datasets demonstrate that RD-MLDG achieves complementary state-of-the-art performance.
arXiv Detail & Related papers (2026-02-27T08:10:06Z)
Open-Vocabulary Domain Generalization in Urban-Scene Segmentation [83.15573353963235]
Domain Generalization in Semantic Segmentation (DG-SS) aims to enable segmentation models to perform robustly in unseen environments. Recent progress in Vision-Language Models (VLMs) has advanced Open-Vocabulary Semantic Segmentation (OV-SS) by enabling models to recognize a broader range of concepts. Yet these models remain sensitive to domain shifts and struggle to maintain robustness when deployed in unseen environments. We propose S2-Corr, a state-space-driven text-image correlation refinement mechanism that produces more consistent text-image correlations under distribution changes.
arXiv Detail & Related papers (2026-02-21T14:32:27Z)
Modality-Collaborative Low-Rank Decomposers for Few-Shot Video Domain Adaptation [74.16390314862801]
We study the challenging task of Few-Shot Video Domain Adaptation (FSVDA). We introduce a novel framework of Modality-Collaborative Low-Rank Decomposers (MC-LRD) to decompose modality-unique and modality-shared features. Our model achieves significant improvements over existing methods.
arXiv Detail & Related papers (2025-11-24T03:09:59Z)
Enhancing Generalization in Chain of Thought Reasoning for Smaller Models [5.297025364137428]
Chain-of-Thought (CoT) reasoning in smaller language models is a challenging natural language processing problem. Existing CoT knowledge distillation methods often suffer from overly conservative adaptability in smaller LLMs. We propose PRADA, a principled fine-tuning framework that integrates diverse CoT domains.
arXiv Detail & Related papers (2025-01-16T19:23:11Z)
Enhancing Multimodal Emotion Recognition through Multi-Granularity Cross-Modal Alignment [10.278127492434297]
This paper introduces a Multi-Granularity Cross-Modal Alignment (MGCMA) framework, distinguished by its comprehensive approach encompassing distribution-based, instance-based, and token-based alignment modules. Our experiments on IEMOCAP demonstrate that our proposed method outperforms current state-of-the-art techniques.
arXiv Detail & Related papers (2024-12-30T09:30:41Z)
From Deterministic to Probabilistic: A Novel Perspective on Domain Generalization for Medical Image Segmentation [1.93061220186624]
We propose an innovative framework that enhances data representation quality through probabilistic modeling and contrastive learning. Specifically, we combine deterministic features with uncertainty modeling to capture comprehensive feature distributions. We show that the proposed framework significantly improves segmentation performance, providing a robust solution to domain generalization challenges in medical image segmentation.
arXiv Detail & Related papers (2024-12-07T07:41:04Z)
Unified Language-driven Zero-shot Domain Adaptation [55.64088594551629]
Unified Language-driven Zero-shot Domain Adaptation (ULDA) is a novel task setting. It enables a single model to adapt to diverse target domains without explicit domain-ID knowledge.
arXiv Detail & Related papers (2024-04-10T16:44:11Z)
Cross Contrasting Feature Perturbation for Domain Generalization [11.863319505696184]
Domain generalization aims to learn a robust model from source domains that generalizes well to unseen target domains. Recent studies focus on generating novel domain samples or features to diversify distributions complementary to source domains. We propose an online one-stage Cross Contrasting Feature Perturbation framework to simulate domain shift.
arXiv Detail & Related papers (2023-07-24T03:27:41Z)
Joint covariate-alignment and concept-alignment: a framework for domain generalization [28.391072289529053]
We propose a novel domain generalization framework based on a new upper bound to the risk on the unseen domain. Our numerical results show that the proposed methods perform as well as or better than the state-of-the-art for domain generalization on several data sets.
arXiv Detail & Related papers (2022-08-01T14:39:35Z)
Variational Disentanglement for Domain Generalization [68.85458536180437]
We propose to tackle the problem of domain generalization by delivering an effective framework named Variational Disentanglement Network (VDN). VDN is capable of disentangling the domain-specific features and task-specific features, where the task-specific features are expected to be better generalized to unseen but related test data.
arXiv Detail & Related papers (2021-09-13T09:55:32Z)
Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains. Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
Model-Based Domain Generalization [96.84818110323518]
We propose a novel approach for the domain generalization problem called Model-Based Domain Generalization. Our algorithms beat the current state-of-the-art methods on the very-recently-proposed WILDS benchmark by up to 20 percentage points.
arXiv Detail & Related papers (2021-02-23T00:59:02Z)
Cross-Domain Grouping and Alignment for Domain Adaptive Semantic Segmentation [74.3349233035632]
Existing techniques to adapt semantic segmentation networks across the source and target domains within deep convolutional neural networks (CNNs) do not consider inter-class variation within the target domain itself or estimated category. We introduce a learnable clustering module and a novel domain adaptation framework called cross-domain grouping and alignment. Our method consistently boosts adaptation performance in semantic segmentation, outperforming the state of the art in various domain adaptation settings.
arXiv Detail & Related papers (2020-12-15T11:36:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.