Moment Alignment: Unifying Gradient and Hessian Matching for Domain Generalization
- URL: http://arxiv.org/abs/2506.07378v1
- Date: Mon, 09 Jun 2025 02:51:36 GMT
- Title: Moment Alignment: Unifying Gradient and Hessian Matching for Domain Generalization
- Authors: Yuen Chen, Haozhe Si, Guojun Zhang, Han Zhao
- Abstract summary: Domain generalization (DG) seeks to develop models that generalize well to unseen target domains. One line of research in DG focuses on aligning domain-level gradients and Hessians to enhance generalization. We introduce Closed-Form Moment Alignment (CMA), a novel DG algorithm that aligns domain-level gradients and Hessians in closed form.
- Score: 13.021311628351423
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Domain generalization (DG) seeks to develop models that generalize well to unseen target domains, addressing the prevalent issue of distribution shifts in real-world applications. One line of research in DG focuses on aligning domain-level gradients and Hessians to enhance generalization. However, existing methods are computationally inefficient and the underlying principles of these approaches are not well understood. In this paper, we develop the theory of moment alignment for DG. Grounded in transfer measure, a principled framework for quantifying generalizability between two domains, we first extend the definition of transfer measure to domain generalization with multiple source domains and establish a target error bound. Then, we prove that aligning derivatives across domains improves the transfer measure both when the feature extractor induces an invariant optimal predictor across domains and when it does not. Notably, moment alignment provides a unifying understanding of Invariant Risk Minimization, gradient matching, and Hessian matching, three previously disconnected approaches to DG. We further connect feature moments and derivatives of the classifier head, and establish the duality between feature learning and classifier fitting. Building upon our theory, we introduce Closed-Form Moment Alignment (CMA), a novel DG algorithm that aligns domain-level gradients and Hessians in closed form. Our method overcomes the computational inefficiencies of existing gradient- and Hessian-based techniques by eliminating the need for repeated backpropagation or sampling-based Hessian estimation. We validate the efficacy of our approach through two sets of experiments: linear probing and full fine-tuning. CMA demonstrates superior performance in both settings compared to Empirical Risk Minimization and state-of-the-art algorithms.
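To make the closed-form idea concrete, below is a minimal sketch (not the authors' implementation) of how feature moments connect to classifier-head derivatives. Assuming a linear head w trained with squared loss on features z, each domain's risk gradient is A_d w - b_d and its Hessian is A_d, where A_d = E[z z^T] and b_d = E[y z]; matching these moments across domains therefore matches gradients and Hessians without repeated backpropagation. All function names and the particular penalty form are illustrative assumptions.

```python
# Sketch only: closed-form gradient/Hessian matching via feature moments,
# under the assumption of a linear head with squared loss.
import torch

def domain_moments(z, y):
    """Feature moments of one domain (z: [n, d] features, y: [n] targets)."""
    A = z.T @ z / z.shape[0]          # second moment E[z z^T]; Hessian of the squared loss
    b = (y[:, None] * z).mean(dim=0)  # cross moment E[y z]; the gradient is A w - b
    return A, b

def moment_alignment_penalty(domains):
    """Average squared deviation of each domain's (A_d, b_d) from the cross-domain mean."""
    As, bs = zip(*(domain_moments(z, y) for z, y in domains))
    A_bar = torch.stack(As).mean(dim=0)
    b_bar = torch.stack(bs).mean(dim=0)
    pen = sum((A - A_bar).pow(2).sum() + (b - b_bar).pow(2).sum()
              for A, b in zip(As, bs))
    return pen / len(domains)

# Hypothetical usage: features from a shared extractor, one (z, y) pair per domain.
domains = [(torch.randn(128, 16), torch.randn(128)) for _ in range(3)]
penalty = moment_alignment_penalty(domains)  # combine with the task loss, e.g. erm_loss + lam * penalty
```

Because A_d and b_d are plain sample averages, the penalty is computed in closed form from the features, which illustrates the efficiency the abstract claims over repeated backpropagation or sampling-based Hessian estimation.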
Related papers
- Group-wise Scaling and Orthogonal Decomposition for Domain-Invariant Feature Extraction in Face Anti-Spoofing [7.902884193437407]
We propose a novel DGFAS framework that jointly aligns weights and biases through Feature Orthogonal Decomposition (FOD) and Group-wise Scaling Risk Minimization (GS-RM). Our approach achieves state-of-the-art performance, consistently improving accuracy, reducing bias misalignment, and enhancing stability on unseen target domains.
arXiv Detail & Related papers (2025-07-05T11:20:19Z)
- Gradient-Guided Annealing for Domain Generalization [5.124256074746721]
The Gradient-Guided Annealing (GGA) algorithm is proposed to improve domain generalization effectiveness. The efficacy of GGA is evaluated on five widely accepted and challenging image classification domain generalization benchmarks.
arXiv Detail & Related papers (2025-02-27T15:01:55Z) - Constrained Maximum Cross-Domain Likelihood for Domain Generalization [14.91361835243516]
Domain generalization aims to learn, from multiple source domains, a model that is expected to perform well on unseen test domains.
In this paper, we propose a novel domain generalization method, which minimizes the KL-divergence between posterior distributions from different domains.
Experiments on four standard benchmark datasets, i.e., Digits-DG, PACS, Office-Home and miniDomainNet, highlight the superior performance of our method.
arXiv Detail & Related papers (2022-10-09T03:41:02Z)
- Relation Matters: Foreground-aware Graph-based Relational Reasoning for Domain Adaptive Object Detection [81.07378219410182]
We propose a new and general framework for domain adaptive object detection (DAOD), named Foreground-aware Graph-based Relational Reasoning (FGRR).
FGRR incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations.
Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art on four domain adaptive object detection benchmarks.
arXiv Detail & Related papers (2022-06-06T05:12:48Z)
- Compound Domain Generalization via Meta-Knowledge Encoding [55.22920476224671]
We introduce Style-induced Domain-specific Normalization (SDNorm) to re-normalize the multi-modal underlying distributions.
We harness the prototype representations, the centroids of classes, to perform relational modeling in the embedding space.
Experiments on four standard Domain Generalization benchmarks reveal that COMEN exceeds state-of-the-art performance without the need for domain supervision.
arXiv Detail & Related papers (2022-03-24T11:54:59Z)
- Towards Principled Disentanglement for Domain Generalization [90.9891372499545]
A fundamental challenge for machine learning models is generalizing to out-of-distribution (OOD) data.
We first formalize the OOD generalization problem as a constrained optimization problem, called Disentanglement-constrained Domain Generalization (DDG).
Based on a transformation of this constrained problem, we propose a primal-dual algorithm for joint representation disentanglement and domain generalization.
arXiv Detail & Related papers (2021-11-27T07:36:32Z)
- Model-Based Domain Generalization [96.84818110323518]
We propose a novel approach for the domain generalization problem called Model-Based Domain Generalization.
Our algorithms beat the current state-of-the-art methods on the very-recently-proposed WILDS benchmark by up to 20 percentage points.
arXiv Detail & Related papers (2021-02-23T00:59:02Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA). We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
- Discriminative Feature Alignment: Improving Transferability of Unsupervised Domain Adaptation by Gaussian-guided Latent Alignment [27.671964294233756]
In this study, we focus on the unsupervised domain adaptation problem where an approximate inference model is to be learned from a labeled data domain.
The success of unsupervised domain adaptation largely relies on the cross-domain feature alignment.
We introduce a Gaussian-guided latent alignment approach to align the latent feature distributions of the two domains under the guidance of the prior distribution.
In such an indirect way, the distributions over the samples from the two domains will be constructed on a common feature space, i.e., the space of the prior.
arXiv Detail & Related papers (2020-06-23T05:33:54Z)
- Bi-Directional Generation for Unsupervised Domain Adaptation [61.73001005378002]
Unsupervised domain adaptation leverages well-established source domain information to handle an unlabeled target domain.
Conventional methods that forcefully reduce the domain discrepancy in the latent space can destroy the intrinsic data structure.
We propose a Bi-Directional Generation domain adaptation model with consistent classifiers interpolating two intermediate domains to bridge source and target domains.
arXiv Detail & Related papers (2020-02-12T09:45:39Z)