Phase Matching for Out-of-Distribution Generalization
- URL: http://arxiv.org/abs/2307.12622v6
- Date: Wed, 28 Aug 2024 03:45:08 GMT
- Title: Phase Matching for Out-of-Distribution Generalization
- Authors: Chengming Hu, Yeqian Du, Rui Wang, Hao Chen, Congcong Zhu,
- Abstract summary: This paper is dedicated to clarifying the relationships between Domain Generalization (DG) and the frequency components.
We propose a Phase Matching approach, termed PhaMa, to address DG problems.
Experiments on multiple benchmarks demonstrate that our proposed method achieves state-of-the-art performance.
- Score: 9.786356781007122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Fourier transform, an explicit decomposition method for visual signals, has been employed to explain the out-of-distribution generalization behaviors of Deep Neural Networks (DNNs). Previous studies indicate that the amplitude spectrum is susceptible to the disturbance caused by distribution shifts, whereas the phase spectrum preserves highly-structured spatial information that is crucial for robust visual representation learning. Inspired by this insight, this paper is dedicated to clarifying the relationships between Domain Generalization (DG) and the frequency components. Specifically, we provide distribution analysis and empirical experiments for the frequency components. Based on these observations, we propose a Phase Matching approach, termed PhaMa, to address DG problems. To this end, PhaMa introduces perturbations on the amplitude spectrum and establishes spatial relationships to match the phase components with patch contrastive learning. Experiments on multiple benchmarks demonstrate that our proposed method achieves state-of-the-art performance in domain generalization and out-of-distribution robustness tasks. Beyond vanilla analysis and experiments, we further clarify the relationships between the Fourier components and DG problems by introducing a Fourier-based Structural Causal Model (SCM).
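The abstract's core idea, perturbing the amplitude spectrum while leaving the phase spectrum intact, can be illustrated with a minimal NumPy sketch. This is not the authors' PhaMa implementation: the function names (`split_spectrum`, `merge_spectrum`, `perturb_amplitude`), the linear amplitude interpolation, and the mixing weight `lam` are assumptions chosen for illustration, and the patch contrastive matching step is omitted entirely.

```python
import numpy as np


def split_spectrum(image):
    """Return (amplitude, phase) of the 2-D FFT taken over the two leading axes."""
    spectrum = np.fft.fft2(image, axes=(0, 1))
    return np.abs(spectrum), np.angle(spectrum)


def merge_spectrum(amplitude, phase):
    """Rebuild a real-valued image from an amplitude and a phase spectrum."""
    spectrum = amplitude * np.exp(1j * phase)
    # The imaginary part is only rounding error when the spectrum is Hermitian.
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))


def perturb_amplitude(image, reference, lam):
    """Interpolate the amplitude of `image` toward that of `reference`.

    The phase of `image` is kept unchanged. `lam` in [0, 1] is a hypothetical
    mixing weight: 0 returns the original image, 1 swaps the amplitude fully.
    """
    amp, pha = split_spectrum(image)
    ref_amp, _ = split_spectrum(reference)
    mixed_amp = (1.0 - lam) * amp + lam * ref_amp
    return merge_spectrum(mixed_amp, pha)


# Random arrays stand in for two H x W x C images from different domains.
rng = np.random.default_rng(0)
x = rng.random((32, 32, 3))
y = rng.random((32, 32, 3))
x_aug = perturb_amplitude(x, y, lam=0.5)
```

In this sketch `x_aug` carries the phase spectrum of `x` and a blend of both amplitude spectra, so `x` and `x_aug` could serve as a pair whose phase-related structure a model is encouraged to match.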
Related papers
- A spectral method for multi-view subspace learning using the product of projections [0.16385815610837165]
We provide an easy-to-use and scalable estimation algorithm for multi-view data.
In particular, we employ rotational bootstrap and random matrix theory to partition the observed spectrum into joint, individual, and noise subspaces.
In simulations, our method estimates joint and individual subspaces more accurately than existing approaches.
arXiv Detail & Related papers (2024-10-24T19:51:55Z)
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to the challenges posed by non-IID conditions are proposed, based on an analysis of their properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement [58.9768112704998]
Disentangled representation learning strives to extract the intrinsic factors within observed data.
We introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias.
This is the first work to reveal the potent disentanglement capability of diffusion models with cross-attention, requiring no complex designs.
arXiv Detail & Related papers (2024-02-15T05:07:54Z)
- Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective [31.030338985431722]
We explore the expressivity and trainability of the Fourier Neural Operator (FNO).
We examine the ordered-chaos phase transition of the network based on the weight distribution.
We find a connection between expressivity and trainability: the ordered and chaotic phases correspond to regions of vanishing and exploding gradients.
arXiv Detail & Related papers (2023-10-10T07:43:41Z)
- HoloNets: Spectral Convolutions do extend to Directed Graphs [59.851175771106625]
Conventional wisdom dictates that spectral convolutional networks may only be deployed on undirected graphs.
Here we show this traditional reliance on the graph Fourier transform to be superfluous.
We provide a frequency-response interpretation of newly developed filters, investigate the influence of the basis used to express filters and discuss the interplay with characteristic operators on which networks are based.
arXiv Detail & Related papers (2023-10-03T17:42:09Z)
- A Fourier-based Framework for Domain Generalization [82.54650565298418]
Domain generalization aims to learn transferable knowledge from multiple source domains in order to generalize to unseen target domains.
This paper introduces a novel Fourier-based perspective for domain generalization.
Experiments on three benchmarks demonstrate that the proposed method achieves state-of-the-art performance for domain generalization.
arXiv Detail & Related papers (2021-05-24T06:50:30Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability is still a lingering concern of generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that achieves better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss provide significant improvement on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
- Theory inspired deep network for instantaneous-frequency extraction and signal components recovery from discrete blind-source data [1.6758573326215689]
This paper is concerned with the inverse problem of recovering the unknown signal components, along with extraction of their frequencies.
None of the existing decomposition methods and algorithms is capable of solving this inverse problem.
We propose a synthesis of a deep neural network based directly on a discrete sample set of the blind-source signal, which may be non-uniformly sampled.
arXiv Detail & Related papers (2020-01-31T18:54:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.