Invariant Representation via Decoupling Style and Spurious Features from Images
- URL: http://arxiv.org/abs/2312.06226v2
- Date: Mon, 1 Apr 2024 06:57:31 GMT
- Title: Invariant Representation via Decoupling Style and Spurious Features from Images
- Authors: Ruimeng Li, Yuanhao Pu, Zhaoyi Li, Hong Xie, Defu Lian,
- Abstract summary: This paper considers the out-of-distribution (OOD) generalization problem under the setting that both style distribution shift and spurious features exist and domain labels are missing.
We propose a structural causal model (SCM) for the image generation process, which captures both style distribution shift and spurious features.
The proposed SCM enables us to design a new framework called IRSS, which can gradually separate style distribution and spurious features from images.
- Score: 27.965593857283316
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper considers the out-of-distribution (OOD) generalization problem under the setting that both style distribution shift and spurious features exist and domain labels are missing. This setting frequently arises in real-world applications and is underlooked because previous approaches mainly handle either of these two factors. The critical challenge is decoupling style and spurious features in the absence of domain labels. To address this challenge, we first propose a structural causal model (SCM) for the image generation process, which captures both style distribution shift and spurious features. The proposed SCM enables us to design a new framework called IRSS, which can gradually separate style distribution and spurious features from images by introducing adversarial neural networks and multi-environment optimization, thus achieving OOD generalization. Moreover, it does not require additional supervision (e.g., domain labels) other than the images and their corresponding labels. Experiments on benchmark datasets demonstrate that IRSS outperforms traditional OOD methods and solves the problem of Invariant risk minimization (IRM) degradation, enabling the extraction of invariant features under distribution shift.
Related papers
- Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion [37.18537753482751]
Conditional Diffusion Relaxing Inversion (CRDI) is designed to enhance distribution diversity in synthetic image generation.
CRDI does not rely on fine-tuning based on only a few samples.
It focuses on reconstructing each target image instance and expanding diversity through few-shot learning.
arXiv Detail & Related papers (2024-07-09T21:58:26Z) - Graphs Generalization under Distribution Shifts [11.963958151023732]
We introduce a novel framework, namely Graph Learning Invariant Domain genERation (GLIDER)
Our model outperforms baseline methods on node-level OOD generalization across domains in distribution shift on node features and topological structures simultaneously.
arXiv Detail & Related papers (2024-03-25T00:15:34Z) - Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal
Approach [51.012396632595554]
Invariant representation learning (IRL) encourages the prediction from invariant causal features to labels de-confounded from the environments.
Recent theoretical results verified that some causal features recovered by IRLs merely pretend domain-invariantly in the training environments but fail in unseen domains.
We develop an approach based on conditional mutual information with respect to RS-SCM, then rigorously rectify the spurious and fake invariant effects.
arXiv Detail & Related papers (2023-12-15T12:58:05Z) - Exploring Invariant Representation for Visible-Infrared Person
Re-Identification [77.06940947765406]
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces a main challenge of the modality discrepancy.
In this paper, we address the problem from both image-level and feature-level in an end-to-end hybrid learning framework named robust feature mining network (RFM)
Experiment results on two standard cross-spectral person re-identification datasets, RegDB and SYSU-MM01, have demonstrated state-of-the-art performance.
arXiv Detail & Related papers (2023-02-02T05:24:50Z) - Hierarchical Similarity Learning for Aliasing Suppression Image
Super-Resolution [64.15915577164894]
A hierarchical image super-resolution network (HSRNet) is proposed to suppress the influence of aliasing.
HSRNet achieves better quantitative and visual performance than other works, and remits the aliasing more effectively.
arXiv Detail & Related papers (2022-06-07T14:55:32Z) - BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z) - Blind Image Super-Resolution via Contrastive Representation Learning [41.17072720686262]
We design a contrastive representation learning network that focuses on blind SR of images with multi-modal and spatially variant distributions.
We show that the proposed CRL-SR can handle multi-modal and spatially variant degradation effectively under blind settings.
It also outperforms state-of-the-art SR methods qualitatively and quantitatively.
arXiv Detail & Related papers (2021-07-01T19:34:23Z) - Few-Shot Domain Expansion for Face Anti-Spoofing [28.622220790439055]
Face anti-spoofing (FAS) is an indispensable and widely used module in face recognition systems.
We identify and address a more practical problem: Few-Shot Domain Expansion for Face Anti-Spoofing (FSDE-FAS)
arXiv Detail & Related papers (2021-06-27T07:38:50Z) - Style Normalization and Restitution for DomainGeneralization and
Adaptation [88.86865069583149]
An effective domain generalizable model is expected to learn feature representations that are both generalizable and discriminative.
In this paper, we design a novel Style Normalization and Restitution module (SNR) to ensure both high generalization and discrimination capability of the networks.
arXiv Detail & Related papers (2021-01-03T09:01:39Z) - Multi-Scale Cascading Network with Compact Feature Learning for
RGB-Infrared Person Re-Identification [35.55895776505113]
Multi-Scale Part-Aware Cascading framework (MSPAC) is formulated by aggregating multi-scale fine-grained features from part to global.
Cross-modality correlations can thus be efficiently explored on salient features for distinctive modality-invariant feature learning.
arXiv Detail & Related papers (2020-12-12T15:39:11Z) - Deep Variational Network Toward Blind Image Restoration [60.45350399661175]
Blind image restoration is a common yet challenging problem in computer vision.
We propose a novel blind image restoration method, aiming to integrate both the advantages of them.
Experiments on two typical blind IR tasks, namely image denoising and super-resolution, demonstrate that the proposed method achieves superior performance over current state-of-the-arts.
arXiv Detail & Related papers (2020-08-25T03:30:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.