A Contrastive Objective for Learning Disentangled Representations
- URL: http://arxiv.org/abs/2203.11284v1
- Date: Mon, 21 Mar 2022 18:56:36 GMT
- Title: A Contrastive Objective for Learning Disentangled Representations
- Authors: Jonathan Kahana, Yedid Hoshen
- Abstract summary: Learning representations of images that are invariant to sensitive or unwanted attributes is important for many tasks, including bias removal and cross-domain retrieval.
We present a new domain-wise contrastive objective for ensuring invariant representations.
In an extensive evaluation, our method convincingly outperforms the state of the art in terms of representation invariance, representation informativeness, and training speed.
- Score: 32.36217153362305
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning representations of images that are invariant to sensitive or
unwanted attributes is important for many tasks including bias removal and
cross-domain retrieval. Here, our objective is to learn representations that
are invariant to the domain (sensitive attribute) for which labels are
provided, while being informative over all other image attributes, which are
unlabeled. We present a new approach based on a domain-wise contrastive
objective that ensures invariant representations. This objective crucially
restricts negative image pairs to be drawn from the same domain, which enforces
domain invariance whereas the standard contrastive objective does not. This
domain-wise objective is insufficient on its own as it suffers from shortcut
solutions resulting in feature suppression. We overcome this issue by a
combination of a reconstruction constraint, image augmentations and
initialization with pre-trained weights. Our analysis shows that the choice of
augmentations is important, and that a misguided choice of augmentations can
harm the invariance and informativeness objectives. In an extensive evaluation,
our method convincingly outperforms the state-of-the-art in terms of
representation invariance, representation informativeness, and training speed.
Furthermore, we find that in some cases our method can achieve excellent
results even without the reconstruction constraint, leading to much faster
and more resource-efficient training.
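The core idea of the abstract, restricting negative pairs to the anchor's own domain so that cross-domain similarity is never penalized, can be illustrated with a small sketch. This is a minimal NumPy illustration of that restriction, not the authors' implementation; the function name, temperature value, and masking details are our own assumptions.

```python
import numpy as np

def domain_wise_info_nce(z1, z2, domains, temperature=0.1):
    """InfoNCE-style loss with negatives restricted to the anchor's domain.

    z1, z2  : (N, D) arrays of L2-normalized embeddings of two augmented
              views of the same N images (row i of z1 matches row i of z2).
    domains : (N,) integer domain labels; for anchor i, only samples j with
              domains[j] == domains[i] may serve as negatives.
    """
    n = z1.shape[0]
    sim = (z1 @ z2.T) / temperature                 # pairwise similarities
    same_domain = domains[:, None] == domains[None, :]
    # Mask out cross-domain pairs so they never act as negatives.
    masked = np.where(same_domain, sim, -np.inf)
    # Numerically stable log-softmax over each anchor's same-domain
    # candidates; the positive is the matching view on the diagonal.
    row_max = masked.max(axis=1, keepdims=True)
    log_prob = (masked - row_max) - np.log(
        np.exp(masked - row_max).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(n), np.arange(n)].mean()
```

One consequence of the masking is visible directly: if every sample sits in its own domain, each anchor's only candidate is its positive, the softmax is degenerate, and the loss collapses to zero, which is exactly the shortcut behavior the abstract says must be counteracted by reconstruction, augmentations, and pre-trained initialization.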
Related papers
- DQE-CIR: Distinctive Query Embeddings through Learnable Attribute Weights and Target Relative Negative Sampling in Composed Image Retrieval [53.482391830683014]
Composed image retrieval (CIR) addresses the task of retrieving a target image by jointly interpreting a reference image and a modification text that specifies the intended change. Most existing methods are still built upon contrastive learning frameworks that treat the ground truth image as the only positive instance and all remaining images as negatives. We propose distinctive query embeddings through learnable attribute weights and target relative negative sampling.
arXiv Detail & Related papers (2026-03-04T13:17:44Z) - Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation [13.948425538725138]
We propose a Pixel-Level Domain Adaptation (PLDA) method to encourage the model in learning pixel-wise domain-invariant features.
We experimentally demonstrate the effectiveness of our approach under a wide range of settings.
arXiv Detail & Related papers (2024-08-04T14:14:54Z) - Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification [60.20318058777603]
Generalizable vehicle re-identification (ReID) seeks to develop models that can adapt to unknown target domains without the need for fine-tuning or retraining.
Previous works have mainly focused on extracting domain-invariant features by aligning data distributions between source domains.
We propose a two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method to solve this unique problem.
arXiv Detail & Related papers (2024-07-10T04:06:39Z) - Regularizing Self-training for Unsupervised Domain Adaptation via Structural Constraints [14.593782939242121]
We propose to incorporate structural cues from auxiliary modalities, such as depth, to regularise conventional self-training objectives.
Specifically, we introduce a contrastive pixel-level objectness constraint that pulls the pixel representations within a region of an object instance closer.
We show that our regularizer significantly improves top performing self-training methods in various UDA benchmarks for semantic segmentation.
arXiv Detail & Related papers (2023-04-29T00:12:26Z) - SGDR: Semantic-guided Disentangled Representation for Unsupervised Cross-modality Medical Image Segmentation [5.090366802287405]
We propose a novel framework, called semantic-guided disentangled representation (SGDR), to extract semantically meaningful features for the segmentation task.
We validated our method on two public datasets and experiment results show that our approach outperforms the state of the art methods on two evaluation metrics by a significant margin.
arXiv Detail & Related papers (2022-03-26T08:31:00Z) - Semi-supervised Semantic Segmentation with Directional Context-aware Consistency [66.49995436833667]
We focus on the semi-supervised segmentation problem where only a small set of labeled data is provided with a much larger collection of totally unlabeled images.
A preferred high-level representation should capture the contextual information while not losing self-awareness.
We present the Directional Contrastive Loss (DC Loss) to accomplish the consistency in a pixel-to-pixel manner.
arXiv Detail & Related papers (2021-06-27T03:42:40Z) - AFAN: Augmented Feature Alignment Network for Cross-Domain Object Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z) - Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images [79.34600869202373]
We show that inductive bias can be learned from a flat collection of unlabeled images, and instantiated as transferable representations among seen and unseen classes.
Specifically, we propose a novel part-based self-supervised representation learning scheme to learn transferable representations.
Our method yields impressive results, outperforming the previous best unsupervised methods by 7.74% and 9.24%.
arXiv Detail & Related papers (2021-05-25T12:22:11Z) - What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances.
Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces.
We use a multi-head network with a shared backbone which captures information across each augmentation and alone outperforms all baselines on downstream tasks.
arXiv Detail & Related papers (2020-08-13T03:02:32Z) - Target Consistency for Domain Adaptation: when Robustness meets Transferability [8.189696720657247]
Learning Invariant Representations has been successfully applied for reconciling a source and a target domain for Unsupervised Domain Adaptation.
We show that the cluster assumption is violated in the target domain despite being maintained in the source domain.
Our new approach results in a significant improvement, on both image classification and segmentation benchmarks.
arXiv Detail & Related papers (2020-06-25T09:13:00Z) - Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning [150.42959029611657]
Domain-aware Visual Bias Eliminating (DVBE) network constructs two complementary visual representations.
For unseen images, we automatically search an optimal semantic-visual alignment architecture.
arXiv Detail & Related papers (2020-03-30T08:17:04Z) - Fairness by Learning Orthogonal Disentangled Representations [50.82638766862974]
We propose a novel disentanglement approach to the invariant representation problem.
We enforce the meaningful representation to be agnostic to sensitive information by entropy.
The proposed approach is evaluated on five publicly available datasets.
arXiv Detail & Related papers (2020-03-12T11:09:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.