Modulated Contrast for Versatile Image Synthesis
- URL: http://arxiv.org/abs/2203.09333v1
- Date: Thu, 17 Mar 2022 14:03:46 GMT
- Title: Modulated Contrast for Versatile Image Synthesis
- Authors: Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Rongliang Wu, Shijian Lu
- Abstract summary: MoNCE is a versatile metric that introduces image contrast to learn a calibrated measure of multifaceted inter-image distances.
We introduce optimal transport in MoNCE to modulate the pushing force of negative samples collaboratively across multiple contrastive objectives.
- Score: 60.304183493234376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Perceiving the similarity between images has been a long-standing and
fundamental problem underlying various visual generation tasks. Predominant
approaches measure the inter-image distance by computing pointwise absolute
deviations, which tends to estimate the median of instance distributions and
leads to blurs and artifacts in the generated images. This paper presents
MoNCE, a versatile metric that introduces image contrast to learn a calibrated
measure of multifaceted inter-image distances. Unlike vanilla
contrast which indiscriminately pushes negative samples from the anchor
regardless of their similarity, we propose to re-weight the pushing force of
negative samples adaptively according to their similarity to the anchor, which
facilitates contrastive learning from informative negative samples. Since
multiple patch-level contrastive objectives are involved in image distance
measurement, we introduce optimal transport in MoNCE to modulate the pushing
force of negative samples collaboratively across multiple contrastive
objectives. Extensive experiments over multiple image translation tasks show
that the proposed MoNCE outperforms various prevailing metrics substantially.
The code is available at https://github.com/fnzhan/MoNCE.
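The core idea can be summarized in a short sketch. Below is a minimal, illustrative rendition of a re-weighted patch-level InfoNCE loss, assuming pre-extracted patch embeddings; the softmax-based weights stand in for the optimal-transport modulation MoNCE actually derives (see the linked repository for the authors' implementation), and all names and hyperparameters here are assumptions.

```python
import torch
import torch.nn.functional as F

def monce_style_loss(anchor, positive, negatives, tau=0.07, beta=1.0):
    """Re-weighted patch-level InfoNCE: a simplified stand-in for MoNCE.

    anchor:    (N, D) patch embeddings of the generated image
    positive:  (N, D) corresponding patch embeddings of the target image
    negatives: (N, M, D) non-corresponding patches serving as negatives
    """
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    pos_logit = (anchor * positive).sum(-1, keepdim=True) / tau       # (N, 1)
    neg_logits = torch.einsum('nd,nmd->nm', anchor, negatives) / tau  # (N, M)

    # Re-weight the pushing force: negatives more similar to the anchor get
    # larger weights. MoNCE derives these weights jointly across all patch
    # objectives via optimal transport; a per-anchor softmax is used here as
    # a simple local approximation.
    weights = F.softmax(beta * neg_logits.detach(), dim=-1) * neg_logits.size(-1)

    # exp(logit + log w) = w * exp(logit), so adding log-weights to the
    # negative logits realizes the weighted denominator of the loss.
    logits = torch.cat([pos_logit, neg_logits + weights.log()], dim=1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```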
Related papers
- Adaptive Multi-head Contrastive Learning [44.163227964513695]
In contrastive learning, two views of an original image, generated by different augmentations, are considered a positive pair.
Typically, a single similarity measure, provided by a lone projection head, evaluates all positive and negative sample pairs.
Our approach, Adaptive Multi-Head Contrastive Learning (AMCL), can be applied to and experimentally enhances several popular contrastive learning methods.
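A rough sketch of the multi-head idea, under assumed shapes and an assumed combination rule (an average of per-head similarities with learnable temperatures); the paper's exact formulation may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSimilarity(nn.Module):
    """Illustrative multi-head similarity; not AMCL's exact formulation."""

    def __init__(self, dim, proj_dim=128, num_heads=4):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, proj_dim))
            for _ in range(num_heads)
        )
        # A learnable temperature per head lets each head adapt how sharply
        # its similarity contributes to the combined measure.
        self.log_tau = nn.Parameter(torch.zeros(num_heads))

    def forward(self, x1, x2):
        sims = []
        for head, log_tau in zip(self.heads, self.log_tau):
            z1 = F.normalize(head(x1), dim=-1)
            z2 = F.normalize(head(x2), dim=-1)
            sims.append((z1 * z2).sum(-1) / log_tau.exp())
        return torch.stack(sims).mean(0)  # combine heads by averaging
```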
arXiv Detail & Related papers (2023-10-09T11:08:34Z)
- Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation [15.411325887412413]
This paper proposes a novel model named "Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment" (FSA-CDM).
FSA-CDM introduces contrastive positive/negative samples into the diffusion model to boost performance for markup-to-image generation.
Experiments are conducted on four benchmark datasets from different domains.
arXiv Detail & Related papers (2023-08-02T13:43:03Z)
- Synthetic Hard Negative Samples for Contrastive Learning [8.776888865665024]
This paper proposes a novel feature-level method, namely sampling synthetic hard negative samples for contrastive learning (SSCL).
We generate more, and harder, negative samples by mixing existing negatives, and then sample them by controlling the contrast between the anchor and the other negative samples.
Our proposed method improves the classification performance on different image datasets and can be readily integrated into existing methods.
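A minimal sketch of the mixing step, assuming normalized embeddings; reducing the sampling-by-contrast control to a top-k hardest selection is a simplification of the paper's scheme.

```python
import torch
import torch.nn.functional as F

def synthesize_hard_negatives(anchor, negatives, num_synth=16, alpha=0.5):
    """SSCL-style sketch: mix pairs of hard negatives to create harder ones.

    anchor:    (D,)   anchor embedding
    negatives: (M, D) existing negative embeddings
    """
    anchor = F.normalize(anchor, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Rank negatives by similarity to the anchor; the most similar are hardest.
    k = min(num_synth, negatives.size(0))
    hard_idx = (negatives @ anchor).topk(k).indices

    # Interpolating pairs of hard negatives in embedding space tends to land
    # close to the anchor, yielding synthetic, even harder negatives.
    i = hard_idx[torch.randperm(k)]
    j = hard_idx[torch.randperm(k)]
    synth = F.normalize(alpha * negatives[i] + (1 - alpha) * negatives[j], dim=-1)
    return torch.cat([negatives, synth], dim=0)
```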
arXiv Detail & Related papers (2023-04-06T09:54:35Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method that uses masked images as counterfactual samples to improve the robustness of the fine-tuned model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
- Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation [12.754320302262533]
We introduce a new negative Pruning technique for Unpaired image-to-image Translation (PUT) that sparsifies and ranks the patches.
The proposed algorithm is efficient and flexible, and enables the model to stably learn the essential information shared between corresponding patches.
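A sketch of the pruning step under assumed tensors: rank candidate negative patches by similarity to the query patch and keep only the informative head of the ranking; the keep criterion here is an assumption, not the paper's exact rule.

```python
import torch
import torch.nn.functional as F

def prune_patch_negatives(query, patches, keep_ratio=0.25):
    """PUT-style sketch: sparsify negatives by ranking candidate patches.

    query:   (D,)   embedding of the query patch
    patches: (M, D) embeddings of candidate negative patches
    """
    sim = F.normalize(patches, dim=-1) @ F.normalize(query, dim=-1)
    k = max(1, int(keep_ratio * patches.size(0)))
    # Discarding the dissimilar (easy) tail keeps the contrastive objective
    # focused on the negatives that actually carry gradient signal.
    keep = sim.topk(k).indices
    return patches[keep]
```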
arXiv Detail & Related papers (2022-04-23T08:31:18Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning.
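A sketch of the background-mixup augmentation, assuming an object mask is already available (e.g., from ContraCAM's self-supervised localization); the blending rule is illustrative.

```python
def background_mixup(image, background, object_mask):
    """Swap the background behind a localized object (illustrative sketch).

    image:       (3, H, W) tensor containing the object
    background:  (3, H, W) tensor providing the replacement background
    object_mask: (1, H, W) soft mask in [0, 1], near 1 on the object
    """
    # Keeping the object while replacing its context discourages the encoder
    # from leaning on background cues, the contextual bias the paper targets.
    return object_mask * image + (1 - object_mask) * background
```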
arXiv Detail & Related papers (2021-07-30T19:24:07Z)
- Contrastive Attraction and Contrastive Repulsion for Representation Learning [131.72147978462348]
Contrastive learning (CL) methods learn data representations in a self-supervised manner, where the encoder contrasts each positive sample against multiple negative samples.
Recent CL methods have achieved promising results when pretrained on large-scale datasets, such as ImageNet.
We propose a doubly CL strategy that separately compares positive and negative samples within their own groups, and then proceeds with a contrast between positive and negative groups.
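A rough sketch of the doubly contrastive strategy as described: weight comparisons within the positive and negative groups separately, then contrast the group aggregates; the softmax weighting and group statistics here are assumptions.

```python
import torch
import torch.nn.functional as F

def doubly_contrastive_loss(anchor, positives, negatives, tau=0.1):
    """Group-wise contrast sketch: within-group weighting, then group vs. group.

    anchor: (D,)   positives: (P, D)   negatives: (M, D)
    """
    a = F.normalize(anchor, dim=-1)
    pos_sim = F.normalize(positives, dim=-1) @ a  # (P,)
    neg_sim = F.normalize(negatives, dim=-1) @ a  # (M,)

    # Within-group comparison: emphasize the positives that are farthest
    # (pull them harder) and the negatives that are closest (push them harder).
    w_pos = F.softmax(-pos_sim / tau, dim=0)
    w_neg = F.softmax(neg_sim / tau, dim=0)

    # Contrast between the weighted group aggregates.
    pos_group = (w_pos * pos_sim).sum()
    neg_group = (w_neg * neg_sim).sum()
    return -torch.log(torch.sigmoid((pos_group - neg_group) / tau))
```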
arXiv Detail & Related papers (2021-05-08T17:25:08Z)
- DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network [70.12848483302915]
Conditional generative adversarial networks (cGANs) aim to generate diverse images given input conditions and latent codes.
The recent MSGAN tries to encourage diversity among the generated images but considers only "negative" relations between image pairs.
We propose a novel DivCo framework to properly constrain both "positive" and "negative" relations between the generated images specified in the latent space.
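A sketch of the latent-space contrast: an image generated from latent code z is attracted to one generated from a nearby code and repelled from images generated from distant codes. The shapes and InfoNCE form are assumptions.

```python
import torch
import torch.nn.functional as F

def divco_style_loss(feat_anchor, feat_pos, feat_negs, tau=0.07):
    """DivCo-style sketch: contrast generated images by latent distance.

    feat_anchor: (D,)   features of the image generated from latent z
    feat_pos:    (D,)   features of the image from a nearby latent z + eps
    feat_negs:   (M, D) features of images from distant latent codes
    """
    a = F.normalize(feat_anchor, dim=-1)
    pos = (F.normalize(feat_pos, dim=-1) @ a) / tau
    negs = (F.normalize(feat_negs, dim=-1) @ a) / tau
    # Attracting nearby-latent outputs and repelling distant-latent outputs
    # penalizes mode collapse: distant codes must yield visibly different images.
    logits = torch.cat([pos.view(1), negs]).view(1, -1)
    return F.cross_entropy(logits, torch.zeros(1, dtype=torch.long))
```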
arXiv Detail & Related papers (2021-03-14T11:11:15Z)
- Conditional Negative Sampling for Contrastive Learning of Visual Representations [19.136685699971864]
We show that choosing difficult negatives, or those more similar to the current instance, can yield stronger representations.
We introduce a family of mutual information estimators that sample negatives conditionally -- in a "ring" around each positive.
We prove that these estimators lower-bound mutual information, with higher bias but lower variance than NCE.
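A sketch of ring sampling under assumed percentile bounds: negatives are drawn only from a band of similarity around the anchor, excluding both trivially easy candidates and near-duplicates.

```python
import torch
import torch.nn.functional as F

def ring_negatives(anchor, candidates, lower_pct=0.7, upper_pct=0.95):
    """Sample negatives from a similarity 'ring' around the anchor (sketch).

    anchor:     (D,)   anchor embedding
    candidates: (M, D) candidate negative embeddings
    """
    sim = F.normalize(candidates, dim=-1) @ F.normalize(anchor, dim=-1)
    lo = torch.quantile(sim, lower_pct)  # floor: drop easy negatives
    hi = torch.quantile(sim, upper_pct)  # ceiling: drop near-duplicates
    mask = (sim >= lo) & (sim <= hi)
    return candidates[mask]
```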
arXiv Detail & Related papers (2020-10-05T14:17:32Z)
- Delving into Inter-Image Invariance for Unsupervised Visual Representations [108.33534231219464]
We present a study to better understand the role of inter-image invariance learning.
Online labels converge faster than offline labels.
Semi-hard negative samples are more reliable and unbiased than hard negative samples.
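A sketch of semi-hard mining in its common form: among candidates less similar to the anchor than the positive is, pick the most similar ones; the paper's observation is that these are more reliable than the globally hardest negatives.

```python
import torch
import torch.nn.functional as F

def semi_hard_negatives(anchor, positive, candidates, k=10):
    """Pick semi-hard negatives: hardest among those easier than the positive.

    anchor, positive: (D,)   candidates: (M, D)
    """
    a = F.normalize(anchor, dim=-1)
    pos_sim = F.normalize(positive, dim=-1) @ a
    sim = F.normalize(candidates, dim=-1) @ a
    # Semi-hard: similarity below the positive's (so likely a true negative),
    # yet as high as possible (so the sample is still informative).
    semi = torch.where(sim < pos_sim, sim, torch.full_like(sim, -float('inf')))
    idx = semi.topk(min(k, candidates.size(0))).indices
    return candidates[idx]
```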
arXiv Detail & Related papers (2020-08-26T17:44:23Z)