Deep Perceptual Similarity is Adaptable to Ambiguous Contexts
- URL: http://arxiv.org/abs/2304.02265v2
- Date: Fri, 12 May 2023 14:04:04 GMT
- Title: Deep Perceptual Similarity is Adaptable to Ambiguous Contexts
- Authors: Gustav Grund Pihlgren, Fredrik Sandin, Marcus Liwicki
- Abstract summary: The concept of image similarity is ambiguous, and images can be similar in one context and not in another.
This work explores the ability of deep perceptual similarity (DPS) metrics to adapt to a given context.
The adapted metrics are evaluated on a perceptual similarity dataset to assess whether adapting to a ranking affects their prior performance.
- Score: 1.6217405839281338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The concept of image similarity is ambiguous, and images can be similar in
one context and not in another. This ambiguity motivates the creation of
metrics for specific contexts. This work explores the ability of deep
perceptual similarity (DPS) metrics to adapt to a given context. DPS metrics
use the deep features of neural networks for comparing images. These metrics
have been successful on datasets that leverage the average human perception in
limited settings. But the question remains whether they can be adapted to specific
similarity contexts. No single metric can suit all similarity contexts, and
previous rule-based metrics are labor-intensive to rewrite for new contexts. On
the other hand, DPS metrics use neural networks that might be retrained for
each context. However, retraining networks takes resources and might ruin
performance on previous tasks. This work examines the adaptability of DPS
metrics by training ImageNet pretrained CNNs to measure similarity according to
given contexts. Contexts are created by randomly ranking six image distortions.
Distortions later in the ranking are considered more disruptive to similarity
when applied to an image for that context. This also gives insight into whether
the pretrained features capture different similarity contexts. The adapted
metrics are evaluated on a perceptual similarity dataset to assess whether
adapting to a ranking affects their prior performance. The findings show that
DPS metrics can be adapted with high performance. While the adapted metrics
have difficulties with the same contexts as baselines, performance is improved
in 99% of cases. Finally, it is shown that the adaptation is not significantly
detrimental to prior performance on perceptual similarity. The implementation
of this work is available online:
https://github.com/LTU-Machine-Learning/Analysis-of-Deep-Perceptual-Loss-Networks
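As a rough sketch of the underlying idea (the authors' actual implementation is in the repository linked above, and is not reproduced here), a DPS metric measures the distance between deep features of a pretrained CNN, and adapting it to a context ranking can be phrased as a margin ranking loss: a distortion ranked as more disruptive must yield a larger distance than one ranked as less disruptive. The AlexNet backbone, margin value, and dummy data below are illustrative assumptions.
```python
# Illustrative sketch of a DPS-style metric and its context adaptation
# (PyTorch); not the authors' exact code.
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained ImageNet CNN whose deep features define the metric.
backbone = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).features

def dps_distance(x, y):
    """Mean squared distance between channel-normalized deep features."""
    fx = F.normalize(backbone(x), dim=1)  # unit-length channel vectors
    fy = F.normalize(backbone(y), dim=1)
    return (fx - fy).pow(2).mean(dim=(1, 2, 3))  # one distance per image pair

def context_ranking_loss(ref, mild, severe, margin=0.05):
    """Hinge loss enforcing the context's ranking: the distortion ranked
    as more disruptive ('severe') must lie farther from the reference."""
    return F.relu(dps_distance(ref, mild) - dps_distance(ref, severe) + margin).mean()

# Adapting the metric to a context then amounts to fine-tuning the backbone.
ref = torch.rand(2, 3, 224, 224)             # dummy reference images
mild = ref + 0.01 * torch.randn_like(ref)    # lightly distorted copies
severe = ref + 0.10 * torch.randn_like(ref)  # heavily distorted copies
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-5)
loss = context_ranking_loss(ref, mild, severe)
loss.backward()
optimizer.step()
```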
Related papers
- CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment [2.3874115898130865]
Image similarity metrics play an important role in computer vision, with applications across image processing and machine learning.
Existing metrics, such as PSNR, MSE, SSIM, ISSM and FSIM, often face limitations in terms of speed, complexity, or sensitivity to small changes in images.
This paper investigates CSIM, a novel image similarity metric that combines real-time performance with sensitivity to subtle image variations.
arXiv Detail & Related papers (2024-10-02T10:46:05Z)
- Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling [58.50618448027103]
Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning.
This paper explores the differences across various CLIP-trained vision backbones.
The proposed method achieves a remarkable accuracy increase of up to 39.1% over the best single backbone.
arXiv Detail & Related papers (2024-05-27T12:59:35Z)
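The ensembling idea above can be sketched generically: learnable softmax weights mix embeddings from several backbones, assuming the embeddings have been projected to a common dimension. All names below are illustrative; this is not the paper's implementation.
```python
# Generic sketch of adaptive backbone ensembling (PyTorch); illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveEnsemble(nn.Module):
    """Mixes per-backbone embeddings with learnable softmax weights.
    Assumes all embeddings share a common dimension."""
    def __init__(self, num_backbones):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_backbones))

    def forward(self, embeddings):  # list of (batch, dim) tensors
        w = F.softmax(self.logits, dim=0)
        # Normalize so no backbone dominates purely via embedding scale.
        normed = [F.normalize(e, dim=-1) for e in embeddings]
        return sum(wi * e for wi, e in zip(w, normed))

# Dummy usage: three backbones, batch of 4, shared 512-dim embeddings.
ensemble = AdaptiveEnsemble(num_backbones=3)
embs = [torch.randn(4, 512) for _ in range(3)]
fused = ensemble(embs)  # (4, 512), usable for downstream tasks
```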
- LipSim: A Provably Robust Perceptual Similarity Metric [56.03417732498859]
We show the vulnerability of state-of-the-art perceptual similarity metrics based on an ensemble of ViT-based feature extractors to adversarial attacks.
We then propose a framework to train a robust perceptual similarity metric called LipSim with provable guarantees.
LipSim provides guarded areas around each data point and certificates for all perturbations within an $\ell_2$ ball.
arXiv Detail & Related papers (2023-10-27T16:59:51Z)
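Certificates of this kind typically follow from the Lipschitz property alone; the following is a generic sketch of such a bound, not LipSim's exact theorem.
```latex
% If the metric d(., y) is 1-Lipschitz in its first argument, then for
% any perturbation \delta:
|d(x + \delta, y) - d(x, y)| \le \|\delta\|_2
% so a two-alternative (2AFC) judgment comparing d(x, y_0) and d(x, y_1)
% cannot flip for any perturbation with
\|\delta\|_2 < \tfrac{1}{2} \, \bigl| d(x, y_0) - d(x, y_1) \bigr|
```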
- A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks [52.09243852066406]
The Adversarial Converging Time Score (ACTS) measures convergence time as an adversarial robustness metric.
We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2023-10-10T09:39:38Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- R-LPIPS: An Adversarially Robust Perceptual Similarity Metric [71.33812578529006]
We propose the Robust Learned Perceptual Image Patch Similarity (R-LPIPS) metric.
R-LPIPS is a new metric that leverages adversarially trained deep features.
We demonstrate the superiority of R-LPIPS compared to the classical LPIPS metric.
arXiv Detail & Related papers (2023-07-27T19:11:31Z)
- DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data [43.247597420676044]
Current perceptual similarity metrics operate at the level of pixels and patches.
These metrics compare images in terms of their low-level colors and textures, but fail to capture mid-level similarities and differences in image layout, object pose, and semantic content.
We develop a perceptual metric that assesses images holistically.
arXiv Detail & Related papers (2023-06-15T17:59:50Z)
- Equivariant Similarity for Vision-Language Foundation Models [134.77524524140168]
This study focuses on the multimodal similarity function that is not only the major training objective but also the core delivery to support downstream tasks.
We propose EqSim, a regularization loss that can be efficiently calculated from any two matched training pairs.
Compared to existing evaluation sets, the accompanying EqBen benchmark is the first to focus on "visual-minimal change".
arXiv Detail & Related papers (2023-03-25T13:22:56Z)
- Identifying and Mitigating Flaws of Deep Perceptual Similarity Metrics [1.484528358552186]
This work investigates the benefits and flaws of the Deep Perceptual Similarity (DPS) metric.
The metrics are analyzed in depth to understand their strengths and weaknesses.
This work contributes new insights into the flaws of DPS and further suggests improvements to the metrics.
arXiv Detail & Related papers (2022-07-06T08:28:39Z)
- Learning an Adaptation Function to Assess Image Visual Similarities [0.0]
We focus here on the specific task of learning visual image similarities when analogy matters.
We propose to compare different supervised, semi-supervised and self-supervised networks, pre-trained on datasets of distinct scales and contents.
Our experiments on the Totally Looks Like image dataset highlight the interest of our method, increasing the @1 retrieval score of the best model by 2.25x.
arXiv Detail & Related papers (2022-06-03T07:15:00Z)
- Adaptive Label Smoothing [1.3198689566654107]
We present a novel approach to classification that combines the ideas of objectness and label smoothing during training.
We show extensive results using ImageNet to demonstrate that CNNs trained using adaptive label smoothing are much less likely to be overconfident in their predictions.
arXiv Detail & Related papers (2020-09-14T13:37:30Z)
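For reference, standard (uniform) label smoothing can be sketched as below; the adaptive variant above varies the smoothing strength per image based on objectness, which this sketch does not reproduce. The function name and epsilon value are illustrative.
```python
# Standard uniform label smoothing for cross-entropy (PyTorch);
# baseline sketch only, not the paper's adaptive method.
import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits, targets, epsilon=0.1):
    """Cross-entropy against (1 - eps) * one_hot + eps * uniform targets."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    uniform = -log_probs.mean(dim=-1)  # equals (1/K) * sum_k(-log p_k)
    return ((1.0 - epsilon) * nll + epsilon * uniform).mean()

# Dummy usage: batch of 8 over 10 classes.
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = smoothed_cross_entropy(logits, targets)
```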