Deep Perceptual Similarity is Adaptable to Ambiguous Contexts
- URL: http://arxiv.org/abs/2304.02265v2
- Date: Fri, 12 May 2023 14:04:04 GMT
- Title: Deep Perceptual Similarity is Adaptable to Ambiguous Contexts
- Authors: Gustav Grund Pihlgren, Fredrik Sandin, Marcus Liwicki
- Abstract summary: The concept of image similarity is ambiguous, and images can be similar in one context and not in another.
This work explores the ability of deep perceptual similarity (DPS) metrics to adapt to a given context.
The adapted metrics are evaluated on a perceptual similarity dataset to assess whether adapting to a ranking affects their prior performance.
- Score: 1.6217405839281338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The concept of image similarity is ambiguous, and images can be similar in
one context and not in another. This ambiguity motivates the creation of
metrics for specific contexts. This work explores the ability of deep
perceptual similarity (DPS) metrics to adapt to a given context. DPS metrics
use the deep features of neural networks for comparing images. These metrics
have been successful on datasets that leverage the average human perception in
limited settings. But the question remains whether they can be adapted to specific
similarity contexts. No single metric can suit all similarity contexts, and
previous rule-based metrics are labor-intensive to rewrite for new contexts. On
the other hand, DPS metrics use neural networks that might be retrained for
each context. However, retraining networks takes resources and might ruin
performance on previous tasks. This work examines the adaptability of DPS
metrics by training ImageNet pretrained CNNs to measure similarity according to
given contexts. Contexts are created by randomly ranking six image distortions.
Distortions later in the ranking are considered more disruptive to similarity
when applied to an image for that context. This also gives insight into whether
the pretrained features capture different similarity contexts. The adapted
metrics are evaluated on a perceptual similarity dataset to assess whether
adapting to a ranking affects their prior performance. The findings show that
DPS metrics can be adapted with high performance. While the adapted metrics
have difficulties with the same contexts as baselines, performance is improved
in 99% of cases. Finally, it is shown that the adaptation is not significantly
detrimental to prior performance on perceptual similarity. The implementation
of this work is available online:
https://github.com/LTU-Machine-Learning/Analysis-of-Deep-Perceptual-Loss-Networks
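As a rough sketch of the underlying idea (the authors' actual implementation is in the repository linked above, and is not reproduced here), a DPS metric measures the distance between deep features of a pretrained CNN, and adapting it to a context ranking can be phrased as a margin ranking loss: a distortion ranked as more disruptive must yield a larger distance than one ranked as less disruptive. The AlexNet backbone, margin value, and dummy data below are illustrative assumptions.
```python
# Illustrative sketch of a DPS-style metric and its context adaptation
# (PyTorch); not the authors' exact code.
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained ImageNet CNN whose deep features define the metric.
backbone = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1).features

def dps_distance(x, y):
    """Mean squared distance between channel-normalized deep features."""
    fx = F.normalize(backbone(x), dim=1)  # unit-length channel vectors
    fy = F.normalize(backbone(y), dim=1)
    return (fx - fy).pow(2).mean(dim=(1, 2, 3))  # one distance per image pair

def context_ranking_loss(ref, mild, severe, margin=0.05):
    """Hinge loss enforcing the context's ranking: the distortion ranked
    as more disruptive ('severe') must lie farther from the reference."""
    return F.relu(dps_distance(ref, mild) - dps_distance(ref, severe) + margin).mean()

# Adapting the metric to a context then amounts to fine-tuning the backbone.
ref = torch.rand(2, 3, 224, 224)             # dummy reference images
mild = ref + 0.01 * torch.randn_like(ref)    # lightly distorted copies
severe = ref + 0.10 * torch.randn_like(ref)  # heavily distorted copies
optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-5)
loss = context_ranking_loss(ref, mild, severe)
loss.backward()
optimizer.step()
```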
Related papers
- CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment [2.3874115898130865]
Image similarity metrics play an important role in computer vision, with applications across image processing and machine learning.
Existing metrics, such as PSNR, MSE, SSIM, ISSM and FSIM, often face limitations in terms of speed, complexity, or sensitivity to small changes in images.
This paper investigates CSIM, a novel image similarity metric that combines real-time performance with sensitivity to subtle image variations.
arXiv Detail & Related papers (2024-10-02T10:46:05Z)
- Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling [58.50618448027103]
Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning.
This paper explores the differences across various CLIP-trained vision backbones.
The proposed method achieves a remarkable accuracy increase of up to 39.1% over the best single backbone.
arXiv Detail & Related papers (2024-05-27T12:59:35Z)
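The ensembling idea above can be sketched generically: learnable softmax weights mix embeddings from several backbones, assuming the embeddings have been projected to a common dimension. All names below are illustrative; this is not the paper's implementation.
```python
# Generic sketch of adaptive backbone ensembling (PyTorch); illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveEnsemble(nn.Module):
    """Mixes per-backbone embeddings with learnable softmax weights.
    Assumes all embeddings share a common dimension."""
    def __init__(self, num_backbones):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_backbones))

    def forward(self, embeddings):  # list of (batch, dim) tensors
        w = F.softmax(self.logits, dim=0)
        # Normalize so no backbone dominates purely via embedding scale.
        normed = [F.normalize(e, dim=-1) for e in embeddings]
        return sum(wi * e for wi, e in zip(w, normed))

# Dummy usage: three backbones, batch of 4, shared 512-dim embeddings.
ensemble = AdaptiveEnsemble(num_backbones=3)
embs = [torch.randn(4, 512) for _ in range(3)]
fused = ensemble(embs)  # (4, 512), usable for downstream tasks
```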
- LipSim: A Provably Robust Perceptual Similarity Metric [56.03417732498859]
We show the vulnerability of state-of-the-art perceptual similarity metrics based on an ensemble of ViT-based feature extractors to adversarial attacks.
We then propose a framework to train a robust perceptual similarity metric called LipSim with provable guarantees.
LipSim provides guarded areas around each data point and certificates for all perturbations within an $\ell_2$ ball.
arXiv Detail & Related papers (2023-10-27T16:59:51Z)
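Certificates of this kind typically follow from the Lipschitz property alone; the following is a generic sketch of such a bound, not LipSim's exact theorem.
```latex
% If the metric d(., y) is 1-Lipschitz in its first argument, then for
% any perturbation \delta:
|d(x + \delta, y) - d(x, y)| \le \|\delta\|_2
% so a two-alternative (2AFC) judgment comparing d(x, y_0) and d(x, y_1)
% cannot flip for any perturbation with
\|\delta\|_2 < \tfrac{1}{2} \, \bigl| d(x, y_0) - d(x, y_1) \bigr|
```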
- A Geometrical Approach to Evaluate the Adversarial Robustness of Deep Neural Networks [52.09243852066406]
The Adversarial Converging Time Score (ACTS) measures convergence time as an adversarial robustness metric.
We validate the effectiveness and generalization of the proposed ACTS metric against different adversarial attacks on the large-scale ImageNet dataset.
arXiv Detail & Related papers (2023-10-10T09:39:38Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- R-LPIPS: An Adversarially Robust Perceptual Similarity Metric [71.33812578529006]
We propose the Robust Learned Perceptual Image Patch Similarity (R-LPIPS) metric.
R-LPIPS is a new metric that leverages adversarially trained deep features.
We demonstrate the superiority of R-LPIPS compared to the classical LPIPS metric.
arXiv Detail & Related papers (2023-07-27T19:11:31Z)
- DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data [43.247597420676044]
Current perceptual similarity metrics operate at the level of pixels and patches.
These metrics compare images in terms of their low-level colors and textures, but fail to capture mid-level similarities and differences in image layout, object pose, and semantic content.
We develop a perceptual metric that assesses images holistically.
arXiv Detail & Related papers (2023-06-15T17:59:50Z)
- Equivariant Similarity for Vision-Language Foundation Models [134.77524524140168]
This study focuses on the multimodal similarity function that is not only the major training objective but also the core delivery to support downstream tasks.
We propose EqSim, a regularization loss that can be efficiently calculated from any two matched training pairs.
Compared to existing evaluation sets, the accompanying EqBen benchmark is the first to focus on "visual-minimal change".
arXiv Detail & Related papers (2023-03-25T13:22:56Z)
- Identifying and Mitigating Flaws of Deep Perceptual Similarity Metrics [1.484528358552186]
This work investigates the benefits and flaws of the Deep Perceptual Similarity (DPS) metric.
The metrics are analyzed in depth to understand their strengths and weaknesses.
This work contributes new insights into the flaws of DPS and further suggests improvements to the metrics.
arXiv Detail & Related papers (2022-07-06T08:28:39Z)
- Learning an Adaptation Function to Assess Image Visual Similarities [0.0]
We focus here on the specific task of learning visual image similarities when analogy matters.
We propose to compare different supervised, semi-supervised and self-supervised networks, pre-trained on datasets of distinct scales and contents.
Our experiments on the Totally Looks Like image dataset highlight the interest of our method, increasing the @1 retrieval score of the best model by 2.25x.
arXiv Detail & Related papers (2022-06-03T07:15:00Z)
- Adaptive Label Smoothing [1.3198689566654107]
We present a novel approach to classification that combines the ideas of objectness and label smoothing during training.
We show extensive results using ImageNet to demonstrate that CNNs trained using adaptive label smoothing are much less likely to be overconfident in their predictions.
arXiv Detail & Related papers (2020-09-14T13:37:30Z)
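For reference, standard (uniform) label smoothing can be sketched as below; the adaptive variant above varies the smoothing strength per image based on objectness, which this sketch does not reproduce. The function name and epsilon value are illustrative.
```python
# Standard uniform label smoothing for cross-entropy (PyTorch);
# baseline sketch only, not the paper's adaptive method.
import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits, targets, epsilon=0.1):
    """Cross-entropy against (1 - eps) * one_hot + eps * uniform targets."""
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    uniform = -log_probs.mean(dim=-1)  # equals (1/K) * sum_k(-log p_k)
    return ((1.0 - epsilon) * nll + epsilon * uniform).mean()

# Dummy usage: batch of 8 over 10 classes.
logits = torch.randn(8, 10)
targets = torch.randint(0, 10, (8,))
loss = smoothed_cross_entropy(logits, targets)
```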