Privacy Assessment on Reconstructed Images: Are Existing Evaluation
Metrics Faithful to Human Perception?
- URL: http://arxiv.org/abs/2309.13038v2
- Date: Mon, 9 Oct 2023 07:56:19 GMT
- Title: Privacy Assessment on Reconstructed Images: Are Existing Evaluation
Metrics Faithful to Human Perception?
- Authors: Xiaoxiao Sun, Nidham Gazagnadou, Vivek Sharma, Lingjuan Lyu, Hongdong
Li, Liang Zheng
- Abstract summary: We study the faithfulness of hand-crafted metrics to human perception of privacy information from reconstructed images.
We propose a learning-based measure called SemSim to evaluate the Semantic Similarity between the original and reconstructed images.
- Score: 86.58989831070426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hand-crafted image quality metrics, such as PSNR and SSIM, are commonly used
to evaluate model privacy risk under reconstruction attacks. Under these
metrics, reconstructed images that are determined to resemble the original one
generally indicate more privacy leakage. Images judged overall dissimilar, on
the other hand, indicate higher robustness against the attack. However, there
is no guarantee that these metrics reflect human opinions well, even though
human judgement is the more trustworthy assessment of model privacy leakage. In this
paper, we comprehensively study the faithfulness of these hand-crafted metrics
to human perception of privacy information from the reconstructed images. On
five datasets, ranging from natural images and faces to fine-grained classes,
we use four existing attack methods to reconstruct images from many different
classification models and, for each reconstructed image, ask multiple human
annotators to assess whether this image is recognizable. Our studies reveal
that the hand-crafted metrics only have a weak correlation with the human
evaluation of privacy leakage and that even these metrics themselves often
contradict each other. These observations expose risks in the metrics the
community currently relies on. To address this risk, we propose a learning-based
measure called SemSim to evaluate the Semantic Similarity between the original
and reconstructed images. SemSim is trained with a standard triplet loss, using
an original image as an anchor, one of its recognizable reconstructed images as
a positive sample, and an unrecognizable one as a negative. By training on
human annotations, SemSim better reflects privacy leakage at the semantic
level. We show that SemSim has a significantly higher correlation
with human judgment compared with existing metrics. Moreover, this strong
correlation generalizes to unseen datasets, models and attack methods.
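The triplet objective described above can be sketched as follows. This is a minimal illustration of the training signal, not the authors' implementation: the embedding network, the toy embedding vectors, and the margin value are all assumed for demonstration. In the actual method, the anchor is the embedding of an original image, the positive is a reconstruction that annotators judged recognizable, and the negative is one judged unrecognizable.

```python
# Illustrative sketch of the SemSim training signal (hypothetical code,
# not the authors' release). A learned network would map images to
# embedding vectors; here we use toy 2-D vectors directly.
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: pull the positive toward the anchor and
    push the negative at least `margin` farther away (squared L2)."""
    d_pos = np.sum((anchor - positive) ** 2)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2)  # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings: the recognizable reconstruction (positive) lies close
# to the original image (anchor); the unrecognizable one lies far away.
anchor   = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
negative = np.array([0.0, 1.0])

loss = triplet_loss(anchor, positive, negative)
# When d_pos + margin < d_neg, the loss is zero: the embedding already
# separates recognizable from unrecognizable reconstructions.
```

At evaluation time, a SemSim-style score would be the embedding distance between an original image and its reconstruction: a smaller distance indicates more semantic-level privacy leakage.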
Related papers
- Five Pitfalls When Assessing Synthetic Medical Images with Reference Metrics [0.9582978458237521]
Reference metrics have been developed to objectively and quantitatively compare two images.
The correlation between reference metrics and human perception of quality can vary strongly across different kinds of distortions.
We selected five pitfalls that showcase unexpected and probably undesired reference metric scores.
arXiv Detail & Related papers (2024-08-12T11:48:57Z)
- How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples [8.483679748399036]
Adversarial examples threaten the safety of AI-based systems such as autonomous vehicles.
In the image domain, they represent maliciously perturbed data points that look benign to humans.
We propose SCOOTER - an evaluation framework for unrestricted image-based attacks.
arXiv Detail & Related papers (2024-04-19T06:42:01Z)
- Introspective Deep Metric Learning [91.47907685364036]
We propose an introspective deep metric learning framework for uncertainty-aware comparisons of images.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling.
arXiv Detail & Related papers (2023-09-11T16:21:13Z)
- Exploring the Robustness of Human Parsers Towards Common Corruptions [99.89886010550836]
We construct three corruption robustness benchmarks, termed LIP-C, ATR-C, and Pascal-Person-Part-C, to assist us in evaluating the risk tolerance of human parsing models.
Inspired by the data augmentation strategy, we propose a novel heterogeneous augmentation-enhanced mechanism to bolster robustness under commonly corrupted conditions.
arXiv Detail & Related papers (2023-09-02T13:32:14Z)
- DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data [43.247597420676044]
Current perceptual similarity metrics operate at the level of pixels and patches.
These metrics compare images in terms of their low-level colors and textures, but fail to capture mid-level similarities and differences in image layout, object pose, and semantic content.
We develop a perceptual metric that assesses images holistically.
arXiv Detail & Related papers (2023-06-15T17:59:50Z)
- Introspective Deep Metric Learning for Image Retrieval [80.29866561553483]
We argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training.
We propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describe the semantic characteristics and the ambiguity of an image, respectively.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets.
arXiv Detail & Related papers (2022-05-09T17:51:44Z)
- Assessing Privacy Risks from Feature Vector Reconstruction Attacks [24.262351521060676]
We develop metrics that meaningfully capture the threat of reconstructed face images.
We show that reconstructed face images enable re-identification by both commercial facial recognition systems and humans.
Our results confirm that feature vectors should be recognized as Personally Identifiable Information.
arXiv Detail & Related papers (2022-02-11T16:52:02Z)
- Assessing a Single Image in Reference-Guided Image Synthesis [14.936460594115953]
We propose a learning-based framework, Reference-guided Image Synthesis Assessment (RISA) to quantitatively evaluate the quality of a single generated image.
As this annotation is too coarse as a supervision signal, we introduce two techniques: 1) a pixel-wise scheme to refine the coarse labels, and 2) multiple binary classifiers to replace a naïve regressor.
RISA is highly consistent with human preference and transfers well across models.
arXiv Detail & Related papers (2021-12-08T08:22:14Z)
- Transparent Human Evaluation for Image Captioning [70.03979566548823]
We develop a rubric-based human evaluation protocol for image captioning models.
We show that human-generated captions show substantially higher quality than machine-generated ones.
We hope that this work will promote a more transparent evaluation protocol for image captioning.
arXiv Detail & Related papers (2021-11-17T07:09:59Z)
- Unsupervised Landmark Learning from Unpaired Data [117.81440795184587]
Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.
We propose a cross-image cycle consistency framework which applies the swapping-reconstruction strategy twice to obtain the final supervision.
Our proposed framework is shown to outperform strong baselines by a large margin.
arXiv Detail & Related papers (2020-06-29T13:57:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.