Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash
- URL: http://arxiv.org/abs/2111.06628v5
- Date: Tue, 16 Jul 2024 06:48:41 GMT
- Title: Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash
- Authors: Lukas Struppek, Dominik Hintersdorf, Daniel Neider, Kristian Kersting
- Abstract summary: Apple recently revealed its deep perceptual hashing system NeuralHash to detect child sexual abuse material.
Public criticism arose regarding the protection of user privacy and the system's reliability.
We show that current deep perceptual hashing may not be robust.
- Score: 29.722113621868978
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Apple recently revealed its deep perceptual hashing system NeuralHash to detect child sexual abuse material (CSAM) on user devices before files are uploaded to its iCloud service. Public criticism quickly arose regarding the protection of user privacy and the system's reliability. In this paper, we present the first comprehensive empirical analysis of deep perceptual hashing based on NeuralHash. Specifically, we show that current deep perceptual hashing may not be robust. An adversary can manipulate the hash values by applying slight changes in images, either induced by gradient-based approaches or simply by performing standard image transformations, forcing or preventing hash collisions. Such attacks permit malicious actors easily to exploit the detection system: from hiding abusive material to framing innocent users, everything is possible. Moreover, using the hash values, inferences can still be made about the data stored on user devices. In our view, based on our results, deep perceptual hashing in its current form is generally not ready for robust client-side scanning and should not be used from a privacy perspective.
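The gradient-based manipulation described in the abstract can be illustrated with a toy model. The sketch below is not NeuralHash (which is a CNN followed by a projection) but a stand-in linear "perceptual hash"; the names, sizes, and step count are all illustrative assumptions. It shows the core idea: because the hash is the sign of a differentiable pre-activation, an adversary can follow gradients to flip hash bits with a small input change.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a deep perceptual hash: a fixed random projection
# followed by a sign step, giving a 16-bit hash. NeuralHash itself is a
# CNN plus a projection; this linear toy only illustrates the
# gradient-based manipulation the paper describes.
W = rng.normal(size=(16, 64)) / 8.0

def perceptual_hash(x):
    return (W @ x > 0).astype(int)

x = rng.normal(size=64)          # the "image", flattened
original = perceptual_hash(x)
target = 1 - 2 * original        # +1/-1 encoding of the fully flipped hash

# Gradient ascent on the pre-sign activations: push each not-yet-flipped
# bit toward its target sign, moving the input only slightly per step.
adv = x.copy()
for _ in range(400):
    active = (target * (W @ adv) < 0.5)      # bits still lacking margin
    grad = W.T @ (target * active)
    adv += 0.05 * grad

changed = int((perceptual_hash(adv) != original).sum())
print(f"hash bits flipped: {changed}/16")
print(f"perturbation L2 norm: {np.linalg.norm(adv - x):.2f}")
```

The same mechanics work in both directions: ascending toward a chosen code forces a collision (framing), while descending away from a database code evades detection (hiding).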
Related papers
- Protecting Onion Service Users Against Phishing [1.6435014180036467]
Phishing websites are a common phenomenon among Tor onion services.
Phishers exploit the fact that it is tremendously difficult to distinguish phishing from authentic onion domain names.
Operators of onion services devised several strategies to protect their users against phishing.
None protect users against phishing without producing traces about visited services.
arXiv Detail & Related papers (2024-08-14T19:51:30Z)
- ConceptHash: Interpretable Fine-Grained Hashing via Concept Discovery [128.30514851911218]
ConceptHash is a novel method that achieves sub-code level interpretability.
In ConceptHash, each sub-code corresponds to a human-understandable concept, such as an object part.
We incorporate language guidance to ensure that the learned hash codes are distinguishable within fine-grained object classes.
arXiv Detail & Related papers (2024-06-12T17:49:26Z)
- Exploiting and Defending Against the Approximate Linearity of Apple's NeuralHash [5.3888140834268246]
Apple's NeuralHash aims to detect the presence of illegal content on users' devices without compromising consumer privacy.
We make the surprising discovery that NeuralHash is approximately linear, which inspires the development of novel black-box attacks.
We propose a simple fix using classical cryptographic standards.
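A toy numpy sketch of why approximate linearity is dangerous: if the embedding behaves like a linear map, an attacker who can only query it can reconstruct that map and then solve directly for a perturbation hitting any target embedding. The oracle here is exactly linear by construction (an assumption for illustration; the paper's point is that NeuralHash's embedding is close enough to linear for similar attacks to work).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "black-box" embedding that is exactly linear; the paper observes
# that NeuralHash's embedding behaves approximately this way.
M_secret = rng.normal(size=(8, 32))

def embed(x):
    # Black-box oracle: we may query it but never read M_secret.
    return M_secret @ x

# Step 1: recover the linear map from queries via finite differences.
d = 32
base = embed(np.zeros(d))
M_est = np.column_stack([embed(np.eye(d)[i]) - base for i in range(d)])

# Step 2: solve for a perturbation that drives the embedding to any
# chosen target, e.g. another image's embedding, forcing a collision.
x = rng.normal(size=d)
target = rng.normal(size=8)                  # desired embedding
delta = np.linalg.lstsq(M_est, target - embed(x), rcond=None)[0]

print(np.allclose(embed(x + delta), target))  # True for a linear oracle
```

With a merely approximately linear model, the solve gives a strong starting point that a few query-based refinement steps can finish, which is what makes the black-box setting feasible.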
arXiv Detail & Related papers (2022-07-28T17:45:01Z)
- BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label [20.236328601459203]
We propose BadHash, the first generative-based imperceptible backdoor attack against deep hashing.
We show that BadHash can generate imperceptible poisoned samples with strong attack ability and transferability over state-of-the-art deep hashing schemes.
arXiv Detail & Related papers (2022-07-01T09:10:25Z)
- Self-Distilled Hashing for Deep Image Retrieval [25.645550298697938]
In hash-based image retrieval systems, a transformed version of an image usually yields a hash code different from the original's.
We propose a novel self-distilled hashing scheme to minimize the discrepancy while exploiting the potential of augmented data.
We also introduce hash proxy-based similarity learning and binary cross entropy-based quantization loss to provide fine quality hash codes.
arXiv Detail & Related papers (2021-12-16T12:01:50Z)
- Backdoor Attack on Hash-based Image Retrieval via Clean-label Data Poisoning [54.15013757920703]
We propose the confusing perturbations-induced backdoor attack (CIBA).
It injects a small number of poisoned images with the correct label into the training data.
We have conducted extensive experiments to verify the effectiveness of our proposed CIBA.
arXiv Detail & Related papers (2021-09-18T07:56:59Z)
- Adversarial collision attacks on image hashing functions [9.391375268580806]
We show that it is possible to modify an image to produce an unrelated hash, and an exact hash collision can be produced via minuscule perturbations.
In a white box setting, these collisions can be replicated across nearly every image pair and hash type.
We offer several potential mitigations to gradient-based image hash attacks.
arXiv Detail & Related papers (2020-11-18T18:59:02Z)
- Deep Momentum Uncertainty Hashing [65.27971340060687]
We propose a novel Deep Momentum Uncertainty Hashing (DMUH) method.
It explicitly estimates the uncertainty during training and leverages the uncertainty information to guide the approximation process.
Our method achieves the best performance on all of the datasets and surpasses existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-09-17T01:57:45Z)
- Deep Reinforcement Learning with Label Embedding Reward for Supervised Image Hashing [85.84690941656528]
We introduce a novel decision-making approach for deep supervised hashing.
We learn a deep Q-network with a novel label embedding reward defined by Bose-Chaudhuri-Hocquenghem codes.
Our approach outperforms state-of-the-art supervised hashing methods under various code lengths.
arXiv Detail & Related papers (2020-08-10T09:17:20Z)
- Targeted Attack for Deep Hashing based Retrieval [57.582221494035856]
We propose a novel method, dubbed deep hashing targeted attack (DHTA), to study the targeted attack on such retrieval.
We first formulate the targeted attack as a point-to-set optimization, which minimizes the average distance between the hash code of an adversarial example and those of a set of objects with the target label.
To balance performance and perceptibility, we propose to minimize the Hamming distance between the hash code of the adversarial example and the anchor code under an $\ell_\infty$ restriction on the perturbation.
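DHTA's point-to-set reduction can be sketched concretely: in a ±1 encoding, the code minimizing the average Hamming distance to a set of codes is the componentwise majority vote, and the perturbation is then kept perceptually small by projecting it into an ℓ∞ ball. The sizes, epsilon, and ±1 convention below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(2)

# Point-to-set reduction: instead of targeting one object's hash, target
# an "anchor code" minimizing the average Hamming distance to the codes
# of all objects with the target label. In +-1 encoding that minimizer
# is the componentwise majority vote.
target_codes = rng.choice([-1, 1], size=(20, 16))   # 20 objects, 16-bit codes
anchor = np.sign(target_codes.sum(axis=0)).astype(int)
anchor[anchor == 0] = 1                              # break ties arbitrarily

def avg_hamming(code, codes):
    return float(np.mean((codes != code).sum(axis=1)))

print(avg_hamming(anchor, target_codes))

# The attack then optimizes the input toward the anchor under an l_inf
# budget: after each gradient step, clip the perturbation into the ball.
eps = 8.0 / 255.0
delta = rng.normal(size=64) * 0.1        # stand-in for an accumulated step
delta = np.clip(delta, -eps, eps)        # project into the l_inf ball
assert np.abs(delta).max() <= eps
```

Per bit, the majority vote disagrees with the fewest set members, so no other code can achieve a lower average distance.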
arXiv Detail & Related papers (2020-04-15T08:36:58Z)
- A Survey on Deep Hashing Methods [52.326472103233854]
Nearest neighbor search aims to find the samples in the database closest to the queries.
With the development of deep learning, deep hashing methods show more advantages than traditional methods.
Deep supervised hashing is categorized into pairwise methods, ranking-based methods, pointwise methods, and quantization-based methods.
Deep unsupervised hashing is categorized into similarity reconstruction-based methods, pseudo-label-based methods and prediction-free self-supervised learning-based methods.
arXiv Detail & Related papers (2020-03-04T08:25:15Z)
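The retrieval speedup that motivates all of these hashing methods comes from replacing float distances with XOR plus popcount over packed binary codes. A minimal sketch (database size, code length, and the byte-lookup popcount are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)

# Binary codes let nearest-neighbor search run on XOR + popcount instead
# of floating-point distances.
db = rng.integers(0, 2, size=(1000, 64), dtype=np.uint8)   # 1000 codes, 64 bits
query = db[42] ^ (rng.random(64) < 0.05)                   # noisy copy of item 42

packed_db = np.packbits(db, axis=1)                # 8 bytes per code
packed_q = np.packbits(query.astype(np.uint8))

# Popcount via a 256-entry lookup table over the bytes of the XOR.
popcount = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)
dists = popcount[packed_db ^ packed_q].sum(axis=1)

print(int(dists.argmin()))   # index of the closest stored code
```

Because each comparison touches 8 bytes rather than 64 floats, linear scans over millions of codes stay fast, which is the practical appeal of hashing-based retrieval.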
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.