Related papers: FakeIDet: Exploring Patches for Privacy-Preserving Fake ID Detection

FakeIDet: Exploring Patches for Privacy-Preserving Fake ID Detection

URL: http://arxiv.org/abs/2504.07761v2
Date: Fri, 01 Aug 2025 14:48:23 GMT
Title: FakeIDet: Exploring Patches for Privacy-Preserving Fake ID Detection
Authors: Javier Muñoz-Haro, Ruben Tolosana, Ruben Vera-Rodriguez, Aythami Morales, Julian Fierrez,
Abstract summary: This study focuses on the topic of fake ID detection, covering several limitations in the field.<n>There are no publicly available data from real IDs for proper research in this area.<n>Most published studies rely on proprietary internal databases that are not available for privacy reasons.
Score: 12.969417519807322
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Verifying the authenticity of identity documents (IDs) has become a critical challenge for real-life applications such as digital banking, crypto-exchanges, renting, etc. This study focuses on the topic of fake ID detection, covering several limitations in the field. In particular, there are no publicly available data from real IDs for proper research in this area, and most published studies rely on proprietary internal databases that are not available for privacy reasons. In order to advance this critical challenge of real data scarcity that makes it so difficult to advance the technology of machine learning-based fake ID detection, we introduce a new patch-based methodology that trades off privacy and performance, and propose a novel patch-wise approach for privacy-aware fake ID detection: FakeIDet. In our experiments, we explore: i) two levels of anonymization for an ID (i.e., fully- and pseudo-anonymized), and ii) different patch size configurations, varying the amount of sensitive data visible in the patch image. State-of-the-art methods, such as vision transformers and foundation models, are considered as backbones. Our results show that, on an unseen database (DLC-2021), our proposal for fake ID detection achieves 13.91% and 0% EERs at the patch and the whole ID level, showing a good generalization to other databases. In addition to the path-based methodology introduced and the new FakeIDet method based on it, another key contribution of our article is the release of the first publicly available database that contains 48,400 patches from real and fake IDs, called FakeIDet-db, together with the experimental framework.

Related papers

FantasyID: A dataset for detecting digital manipulations of ID-documents [23.7548607375651]
We propose a novel dataset, FantasyID, which mimics real-world IDs but without tampering with legal documents.<n>FantasyID contains ID cards with diverse design styles, languages, and faces of real people.<n>We have emulated digital forgery/injection attacks that could be performed by a malicious actor to tamper the IDs using the existing generative tools.
arXiv Detail & Related papers (2025-07-28T13:20:18Z)
LLM for Barcodes: Generating Diverse Synthetic Data for Identity Documents [2.697503433221448]
We introduce a new approach to synthetic data generation that uses LLMs to create contextually rich and realistic data without relying on predefined field.<n>Our approach simplifies the process of dataset creation, eliminating the need for extensive domain knowledge.<n>This scalable, privacy-first solution is a big step forward in advancing machine learning for automated document processing and identity verification.
arXiv Detail & Related papers (2024-11-22T14:21:18Z)
ID-Patch: Robust ID Association for Group Photo Personalization [29.38844265790726]
ID-Patch is a novel method that provides robust association between identities and 2D positions.<n>Our approach generates an ID patch and ID embeddings from the same facial features.
arXiv Detail & Related papers (2024-11-20T18:55:28Z)
IDNet: A Novel Dataset for Identity Document Analysis and Fraud Detection [25.980165854663145]
IDNet is a benchmark dataset designed to advance privacy-preserving fraud detection efforts. It comprises 837,060 images of synthetically generated identity documents, totaling approximately 490 gigabytes. We evaluate the utility and present use cases of the dataset, illustrating how it can aid in training privacy-preserving fraud detection methods.
arXiv Detail & Related papers (2024-08-03T07:05:40Z)
DocXPand-25k: a large and diverse benchmark dataset for identity documents analysis [0.0]
Identity document (ID) image analysis has become essential for many online services, like bank account opening or insurance subscription. There are only a few available to benchmark ID analysis methods, mainly because of privacy restrictions, security requirements and legal reasons. We present the DocXPand-25k dataset, which consists of 24,994 richly labeled IDs images.
arXiv Detail & Related papers (2024-07-30T08:55:27Z)
Keypoint Promptable Re-Identification [76.31113049256375]
Occluded Person Re-Identification (ReID) is a metric learning task that involves matching occluded individuals based on their appearance. We introduce Keypoint Promptable ReID (KPR), a novel formulation of the ReID problem that explicitly complements the input bounding box with a set of semantic keypoints. We release custom keypoint labels for four popular ReID benchmarks. Experiments on person retrieval, but also on pose tracking, demonstrate that our method systematically surpasses previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-07-25T15:20:58Z)
Privacy-Aware Document Visual Question Answering [44.82362488593259]
This work highlights privacy issues in state of the art multi-modal LLM models used for DocVQA. We propose a large scale DocVQA dataset comprising invoice documents and associated questions and answers. We demonstrate that non-private models tend to memorise, a behaviour that can lead to exposing private information.
arXiv Detail & Related papers (2023-12-15T06:30:55Z)
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding [102.07914175196817]
PhotoMaker is an efficient personalized text-to-image generation method. It encodes an arbitrary number of input ID images into a stack ID embedding for preserving ID information.
arXiv Detail & Related papers (2023-12-07T17:32:29Z)
CrossDF: Improving Cross-Domain Deepfake Detection with Deep Information Decomposition [53.860796916196634]
We propose a Deep Information Decomposition (DID) framework to enhance the performance of Cross-dataset Deepfake Detection (CrossDF) Unlike most existing deepfake detection methods, our framework prioritizes high-level semantic features over specific visual artifacts. It adaptively decomposes facial features into deepfake-related and irrelevant information, only using the intrinsic deepfake-related information for real/fake discrimination.
arXiv Detail & Related papers (2023-09-30T12:30:25Z)
A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibited data access and data sharing. Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data. Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
Diff-Privacy: Diffusion-based Face Privacy Protection [58.1021066224765]
In this paper, we propose a novel face privacy protection method based on diffusion models, dubbed Diff-Privacy. Specifically, we train our proposed multi-scale image inversion module (MSI) to obtain a set of SDM format conditional embeddings of the original image. Based on the conditional embeddings, we design corresponding embedding scheduling strategies and construct different energy functions during the denoising process to achieve anonymization and visual identity information hiding.
arXiv Detail & Related papers (2023-09-11T09:26:07Z)
Disguise without Disruption: Utility-Preserving Face De-Identification [40.484745636190034]
We introduce Disguise, a novel algorithm that seamlessly de-identifies facial images while ensuring the usability of the modified data. Our method involves extracting and substituting depicted identities with synthetic ones, generated using variational mechanisms to maximize obfuscation and non-invertibility. We extensively evaluate our method using multiple datasets, demonstrating a higher de-identification rate and superior consistency compared to prior approaches in various downstream tasks.
arXiv Detail & Related papers (2023-03-23T13:50:46Z)
Unsupervised Text Deidentification [101.2219634341714]
We propose an unsupervised deidentification method that masks words that leak personally-identifying information. Motivated by K-anonymity based privacy, we generate redactions that ensure a minimum reidentification rank.
arXiv Detail & Related papers (2022-10-20T18:54:39Z)
RealGait: Gait Recognition for Person Re-Identification [79.67088297584762]
We construct a new gait dataset by extracting silhouettes from an existing video person re-identification challenge which consists of 1,404 persons walking in an unconstrained manner. Our results suggest that recognizing people by their gait in real surveillance scenarios is feasible and the underlying gait pattern is probably the true reason why video person re-idenfification works in practice.
arXiv Detail & Related papers (2022-01-13T06:30:56Z)
Privacy-Aware Identity Cloning Detection based on Deep Forest [9.051524543426451]
This approach leverages non-privacy-sensitive user profile data gathered from social networks and a powerful deep learning model to perform cloned identity detection. We evaluated the proposed method against the state-of-the-art identity cloning detection techniques and the other popular identity deception detection models atop a real-world dataset.
arXiv Detail & Related papers (2021-10-21T04:55:52Z)
MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document Analysis [48.35030471041193]
MIDV-2020 consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents. With 72409 annotated images in total, to the date of publication the proposed dataset is the largest publicly available identity documents dataset.
arXiv Detail & Related papers (2021-07-01T12:14:17Z)
Camera-aware Proxies for Unsupervised Person Re-Identification [60.26031011794513]
This paper tackles the purely unsupervised person re-identification (Re-ID) problem that requires no annotations. We propose to split each single cluster into multiple proxies and each proxy represents the instances coming from the same camera. Based on the camera-aware proxies, we design both intra- and inter-camera contrastive learning components for our Re-ID model.
arXiv Detail & Related papers (2020-12-19T12:37:04Z)
Identity-Driven DeepFake Detection [91.0504621868628]
Identity-Driven DeepFake Detection takes as input the suspect image/video as well as the target identity information. We output a decision on whether the identity in the suspect image/video is the same as the target identity. We present a simple identity-based detection algorithm called the OuterFace, which may serve as a baseline for further research.
arXiv Detail & Related papers (2020-12-07T18:59:08Z)
How important are faces for person re-identification? [14.718372669984364]
We apply a face detection and blurring algorithm to create anonymized versions of several popular person re-identification datasets. We evaluate the effect of this anonymization on re-identification performance using standard metrics.
arXiv Detail & Related papers (2020-10-13T11:47:16Z)

This list is automatically generated from the titles and abstracts of the papers in this site.