SynthID-Image: Image watermarking at internet scale
- URL: http://arxiv.org/abs/2510.09263v1
- Date: Fri, 10 Oct 2025 11:03:31 GMT
- Title: SynthID-Image: Image watermarking at internet scale
- Authors: Sven Gowal, Rudy Bunel, Florian Stimberg, David Stutz, Guillermo Ortiz-Jimenez, Christina Kouridi, Mel Vecerik, Jamie Hayes, Sylvestre-Alvise Rebuffi, Paul Bernard, Chris Gamble, Miklós Z. Horváth, Fabian Kaczmarczyck, Alex Kaskasoli, Aleksandar Petrov, Ilia Shumailov, Meghana Thotakuri, Olivia Wiles, Jessica Yung, Zahra Ahmed, Victor Martin, Simon Rosen, Christopher Savčak, Armin Senoner, Nidhi Vyas, Pushmeet Kohli
- Abstract summary: We introduce SynthID-Image, a deep learning-based system for invisibly watermarking AI-generated imagery. This paper documents the technical desiderata, threat models, and practical challenges of deploying such a system at internet scale.
- Score: 55.5714762895087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce SynthID-Image, a deep learning-based system for invisibly watermarking AI-generated imagery. This paper documents the technical desiderata, threat models, and practical challenges of deploying such a system at internet scale, addressing key requirements of effectiveness, fidelity, robustness, and security. SynthID-Image has been used to watermark over ten billion images and video frames across Google's services and its corresponding verification service is available to trusted testers. For completeness, we present an experimental evaluation of an external model variant, SynthID-O, which is available through partnerships. We benchmark SynthID-O against other post-hoc watermarking methods from the literature, demonstrating state-of-the-art performance in both visual quality and robustness to common image perturbations. While this work centers on visual media, the conclusions on deployment, constraints, and threat modeling generalize to other modalities, including audio. This paper provides a comprehensive documentation for the large-scale deployment of deep learning-based media provenance systems.
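Conceptually, a post-hoc watermarking system of the kind described above pairs an embedder (which hides a payload in the pixels) with a detector (which recovers it for verification). The toy sketch below is a hypothetical illustration only, not SynthID-Image's actual design, which uses learned encoder/decoder networks: it embeds payload bits with a key-derived spread-spectrum pattern, and the toy detector is non-blind (it needs the original image), whereas a deployed system decodes blindly.

```python
import numpy as np

def embed_watermark(image, payload_bits, key=0, strength=2.0):
    """Add an imperceptible pseudorandom pattern per payload bit.

    A toy spread-spectrum scheme: each bit modulates the sign of a
    key-derived noise pattern added to the image. Real systems like
    SynthID-Image use learned encoder networks instead.
    """
    rng = np.random.default_rng(key)
    watermarked = image.astype(np.float64).copy()
    for bit in payload_bits:
        pattern = rng.standard_normal(image.shape)
        sign = 1.0 if bit else -1.0
        watermarked += sign * strength * pattern / len(payload_bits)
    return np.clip(watermarked, 0, 255)

def detect_watermark(watermarked, original, num_bits, key=0):
    """Recover bits by correlating the residual with the same patterns.

    Non-blind for simplicity; a deployed verifier would decode without
    access to the original image.
    """
    rng = np.random.default_rng(key)
    residual = watermarked.astype(np.float64) - original.astype(np.float64)
    bits = []
    for _ in range(num_bits):
        pattern = rng.standard_normal(residual.shape)
        bits.append(int(np.sum(residual * pattern) > 0))
    return bits
```

Because each pattern is nearly orthogonal to the others, correlating the residual against a pattern recovers that bit's sign despite interference from the remaining bits.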
Related papers
- How Well Do Models Follow Visual Instructions? VIBE: A Systematic Benchmark for Visual Instruction-Driven Image Editing [56.60465182650588]
We introduce a three-level interaction hierarchy that captures deictic grounding, morphological manipulation, and causal reasoning. We propose a robust LMM-as-a-judge evaluation framework with task-specific metrics to enable scalable and fine-grained assessment. We find that proprietary models exhibit early-stage visual instruction-following capabilities and consistently outperform open-source models.
arXiv Detail & Related papers (2026-02-02T09:24:45Z) - Provenance of AI-Generated Images: A Vector Similarity and Blockchain-based Approach [3.632189127068905]
We propose an embedding-based AI image detection framework to distinguish AI-generated images from real (human-created) ones. Our methodology is built on the hypothesis that AI-generated images demonstrate closer embedding proximity to other AI-generated content. Our results confirm that moderate to high perturbations minimally impact the embedding signatures, with perturbed images maintaining close similarity matches to their original versions.
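The embedding-proximity hypothesis above can be sketched as a nearest-neighbor check: flag an image when its embedding is close to any known AI-generated reference embedding. The sketch below is hypothetical; `looks_ai_generated` and its threshold are illustrative names, and the embeddings are plain vectors where a real system would use a vision encoder such as CLIP.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def looks_ai_generated(query_emb, ai_reference_embs, threshold=0.9):
    """Flag an image as AI-generated if its embedding sits close to any
    known AI-generated reference embedding (the paper's core hypothesis).
    Returns the verdict and the best similarity score."""
    best = max(cosine_similarity(query_emb, ref) for ref in ai_reference_embs)
    return best >= threshold, best
```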
arXiv Detail & Related papers (2025-10-15T00:49:56Z) - OmniDFA: A Unified Framework for Open Set Synthesis Image Detection and Few-Shot Attribution [21.62979058692505]
OmniDFA is a novel framework for AI-generated image (AIGI) detection that assesses the authenticity of images and determines their origins in a few-shot manner. We construct OmniFake, a large class-aware synthetic image dataset that curates 1.17M images from 45 distinct generative models. Experiments demonstrate that OmniDFA exhibits excellent capability in open-set attribution and achieves state-of-the-art generalization performance on AIGI detection.
arXiv Detail & Related papers (2025-09-30T02:36:40Z) - BusterX++: Towards Unified Cross-Modal AI-Generated Content Detection and Explanation with MLLM [12.349038994581415]
We introduce BusterX++, a novel framework for cross-modal detection and explanation of synthetic media. Our approach incorporates an advanced reinforcement learning (RL) post-training strategy that eliminates the cold-start stage. We also present GenBuster++, a cross-modal benchmark leveraging state-of-the-art image and video generation techniques.
arXiv Detail & Related papers (2025-07-19T14:05:33Z) - Towards Effective User Attribution for Latent Diffusion Models via Watermark-Informed Blending [54.26862913139299]
We introduce a novel framework, Towards Effective user Attribution for latent diffusion models via Watermark-Informed Blending (TEAWIB). TEAWIB incorporates a unique ready-to-use configuration approach that allows seamless integration of user-specific watermarks into generative models. Experiments validate the effectiveness of TEAWIB, showcasing state-of-the-art performance in perceptual quality and attribution accuracy.
arXiv Detail & Related papers (2024-09-17T07:52:09Z) - UNIT: Unifying Image and Text Recognition in One Vision Encoder [51.140564856352825]
UNIT is a novel training framework aimed at UNifying Image and Text recognition within a single model.
We show that UNIT significantly outperforms existing methods on document-related tasks.
Notably, UNIT retains the original vision encoder architecture, making it cost-free in terms of inference and deployment.
arXiv Detail & Related papers (2024-09-06T08:02:43Z) - SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods [9.213926755375024]
Detecting completely synthetic images presents a unique challenge: there is often a large gap between experimental results on benchmark datasets and the performance of methods in the wild.
This paper introduces SIDBench, a benchmarking framework that integrates several state-of-the-art synthetic image detection (SID) models.
arXiv Detail & Related papers (2024-04-29T09:50:16Z) - DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images [6.75641797020186]
Deepfakes, synthetic images generated by deep learning algorithms, represent one of the biggest challenges in the field of Digital Forensics.
We propose a novel approach based on three blocks called Base Models.
The generalization features extracted from each block are then processed to discriminate the origin of the input image.
arXiv Detail & Related papers (2024-04-24T07:25:36Z) - WAVES: Benchmarking the Robustness of Image Watermarks [67.955140223443]
WAVES (Watermark Analysis Via Enhanced Stress-testing) is a benchmark for assessing image watermark robustness.
We integrate detection and identification tasks and establish a standardized evaluation protocol comprising a diverse range of stress tests.
We envision WAVES as a toolkit for the future development of robust watermarks.
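A stress-testing protocol in the spirit of WAVES can be sketched as a harness that applies a suite of perturbations to a watermarked image and records whether a detector still fires. The harness below is an illustrative stand-in, not the WAVES toolkit itself; the perturbation names and the `stress_test` interface are hypothetical.

```python
import numpy as np

def jpeg_like_quantize(img, step=16):
    """Crude stand-in for lossy compression: quantize pixel values."""
    return np.round(img / step) * step

def add_noise(img, sigma=5.0, seed=0):
    """Additive Gaussian noise, a common benign perturbation."""
    rng = np.random.default_rng(seed)
    return img + rng.standard_normal(img.shape) * sigma

PERTURBATIONS = {
    "identity": lambda im: im,
    "quantize": jpeg_like_quantize,
    "noise": add_noise,
}

def stress_test(watermarked, detector):
    """Map each perturbation name to the detector's verdict on the
    perturbed image."""
    return {name: detector(fn(watermarked)) for name, fn in PERTURBATIONS.items()}
```

A robustness score is then simply the fraction of perturbations under which the detector still succeeds.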
arXiv Detail & Related papers (2024-01-16T18:58:36Z) - Harnessing Machine Learning for Discerning AI-Generated Synthetic Images [2.6227376966885476]
We employ machine learning techniques to distinguish between AI-generated and genuine images.
We refine and adapt advanced deep learning architectures such as ResNet, VGGNet, and DenseNet.
The experiments demonstrate that our optimized deep learning models outperform traditional methods.
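Adapting a pretrained backbone for real-vs-AI classification often reduces to training a small binary head on frozen features. The sketch below is a hypothetical illustration with a logistic-regression head trained by plain gradient descent; the stand-in feature vectors would, in practice, come from the penultimate layer of a network such as ResNet or DenseNet.

```python
import numpy as np

def train_head(features, labels, lr=0.1, steps=500):
    """Train a logistic-regression head on frozen backbone features.

    features: (n, d) array of feature vectors; labels: (n,) array of 0/1.
    """
    n, d = features.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        logits = features @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        grad = probs - labels                 # d(loss)/d(logits) for BCE
        w -= lr * features.T @ grad / n
        b -= lr * grad.mean()
    return w, b

def predict(features, w, b):
    """1 = AI-generated, 0 = genuine (under this toy labeling)."""
    return (features @ w + b > 0).astype(int)
```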
arXiv Detail & Related papers (2024-01-14T20:00:37Z) - T2IW: Joint Text to Image & Watermark Generation [74.20148555503127]
We introduce T2IW, a novel task for the joint generation of text-to-image content and watermarks.
The T2IW scheme minimizes damage to image quality when generating a compound image by forcing the semantic features and the watermark signal to remain compatible at the pixel level.
We demonstrate remarkable achievements in image quality, watermark invisibility, and watermark robustness, supported by our proposed set of evaluation metrics.
arXiv Detail & Related papers (2023-09-07T16:12:06Z) - Identity-Aware CycleGAN for Face Photo-Sketch Synthesis and Recognition [61.87842307164351]
We first propose an Identity-Aware CycleGAN (IACycleGAN) model that applies a new perceptual loss to supervise the image generation network.
It improves CycleGAN on photo-sketch synthesis by paying more attention to the synthesis of key facial regions, such as eyes and nose.
We further develop a mutual optimization procedure between the synthesis and recognition models, in which IACycleGAN iteratively synthesizes better images.
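An identity-aware perceptual loss of the kind described above can be sketched as a feature-space distance between the synthesized and target faces, optionally up-weighting key regions such as the eyes and nose. This is a hypothetical sketch: `perceptual_loss` and `region_weights` are illustrative names, and the feature vectors stand in for the activations of a pretrained face-recognition network.

```python
import numpy as np

def perceptual_loss(feat_generated, feat_target, region_weights=None):
    """Mean squared distance between deep features of the generated and
    target images; optional weights emphasize key facial regions."""
    diff = feat_generated - feat_target
    if region_weights is not None:
        diff = diff * region_weights
    return float(np.mean(diff ** 2))
```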
arXiv Detail & Related papers (2021-03-30T01:30:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.