TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery
- URL: http://arxiv.org/abs/2602.19019v1
- Date: Sun, 22 Feb 2026 03:18:45 GMT
- Title: TokenTrace: Multi-Concept Attribution through Watermarked Token Recovery
- Authors: Li Zhang, Shruti Agarwal, John Collomosse, Pengtao Xie, Vishal Asnani,
- Abstract summary: We introduce TokenTrace, a novel proactive watermarking framework for robust, multi-concept attribution. Our method embeds secret signatures into the semantic domain by simultaneously perturbing the text prompt embedding and the initial latent noise. For retrieval, we propose a query-based TokenTrace module that takes the generated image and a textual query specifying which concepts need to be retrieved.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative AI models pose a significant challenge to intellectual property (IP), as they can replicate unique artistic styles and concepts without attribution. While watermarking offers a potential solution, existing methods often fail in complex scenarios where multiple concepts (e.g., an object and an artistic style) are composed within a single image. These methods struggle to disentangle and attribute each concept individually. In this work, we introduce TokenTrace, a novel proactive watermarking framework for robust, multi-concept attribution. Our method embeds secret signatures into the semantic domain by simultaneously perturbing the text prompt embedding and the initial latent noise that guide the diffusion model's generation process. For retrieval, we propose a query-based TokenTrace module that takes the generated image and a textual query specifying which concepts need to be retrieved (e.g., a specific object or style) as inputs. This query-based mechanism allows the module to disentangle and independently verify the presence of multiple concepts from a single generated image. Extensive experiments show that our method achieves state-of-the-art performance on both single-concept (object and style) and multi-concept attribution tasks, significantly outperforming existing baselines while maintaining high visual quality and robustness to common transformations.
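The embedding mechanism the abstract describes (perturbing the text prompt embedding and the initial latent noise with a secret signature, then verifying its presence at retrieval time) can be illustrated with a toy sketch. TokenTrace's actual embedding and query-based retrieval modules are learned; the additive pseudo-random perturbation and the correlation detector below are illustrative assumptions, not the paper's method, and the function names are hypothetical.

```python
import zlib
import numpy as np

def _signature_rng(signature: str) -> np.random.Generator:
    # Derive a deterministic RNG from the secret signature string.
    return np.random.default_rng(zlib.crc32(signature.encode()))

def embed_signature(prompt_emb, init_noise, signature, alpha=0.05):
    """Toy stand-in for watermark embedding: add a small pseudo-random
    perturbation, keyed by the signature, to both the prompt embedding
    and the initial latent noise that guide generation."""
    rng = _signature_rng(signature)
    wm_emb = prompt_emb + alpha * rng.standard_normal(prompt_emb.shape)
    wm_noise = init_noise + alpha * rng.standard_normal(init_noise.shape)
    return wm_emb, wm_noise

def detect_signature(recovered_emb, original_emb, signature, thresh=0.5):
    """Toy stand-in for retrieval: correlate the residual between the
    recovered and original embeddings against the signature's direction."""
    rng = _signature_rng(signature)
    direction = rng.standard_normal(original_emb.shape).ravel()
    residual = (recovered_emb - original_emb).ravel()
    denom = np.linalg.norm(residual) * np.linalg.norm(direction) + 1e-12
    return float(residual @ direction / denom) > thresh
```

In this sketch a matching signature yields a residual almost perfectly aligned with its pseudo-random direction (correlation near 1), while a non-matching signature's direction is nearly orthogonal in high dimensions (correlation near 0), which is why a simple threshold separates the two cases.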
Related papers
- Hierarchical Concept-to-Appearance Guidance for Multi-Subject Image Generation [22.845591588026366]
We propose a framework that provides explicit, structured supervision from high-level concepts to fine-grained appearances.
At the conceptual level, we introduce a VAE dropout training strategy that randomly omits reference VAE features.
At the appearance level, we integrate the VLM-derived correspondences into a correspondence-aware masked attention module.
arXiv Detail & Related papers (2026-02-03T12:13:29Z)
- Nested Attention: Semantic-aware Attention Values for Concept Personalization [78.90196530697897]
We introduce Nested Attention, a novel mechanism that injects a rich and expressive image representation into the model's existing cross-attention layers.
Our key idea is to generate query-dependent subject values, derived from nested attention layers that learn to select relevant subject features for each region in the generated image.
arXiv Detail & Related papers (2025-01-02T18:52:11Z)
- FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition [49.2208591663092]
FreeCustom is a tuning-free method to generate customized images of multi-concept composition based on reference concepts.
We introduce a new multi-reference self-attention (MRSA) mechanism and a weighted mask strategy.
Our method outperforms or performs on par with other training-based methods in terms of multi-concept composition and single-concept customization.
arXiv Detail & Related papers (2024-05-22T17:53:38Z)
- Concept Weaver: Enabling Multi-Concept Fusion in Text-to-Image Models [85.14042557052352]
We introduce Concept Weaver, a method for composing customized text-to-image diffusion models at inference time.
We show that Concept Weaver can generate multiple custom concepts with higher identity fidelity compared to alternative approaches.
arXiv Detail & Related papers (2024-04-05T06:41:27Z)
- Visual Concept-driven Image Generation with Text-to-Image Diffusion Model [65.96212844602866]
Text-to-image (TTI) models have demonstrated impressive results in generating high-resolution images of complex scenes.
Recent approaches have extended these methods with personalization techniques that allow them to integrate user-illustrated concepts.
However, the ability to generate images with multiple interacting concepts, such as human subjects, as well as concepts that may be entangled in one, or across multiple, image illustrations remains elusive.
We propose a concept-driven TTI personalization framework that addresses these core challenges.
arXiv Detail & Related papers (2024-02-18T07:28:37Z)
- Textual Localization: Decomposing Multi-concept Images for Subject-Driven Text-to-Image Generation [5.107886283951882]
We introduce a localized text-to-image model to handle multi-concept input images.
Our method incorporates a novel cross-attention guidance to decompose multiple concepts.
Notably, our method generates cross-attention maps consistent with the target concept in the generated images.
arXiv Detail & Related papers (2024-02-15T14:19:42Z)
- Break-A-Scene: Extracting Multiple Concepts from a Single Image [80.47666266017207]
We introduce the task of textual scene decomposition.
We propose augmenting the input image with masks that indicate the presence of target concepts.
We then present a novel two-phase customization process.
arXiv Detail & Related papers (2023-05-25T17:59:04Z)
- Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.