Related papers: Diffusion-based Aesthetic QR Code Generation via Scanning-Robust Perceptual Guidance

URL: http://arxiv.org/abs/2403.15878v1
Date: Sat, 23 Mar 2024 16:08:48 GMT
Title: Diffusion-based Aesthetic QR Code Generation via Scanning-Robust Perceptual Guidance
Authors: Jia-Wei Liao, Winston Wang, Tzu-Sian Wang, Li-Xuan Peng, Cheng-Fu Chou, Jun-Cheng Chen
Abstract summary: QR codes, prevalent in daily applications, lack visual appeal due to their conventional black-and-white design. We introduce a novel diffusion-model-based aesthetic QR code generation pipeline, utilizing pre-trained ControlNet and guided iterative refinement. With extensive quantitative, qualitative, and subjective experiments, the results demonstrate that the proposed approach can generate diverse aesthetic QR codes with flexibility in detail.
Score: 9.905296922309157
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: QR codes, prevalent in daily applications, lack visual appeal due to their conventional black-and-white design. Integrating aesthetics while maintaining scannability poses a challenge. In this paper, we introduce a novel diffusion-model-based aesthetic QR code generation pipeline, utilizing pre-trained ControlNet and guided iterative refinement via a novel classifier guidance (SRG) based on the proposed Scanning-Robust Loss (SRL) tailored to QR code mechanisms, which ensures both aesthetics and scannability. To further improve scannability while preserving aesthetics, we propose a two-stage pipeline with Scanning-Robust Perceptual Guidance (SRPG). Moreover, we can further enhance the scannability of the generated QR code by post-processing it with the proposed Scanning-Robust Projected Gradient Descent (SRPGD) technique, which is based on SRL and has proven convergence. With extensive quantitative, qualitative, and subjective experiments, the results demonstrate that the proposed approach can generate diverse aesthetic QR codes with flexibility in detail. In addition, our pipelines outperform existing models in terms of Scanning Success Rate (SSR), reaching 86.67% (+40%) with comparable aesthetic scores. The pipeline combined with SRPGD further achieves 96.67% (+50%). Our code will be available at https://github.com/jwliao1209/DiffQRCode.

Related papers

Robust Latent Matters: Boosting Image Generation with Sampling Error Synthesis [57.7367843129838]
Recent image generation schemes typically capture image distribution in a pre-constructed latent space relying on a frozen image tokenizer. We propose a novel plug-and-play tokenizer training scheme to facilitate latent space construction.
arXiv Detail & Related papers (2025-03-11T12:09:11Z)
Scalable Image Tokenization with Index Backpropagation Quantization [74.15447383432262]
Index Backpropagation Quantization (IBQ) is a new VQ method for the joint optimization of all codebook embeddings and the visual encoder. IBQ enables scalable training of visual tokenizers and, for the first time, achieves a large-scale codebook with high dimension ($256$) and high utilization.
arXiv Detail & Related papers (2024-12-03T18:59:10Z)
Face2QR: A Unified Framework for Aesthetic, Face-Preserving, and Scannable QR Code Generation [33.57668243458616]
Face2QR is a novel pipeline for generating personalized QR codes that blend aesthetics, face identity, and scannability. First, the ID-refined QR integration seamlessly intertwines the background styling with face ID. Second, the ID-aware QR ReShuffle (IDRS) effectively rectifies the conflicts between face IDs and QR patterns. Third, the ID-preserved Scannability Enhancement (IDSE) markedly boosts scanning through latent code optimization.
arXiv Detail & Related papers (2024-11-28T16:35:16Z)
DiffQRCoder: Diffusion-based Aesthetic QR Code Generation with Scanning Robustness Guided Iterative Refinement [9.43230708612551]
We propose a novel Diffusion-based QR Code generator (DiffQRCoder) to craft both scannable and visually pleasing QR codes. The proposed approach introduces Scanning-Robust Perceptual Guidance (SRPG), a new diffusion guidance for Diffusion Models. Our approach robustly achieves over 95% SSR, demonstrating its capability for real-world applications.
arXiv Detail & Related papers (2024-09-10T09:22:35Z)
Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation [38.281805719692194]
In the digital era, QR codes serve as a linchpin connecting virtual and physical realms. Prevailing methods grapple with the intrinsic challenge of balancing customization and scannability. This paper introduces Text2QR, a pioneering approach leveraging stable-diffusion models.
arXiv Detail & Related papers (2024-03-11T06:03:31Z)
NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning [63.39461847093663]
We propose NeRF-VPT, an innovative method for novel view synthesis to address these challenges. Our proposed NeRF-VPT employs a cascading view prompt tuning paradigm, wherein RGB information gained from preceding rendering outcomes serves as instructive visual prompts for subsequent rendering stages. NeRF-VPT only requires sampling RGB data from previous stage renderings as priors at each training stage, without relying on extra guidance or complex techniques.
arXiv Detail & Related papers (2024-03-02T22:08:10Z)
Iterative Token Evaluation and Refinement for Real-World Super-Resolution [77.74289677520508]
Real-world image super-resolution (RWSR) is a long-standing problem as low-quality (LQ) images often have complex and unidentified degradations. We propose an Iterative Token Evaluation and Refinement framework for RWSR. We show that ITER is easier to train than Generative Adversarial Networks (GANs) and more efficient than continuous diffusion models.
arXiv Detail & Related papers (2023-12-09T17:07:32Z)
RBSR: Efficient and Flexible Recurrent Network for Burst Super-Resolution [57.98314517861539]
Burst super-resolution (BurstSR) aims at reconstructing a high-resolution (HR) image from a sequence of low-resolution (LR) and noisy images. In this paper, we suggest fusing cues frame-by-frame with an efficient and flexible recurrent network.
arXiv Detail & Related papers (2023-06-30T12:14:13Z)
Rolling Shutter Inversion: Bring Rolling Shutter Images to High Framerate Global Shutter Video [111.08121952640766]
This paper presents a novel deep-learning based solution to the RS temporal super-resolution problem. By leveraging the multi-view geometry relationship of the RS imaging process, our framework successfully achieves high framerate GS generation. Our method can produce high-quality GS image sequences with rich details, outperforming the state-of-the-art methods.
arXiv Detail & Related papers (2022-10-06T16:47:12Z)
Visual Radial Basis Q-Network [0.2148535041822524]
We propose a generic method to extract sparse features from raw images with few trainable parameters. We show that the proposed approach provides similar or, in some cases, even better performances with fewer trainable parameters while being conceptually simpler.
arXiv Detail & Related papers (2022-06-14T09:34:34Z)
UltraSR: Spatial Encoding is a Missing Key for Implicit Image Function-based Arbitrary-Scale Super-Resolution [74.82282301089994]
In this work, we propose UltraSR, a simple yet effective new network design based on implicit image functions. We show that spatial encoding is indeed a missing key towards the next-stage high-accuracy implicit image function. Our UltraSR sets new state-of-the-art performance on the DIV2K benchmark under all super-resolution scales.
arXiv Detail & Related papers (2021-03-23T17:36:42Z)
An End-to-end Method for Producing Scanning-robust Stylized QR Codes [45.35370585928748]
We propose a novel end-to-end method, named ArtCoder, to generate stylized QR codes. The experimental results show that our stylized QR codes are of high quality in both visual effect and scanning robustness.
arXiv Detail & Related papers (2020-11-16T09:38:27Z)
LinksIQ: Robust and Efficient Modulation Recognition with Imperfect Spectrum Scans [14.27482188246212]
LinksIQ bridges the gap between real-world spectrum sensing and modrec methods designed under simplifying assumptions. Our key insight is that ordered IQ samples form distinctive patterns across modulations, which persist even with scan deficiencies. Our results demonstrate the feasibility of low-cost transmitter fingerprinting at scale.
arXiv Detail & Related papers (2020-05-07T12:16:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.