Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
- URL: http://arxiv.org/abs/2506.01331v1
- Date: Mon, 02 Jun 2025 05:19:40 GMT
- Title: Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
- Authors: Jinjin Zhang, Qiuyu Huang, Junjie Liu, Xiefan Guo, Di Huang
- Abstract summary: The Aesthetic-4K dataset is curated for comprehensive research on ultra-high-resolution image synthesis. Diffusion-4K is an innovative framework for the direct generation of ultra-high-resolution images.
- Score: 21.46605047406198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ultra-high-resolution image synthesis holds significant potential, yet remains an underexplored challenge due to the absence of standardized benchmarks and computational constraints. In this paper, we establish Aesthetic-4K, a meticulously curated dataset containing dedicated training and evaluation subsets specifically designed for comprehensive research on ultra-high-resolution image synthesis. This dataset consists of high-quality 4K images accompanied by descriptive captions generated by GPT-4o. Furthermore, we propose Diffusion-4K, an innovative framework for the direct generation of ultra-high-resolution images. Our approach incorporates the Scale Consistent Variational Auto-Encoder (SC-VAE) and Wavelet-based Latent Fine-tuning (WLF), which are designed for efficient visual token compression and the capture of intricate details in ultra-high-resolution images, thereby facilitating direct training with photorealistic 4K data. This method is applicable to various latent diffusion models and demonstrates its efficacy in synthesizing highly detailed 4K images. Additionally, we propose novel metrics, namely the GLCM Score and Compression Ratio, to assess the texture richness and fine details in local patches, in conjunction with holistic measures such as FID, Aesthetics, and CLIPScore, enabling a thorough and multifaceted evaluation of ultra-high-resolution image synthesis. Consequently, Diffusion-4K achieves impressive performance in ultra-high-resolution image synthesis, particularly when powered by state-of-the-art large-scale diffusion models (e.g., Flux-12B). The source code is publicly available at https://github.com/zhang0jhon/diffusion-4k.
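As a rough illustration of the patch-level metrics described in the abstract, the sketch below computes a GLCM-based texture statistic and a compression-based detail proxy over local patches. The patch size, the choice of GLCM contrast, the use of lossless zlib compression, and the helper name `patch_texture_metrics` are assumptions for illustration only, not the paper's exact formulation.

```python
# Minimal sketch of patch-level texture metrics in the spirit of the paper's
# GLCM Score and Compression Ratio. The exact formulations (patch size, GLCM
# property, compression codec) are assumptions, not the authors' implementation.
import zlib
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import graycomatrix, graycoprops
from skimage.util import view_as_blocks


def patch_texture_metrics(image: np.ndarray, patch: int = 256):
    """Mean GLCM contrast and zlib compression ratio over local patches.

    image: H x W x 3 uint8 RGB array (e.g., a decoded 4K image).
    Returns (glcm_score, compression_ratio); higher values suggest richer local texture.
    """
    gray = (rgb2gray(image) * 255).astype(np.uint8)
    h, w = gray.shape
    gray = gray[: h - h % patch, : w - w % patch]  # crop to a multiple of the patch size
    blocks = view_as_blocks(gray, (patch, patch)).reshape(-1, patch, patch)

    contrasts, ratios = [], []
    for block in blocks:
        # GLCM over 1-pixel offsets in 4 directions; contrast rises with fine detail.
        glcm = graycomatrix(block, distances=[1],
                            angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                            levels=256, symmetric=True, normed=True)
        contrasts.append(graycoprops(glcm, "contrast").mean())
        # Lossless compression as a detail proxy: highly detailed patches compress poorly,
        # so the compressed/raw size ratio stays closer to 1.
        raw = block.tobytes()
        ratios.append(len(zlib.compress(raw, 9)) / len(raw))

    return float(np.mean(contrasts)), float(np.mean(ratios))
```

Comparing these two values between generated and reference 4K images gives a quick, if coarse, read on relative texture richness in local patches, complementing holistic measures such as FID, Aesthetics, and CLIPScore.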
Related papers
- Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models [21.46605047406198]
Diffusion-4K is a novel framework for direct ultra-high-resolution image synthesis using text-to-image diffusion models. We construct Aesthetic-4K, a comprehensive benchmark for ultra-high-resolution image generation. We propose a wavelet-based fine-tuning approach for direct training with 4K images, applicable to various latent diffusion models.
arXiv Detail & Related papers (2025-03-24T05:25:07Z) - EG4D: Explicit Generation of 4D Object without Score Distillation [105.63506584772331]
EG4D is a novel framework that generates high-quality and consistent 4D assets without score distillation.
Our framework outperforms the baselines in generation quality by a considerable margin.
arXiv Detail & Related papers (2024-05-28T12:47:22Z) - Memory-efficient High-resolution OCT Volume Synthesis with Cascaded Amortized Latent Diffusion Models [48.87160158792048]
We introduce a cascaded amortized latent diffusion model (CA-LDM) that can synthesize high-resolution OCT volumes in a memory-efficient way.
Experiments on a public high-resolution OCT dataset show that our synthetic data have realistic high-resolution and global features, surpassing the capabilities of existing methods.
arXiv Detail & Related papers (2024-05-26T10:58:22Z) - Probabilistic-based Feature Embedding of 4-D Light Fields for Compressive Imaging and Denoising [62.347491141163225]
The 4-D light field (LF) poses great challenges for efficient and effective feature embedding.
We propose a probabilistic-based feature embedding (PFE), which learns a feature embedding architecture by assembling various low-dimensional convolution patterns.
Our experiments demonstrate the significant superiority of our methods on both real-world and synthetic 4-D LF images.
arXiv Detail & Related papers (2023-06-15T03:46:40Z) - 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions [19.380248980850727]
We present a novel and effective framework, named 4K-NeRF, to pursue high fidelity view synthesis on the challenging scenarios of ultra high resolutions.
We address the issue by exploring ray correlation to enhance high-frequency details recovery.
Our method can significantly boost rendering quality on high-frequency details compared with modern NeRF methods, and achieve the state-of-the-art visual quality on 4K ultra-high-resolution scenarios.
arXiv Detail & Related papers (2022-12-09T07:26:49Z) - Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoireing [71.62289021118983]
We present an efficient baseline model, ESDNet, for tackling 4K moiré images, wherein we build a semantic-aligned scale-aware module to address the scale variation of moiré patterns.
Our approach outperforms state-of-the-art methods by a large margin while being much more lightweight.
arXiv Detail & Related papers (2022-07-20T14:20:52Z) - OADAT: Experimental and Synthetic Clinical Optoacoustic Data for Standardized Image Processing [62.993663757843464]
Optoacoustic (OA) imaging is based on excitation of biological tissues with nanosecond-duration laser pulses followed by detection of ultrasound waves generated via light-absorption-mediated thermoelastic expansion.
OA imaging features a powerful combination between rich optical contrast and high resolution in deep tissues.
No standardized datasets generated with different types of experimental set-ups and associated processing methods are available to facilitate advances in broader applications of OA in clinical settings.
arXiv Detail & Related papers (2022-06-17T08:11:26Z) - High Quality Segmentation for Ultra High-resolution Images [72.97958314291648]
We propose the Continuous Refinement Model for the ultra high-resolution segmentation refinement task.
Our method is fast and effective for image segmentation refinement.
arXiv Detail & Related papers (2021-11-29T11:53:06Z)