Spectral Dictionary Learning for Generative Image Modeling
- URL: http://arxiv.org/abs/2504.17804v1
- Date: Mon, 21 Apr 2025 01:11:17 GMT
- Title: Spectral Dictionary Learning for Generative Image Modeling
- Authors: Andrew Kiruluta
- Abstract summary: We propose a novel spectral generative model for image synthesis. Images are reconstructed as linear combinations of a set of learned spectral basis functions. We show that our approach achieves competitive performance in terms of reconstruction quality and perceptual fidelity.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel spectral generative model for image synthesis that departs radically from the common variational, adversarial, and diffusion paradigms. In our approach, images, after being flattened into one-dimensional signals, are reconstructed as linear combinations of a set of learned spectral basis functions, where each basis is explicitly parameterized in terms of frequency, phase, and amplitude. The model jointly learns a global spectral dictionary with time-varying modulations and per-image mixing coefficients that quantify the contributions of each spectral component. Subsequently, a simple probabilistic model is fitted to these mixing coefficients, enabling the deterministic generation of new images by sampling from the latent space. This framework leverages deterministic dictionary learning, offering a highly interpretable and physically meaningful representation compared to methods relying on stochastic inference or adversarial training. Moreover, the incorporation of frequency-domain loss functions, computed via the short-time Fourier transform (STFT), ensures that the synthesized images capture both global structure and fine-grained spectral details, such as texture and edge information. Experimental evaluations on the CIFAR-10 benchmark demonstrate that our approach not only achieves competitive performance in terms of reconstruction quality and perceptual fidelity but also offers improved training stability and computational efficiency. This new type of generative model opens up promising avenues for controlled synthesis, as the learned spectral dictionary affords a direct handle on the intrinsic frequency content of the images, thus providing enhanced interpretability and potential for novel applications in image manipulation and analysis.
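The core idea in the abstract (flattened images written as linear combinations of sinusoidal atoms, followed by a simple probabilistic model over the mixing coefficients) can be illustrated with a minimal NumPy sketch. This is not the author's implementation: the function names are hypothetical, the dictionary is fixed rather than jointly learned, the time-varying modulations and the STFT loss are omitted, the mixing coefficients are obtained by ordinary least squares, and a single Gaussian is assumed as the "simple probabilistic model".

```python
import numpy as np

def spectral_dictionary(freqs, phases, amps, length):
    """Build a (K, length) dictionary; atom k is amps[k]*sin(2*pi*freqs[k]*t + phases[k])."""
    t = np.arange(length) / length  # normalized sample positions in [0, 1)
    return amps[:, None] * np.sin(2 * np.pi * freqs[:, None] * t[None, :] + phases[:, None])

def fit_coefficients(signals, dictionary):
    """Least-squares mixing coefficients (assumption: the paper learns these by training).

    signals: (N, length), dictionary: (K, length) -> coefficients: (N, K)
    """
    coeffs, *_ = np.linalg.lstsq(dictionary.T, signals.T, rcond=None)
    return coeffs.T

def sample_new_signals(coeffs, dictionary, n_samples, rng):
    """Fit one Gaussian to the coefficients (an assumed model) and decode new samples."""
    mean = coeffs.mean(axis=0)
    cov = np.cov(coeffs, rowvar=False)
    new_coeffs = rng.multivariate_normal(mean, cov, size=n_samples)
    return new_coeffs @ dictionary  # decoding itself is a deterministic linear map

# Toy data: 16 "flattened images" of length 64 built from 8 sinusoidal atoms.
rng = np.random.default_rng(0)
length, n_atoms = 64, 8
freqs = np.arange(1, n_atoms + 1, dtype=float)
phases = np.zeros(n_atoms)
amps = np.ones(n_atoms)

D = spectral_dictionary(freqs, phases, amps, length)
true_coeffs = rng.normal(size=(16, n_atoms))
signals = true_coeffs @ D

est_coeffs = fit_coefficients(signals, D)
reconstruction = est_coeffs @ D
generated = sample_new_signals(est_coeffs, D, n_samples=4, rng=rng)
```

In this toy setting the signals lie exactly in the span of the dictionary, so the least-squares fit recovers the coefficients and the reconstruction matches the input up to floating-point error. A real 32x32 CIFAR-10 channel would flatten to a signal of length 1024 and would not be exactly representable by a small dictionary.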
Related papers
- A Hybrid Wavelet-Fourier Method for Next-Generation Conditional Diffusion Models [0.0]
We present a novel generative modeling framework, Wavelet-Fourier-Diffusion, which adapts the diffusion paradigm to hybrid frequency representations. We show how the hybrid frequency-based representation improves control over global coherence and fine texture synthesis.
arXiv Detail & Related papers (2025-04-04T17:11:04Z)
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method [60.88467353578118]
We show that a fixed-point-inspired iterative approach to invert real-world images does not achieve convergence, instead oscillating between distinct clusters.
We introduce a simple and fast distribution transfer technique that facilitates image enhancement, stroke-based recoloring, as well as visual prompt-guided image editing.
arXiv Detail & Related papers (2024-11-17T17:45:37Z)
- Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile [15.5188527312094]
We propose a framework to mitigate the disparity in frequency domain of the generated images.
This is realized by spectrum translation for the refinement of image generation (STIG) based on contrastive learning.
We evaluate our framework across eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG.
arXiv Detail & Related papers (2024-03-08T06:39:24Z)
- DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model [18.25548360119976]
This paper endeavors to advance the precision of snapshot compressive imaging (SCI) reconstruction for multispectral images (MSI).
We propose a novel structured zero-shot diffusion model, dubbed DiffSCI.
We present extensive testing to show that DiffSCI exhibits discernible performance enhancements over prevailing self-supervised and zero-shot approaches.
arXiv Detail & Related papers (2023-11-19T20:27:14Z)
- Reconstruction of compressed spectral imaging based on global structure and spectral correlation [17.35611893815407]
The proposed method uses the convolution kernel to operate the global image.
To solve the problem that convolutional sparse coding is insensitive to low frequency, the global total-variation (TV) constraint is added.
The proposed method improves the reconstruction quality by up to 7 dB in PSNR and 10% in SSIM.
arXiv Detail & Related papers (2022-10-27T14:31:02Z)
- Semantic Image Synthesis via Diffusion Models [174.24523061460704]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks. Recent work on semantic image synthesis mainly follows the de facto GAN-based approaches. We propose a novel framework based on DDPM for semantic image synthesis.
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Regularization by Denoising Sub-sampled Newton Method for Spectral CT Multi-Material Decomposition [78.37855832568569]
We propose to solve a model-based maximum a posteriori problem to reconstruct multi-material images with application to spectral CT.
In particular, we propose to solve a regularized optimization problem based on a plug-in image-denoising function.
We show numerical and experimental results for spectral CT materials decomposition.
arXiv Detail & Related papers (2021-03-25T15:20:10Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Kullback-Leibler Divergence-Based Fuzzy $C$-Means Clustering Incorporating Morphological Reconstruction and Wavelet Frames for Image Segmentation [152.609322951917]
We come up with a Kullback-Leibler (KL) divergence-based Fuzzy C-Means (FCM) algorithm by incorporating a tight wavelet frame transform and a morphological reconstruction operation.
The proposed algorithm works well and comes with better segmentation performance than other comparative algorithms.
arXiv Detail & Related papers (2020-02-21T05:19:10Z)
- Residual-Sparse Fuzzy $C$-Means Clustering Incorporating Morphological Reconstruction and Wavelet frames [146.63177174491082]
Fuzzy $C$-Means (FCM) algorithm incorporates a morphological reconstruction operation and a tight wavelet frame transform.
We present an improved FCM algorithm by imposing an $\ell_0$ regularization term on the residual between the feature set and its ideal value.
Experimental results reported for synthetic, medical, and color images show that the proposed algorithm is effective and efficient, and outperforms other algorithms.
arXiv Detail & Related papers (2020-02-14T10:00:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.