Related papers: SDA-GAN: Unsupervised Image Translation Using Spectral Domain Attention-Guided Generative Adversarial Network

Related papers

Plasticine: A Traceable Diffusion Model for Medical Image Translation [79.39689106440389]
We propose Plasticine, to the best of our knowledge, the first end-to-end image-to-image translation framework explicitly designed with traceability as a core objective. Our method combines intensity translation and spatial transformation within a denoising diffusion framework. This design enables the generation of synthetic images with interpretable intensity transitions and spatially coherent deformations, supporting pixel-wise traceability throughout the translation process.
arXiv Detail & Related papers (2025-12-20T18:01:57Z)
GCRPNet: Graph-Enhanced Contextual and Regional Perception Network for Salient Object Detection in Optical Remote Sensing Images [68.33481681452675]
We propose a graph-enhanced contextual and regional perception network (GCRPNet). It builds upon the Mamba architecture to simultaneously capture long-range dependencies and enhance regional feature representation. It performs adaptive patch scanning on feature maps processed via multi-scale convolutions, thereby capturing rich local region information.
arXiv Detail & Related papers (2025-08-14T11:31:43Z)
Exploring Kernel Transformations for Implicit Neural Representations [57.2225355625268]
Implicit neural representations (INRs) leverage neural networks to represent signals by mapping coordinates to their corresponding attributes. This work pioneers the exploration of the effect of kernel transformation of input/output while keeping the model itself unchanged. A byproduct of our findings is a simple yet effective method that combines scale and shift to significantly boost INR with negligible overhead.
arXiv Detail & Related papers (2025-04-07T04:43:50Z)
FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system. We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain. To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB). We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z)
IPT-V2: Efficient Image Processing Transformer using Hierarchical Attentions [26.09373405194564]
We present an efficient image processing transformer architecture with hierarchical attentions, called IPT-V2. We adopt a focal context self-attention (FCSA) and a global grid self-attention (GGSA) to obtain adequate token interactions in local and global receptive fields. Our proposed IPT-V2 achieves state-of-the-art results on various image processing tasks, covering denoising, deblurring, and deraining, and obtains a much better trade-off between performance and computational complexity than previous methods.
arXiv Detail & Related papers (2024-03-31T10:01:20Z)
Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile [15.5188527312094]
We propose a framework to mitigate the disparity in the frequency domain of generated images. This is realized by spectrum translation for the refinement of image generation (STIG) based on contrastive learning. We evaluate our framework across eight fake image datasets and various cutting-edge models to demonstrate the effectiveness of STIG.
arXiv Detail & Related papers (2024-03-08T06:39:24Z)
In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model. We make comprehensive analyses of the effects of the encoder structure, the starting inversion point, and the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
ESSAformer: Efficient Transformer for Hyperspectral Image Super-resolution [76.7408734079706]
Single hyperspectral image super-resolution (single-HSI-SR) aims to restore a high-resolution hyperspectral image from a low-resolution observation. We propose ESSAformer, an ESSA attention-embedded Transformer network for single-HSI-SR with an iterative refining structure.
arXiv Detail & Related papers (2023-07-26T07:45:14Z)
Spectral Normalization and Dual Contrastive Regularization for Image-to-Image Translation [9.029227024451506]
We propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization. We conduct comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results prove that our method achieves SOTA in multiple tasks.
arXiv Detail & Related papers (2023-04-22T05:22:24Z)
TcGAN: Semantic-Aware and Structure-Preserved GANs with Individual Vision Transformer for Fast Arbitrary One-Shot Image Generation [11.207512995742999]
One-shot image generation (OSG) with generative adversarial networks that learn from the internal patches of a given image has attracted worldwide attention. We propose a novel structure-preserved method, TcGAN, with an individual vision transformer to overcome the shortcomings of existing one-shot image generation methods.
arXiv Detail & Related papers (2023-02-16T03:05:59Z)
Demystify Transformers & Convolutions in Modern Image Deep Networks [82.32018252867277]
This paper aims to identify the real gains of popular convolution and attention operators through a detailed study. We find that the key difference among these feature transformation modules, such as attention or convolution, lies in their spatial feature aggregation approach. Our experiments on various tasks and an analysis of inductive bias show a significant performance boost due to advanced network-level and block-level designs.
arXiv Detail & Related papers (2022-11-10T18:59:43Z)
Deep Fourier Up-Sampling [100.59885545206744]
Up-sampling in the Fourier domain is more challenging as it does not follow such a local property. We propose a theoretically sound Deep Fourier Up-Sampling (FourierUp) to solve these issues.
arXiv Detail & Related papers (2022-10-11T06:17:31Z)
Transformer-based SAR Image Despeckling [53.99620005035804]
We introduce a transformer-based network for SAR image despeckling. The proposed despeckling network comprises a transformer-based encoder, which allows the network to learn global dependencies between different image regions. Experiments show that the proposed method achieves significant improvements over traditional and convolutional neural network-based despeckling methods.
arXiv Detail & Related papers (2022-01-23T20:09:01Z)
Investigating Expressiveness of Transformer in Spectral Domain for Graphs [6.092217185687028]
We study and prove the link between the spatial and spectral domain in the realm of the transformer. We propose FeTA, a framework that aims to perform attention over the entire graph spectrum analogous to the attention in spatial space.
arXiv Detail & Related papers (2022-01-23T18:03:22Z)
SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition [48.676064155070556]
Arbitrary text appearance poses a great challenge in scene text recognition tasks. We introduce a new learnable geometric-unrelated module, the Structure-Preserving Inner Offset Network (SPIN). SPIN allows the color manipulation of source data within the network.
arXiv Detail & Related papers (2020-05-27T01:47:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.