Hypercomplex Image-to-Image Translation
- URL: http://arxiv.org/abs/2205.02087v1
- Date: Wed, 4 May 2022 14:28:50 GMT
- Title: Hypercomplex Image-to-Image Translation
- Authors: Eleonora Grassucci, Luigi Sigillo, Aurelio Uncini, Danilo Comminiello
- Abstract summary: Image-to-image translation (I2I) aims at transferring the content representation from an input domain to an output one.
Recent I2I generative models, which gain outstanding results in this task, comprise a set of diverse deep networks each with tens of million parameters.
We propose to leverage hypercomplex algebra properties to define lightweight I2I generative models capable of preserving pre-existing relations among image dimensions.
- Score: 13.483068375377362
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-to-image translation (I2I) aims at transferring the content
representation from an input domain to an output one, bouncing along different
target domains. Recent I2I generative models, which gain outstanding results in
this task, comprise a set of diverse deep networks each with tens of million
parameters. Moreover, images are usually three-dimensional being composed of
RGB channels and common neural models do not take dimensions correlation into
account, losing beneficial information. In this paper, we propose to leverage
hypercomplex algebra properties to define lightweight I2I generative models
capable of preserving pre-existing relations among image dimensions, thus
exploiting additional input information. On manifold I2I benchmarks, we show
how the proposed Quaternion StarGANv2 and parameterized hypercomplex StarGANv2
(PHStarGANv2) reduce parameters and storage memory amount while ensuring high
domain translation performance and good image quality as measured by FID and
LPIPS scores. Full code is available at: https://github.com/ispamm/HI2I.
Related papers
- OminiControl: Minimal and Universal Control for Diffusion Transformer [68.3243031301164]
OminiControl is a framework that integrates image conditions into pre-trained Diffusion Transformer (DiT) models.
At its core, OminiControl leverages a parameter reuse mechanism, enabling the DiT to encode image conditions using itself as a powerful backbone.
OminiControl addresses a wide range of image conditioning tasks in a unified manner, including subject-driven generation and spatially-aligned conditions.
arXiv Detail & Related papers (2024-11-22T17:55:15Z) - Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack.
General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
arXiv Detail & Related papers (2024-07-02T00:45:21Z) - The R2D2 deep neural network series paradigm for fast precision imaging in radio astronomy [1.7249361224827533]
Recent image reconstruction techniques have remarkable capability for imaging precision, well beyond CLEAN's capability.
We introduce a novel deep learning approach, dubbed "Residual-to-Residual DNN series for high-Dynamic range imaging"
R2D2's capability to deliver high precision is demonstrated in simulation, across a variety image observation settings using the Very Large Array (VLA)
arXiv Detail & Related papers (2024-03-08T16:57:54Z) - Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z) - Dual Aggregation Transformer for Image Super-Resolution [92.41781921611646]
We propose a novel Transformer model, Dual Aggregation Transformer, for image SR.
Our DAT aggregates features across spatial and channel dimensions, in the inter-block and intra-block dual manner.
Our experiments show that our DAT surpasses current methods.
arXiv Detail & Related papers (2023-08-07T07:39:39Z) - Deep Axial Hypercomplex Networks [1.370633147306388]
Recent works make it possible to improve representational capabilities by using hypercomplex-inspired networks.
This paper reduces this cost by factorizing a quaternion 2D convolutional module into two consecutive vectormap 1D convolutional modules.
Incorporating both yields our proposed hypercomplex network, a novel architecture that can be assembled to construct deep axial-hypercomplex networks.
arXiv Detail & Related papers (2023-01-11T18:31:00Z) - A Dual Neighborhood Hypergraph Neural Network for Change Detection in
VHR Remote Sensing Images [12.222830717774118]
A dual neighborhood hypergraph neural network is proposed in this article.
The proposed method comprises better effectiveness and robustness compared to many state-of-the-art methods.
arXiv Detail & Related papers (2022-02-27T02:39:08Z) - Adversarial Generation of Continuous Images [31.92891885615843]
In this paper, we propose two novel architectural techniques for building INR-based image decoders.
We use them to build a state-of-the-art continuous image GAN.
Our proposed INR-GAN architecture improves the performance of continuous image generators by several times.
arXiv Detail & Related papers (2020-11-24T11:06:40Z) - Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z) - Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.