Increasing diversity of omni-directional images generated from single image using cGAN based on MLPMixer
- URL: http://arxiv.org/abs/2309.08129v1
- Date: Fri, 15 Sep 2023 03:43:29 GMT
- Title: Increasing diversity of omni-directional images generated from single image using cGAN based on MLPMixer
- Authors: Atsuya Nakata, Ryuto Miyazaki, Takao Yamanaka
- Abstract summary: Previous methods have relied on generative adversarial networks (GANs) based on convolutional neural networks (CNNs). The MLPMixer has been proposed as an alternative to self-attention in transformers, capturing long-range dependencies and contextual information. As a result, competitive performance is achieved with reduced memory consumption and computational cost.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a novel approach to generating omni-directional images from a single snapshot picture. The previous method relied on generative adversarial networks (GANs) based on convolutional neural networks (CNNs). Although that method successfully generated omni-directional images, CNNs have two drawbacks for this task. First, since a convolutional layer only processes a local area, it is difficult to propagate information from the input snapshot picture, embedded at the center of the omni-directional image, to the edges of the image. Thus, omni-directional images created by a CNN-based generator tend to have less diversity at the edges, producing similar scene images. Second, a CNN-based model requires large video memory on graphics processing units due to the deep structure needed in CNNs, since shallow networks only receive signals from a limited receptive field. To solve these problems, an MLPMixer-based method is proposed in this paper. The MLPMixer was introduced as an alternative to self-attention in transformers, capturing long-range dependencies and contextual information; this enables efficient information propagation in the omni-directional image generation task. As a result, competitive performance is achieved with reduced memory consumption and computational cost, in addition to increased diversity of the generated omni-directional images.
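To illustrate why token mixing helps here: in an MLPMixer block, one MLP operates across all patch positions at once, so a patch at the image center can influence edge patches in a single step, whereas a convolution only sees a local neighborhood. Below is a minimal NumPy sketch of one MLPMixer block; the shapes, initialization, and parameter layout are illustrative assumptions (not the paper's actual generator), and layer normalization is omitted for brevity.

```python
import numpy as np

def mlp(x, w1, b1, w2, b2):
    # Two-layer perceptron with a tanh-based GELU approximation.
    h = x @ w1 + b1
    h = 0.5 * h * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (h + 0.044715 * h ** 3)))
    return h @ w2 + b2

def mixer_block(x, params):
    """One MLPMixer block on x of shape (patches, channels).

    The token-mixing MLP acts across the patch axis, so information from a
    center patch reaches every edge patch in one block; the channel-mixing
    MLP then acts per patch. (Layer norm omitted for brevity.)
    """
    y = x + mlp(x.T, *params["token"]).T    # token mixing (across patches)
    return y + mlp(y, *params["channel"])   # channel mixing (per patch)

rng = np.random.default_rng(0)
P, C, H = 16, 8, 32  # illustrative sizes: patches, channels, hidden width
params = {
    "token":   (rng.normal(0, 0.02, (P, H)), np.zeros(H),
                rng.normal(0, 0.02, (H, P)), np.zeros(P)),
    "channel": (rng.normal(0, 0.02, (C, H)), np.zeros(H),
                rng.normal(0, 0.02, (H, C)), np.zeros(C)),
}
out = mixer_block(rng.normal(size=(P, C)), params)
```

Because the token-mixing weight matrices have shape (P, H) and (H, P) rather than a fixed kernel size, memory grows with the number of patches instead of with network depth, which is consistent with the reduced-memory claim above.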
Related papers
- Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Image Generation with Self Pixel-wise Normalization [17.147675335268282]
Region-adaptive normalization (RAN) methods have been widely used in the generative adversarial network (GAN)-based image-to-image translation technique.
This paper presents a novel normalization method, called self pixel-wise normalization (SPN), which effectively boosts the generative performance by performing the pixel-adaptive affine transformation without the mask image.
arXiv Detail & Related papers (2022-01-26T03:14:31Z)
- Restormer: Efficient Transformer for High-Resolution Image Restoration [118.9617735769827]
Convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data.
Transformers have shown significant performance gains on natural language and high-level vision tasks.
Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks.
arXiv Detail & Related papers (2021-11-18T18:59:10Z) - OSLO: On-the-Sphere Learning for Omnidirectional images and its
application to 360-degree image compression [59.58879331876508]
We study the learning of representation models for omnidirectional images and propose to use the properties of HEALPix uniform sampling of the sphere to redefine the mathematical tools used in deep learning models for omnidirectional images.
Our proposed on-the-sphere solution leads to a better compression gain that can save 13.7% of the bit rate compared to similar learned models applied to equirectangular images.
arXiv Detail & Related papers (2021-07-19T22:14:30Z)
- ResMLP: Feedforward networks for image classification with data-efficient training [73.26364887378597]
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification.
We will share our code based on the Timm library and pre-trained models.
arXiv Detail & Related papers (2021-05-07T17:31:44Z)
- Adaptive Multiplane Image Generation from a Single Internet Picture [1.8961324344454253]
We address the problem of generating a multiplane image (MPI) from a single high-resolution picture.
We propose an adaptive slicing algorithm that produces an MPI with a variable number of image planes.
We show that our method is capable of producing high-quality predictions with one order of magnitude less parameters compared to previous approaches.
arXiv Detail & Related papers (2020-11-26T14:35:05Z)
- Adversarial Generation of Continuous Images [31.92891885615843]
In this paper, we propose two novel architectural techniques for building INR-based image decoders.
We use them to build a state-of-the-art continuous image GAN.
Our proposed INR-GAN architecture improves the performance of continuous image generators by several times.
arXiv Detail & Related papers (2020-11-24T11:06:40Z)
- Locally Masked Convolution for Autoregressive Models [107.4635841204146]
LMConv is a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image.
We learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation.
arXiv Detail & Related papers (2020-06-22T17:59:07Z)
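The LMConv idea above (applying a different weight mask at each spatial location of a shared 2D kernel) can be sketched naively as follows. This is an illustrative single-channel, valid-padding sketch under my own assumptions, not the paper's optimized implementation; the function name and mask layout are hypothetical.

```python
import numpy as np

def locally_masked_conv2d(img, kernel, masks):
    """Naive locally masked 2D convolution (single channel, valid padding).

    Unlike a standard convolution, a separate binary mask is applied to the
    shared kernel weights at every output location, so each position can
    hide "future" pixels for a chosen autoregressive generation order.
    img:    (H, W) input image
    kernel: (k, k) shared weights
    masks:  (H-k+1, W-k+1, k, k) per-location binary masks
    """
    k = kernel.shape[0]
    oh, ow = img.shape[0] - k + 1, img.shape[1] - k + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = img[i:i + k, j:j + k]
            # Mask the shared kernel differently at each location.
            out[i, j] = np.sum(patch * kernel * masks[i, j])
    return out

img = np.arange(9, dtype=float).reshape(3, 3)
kernel = np.ones((2, 2))
masks = np.ones((2, 2, 2, 2))   # all-ones masks reduce to a plain convolution
masks[:, :, 1, 1] = 0.0         # hide the bottom-right ("future") pixel everywhere
out = locally_masked_conv2d(img, kernel, masks)
```

Changing which kernel entries are zeroed per location is what lets a single parameter-sharing network support the ensemble of generation orders described above.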
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.