SCSNet: An Efficient Paradigm for Learning Simultaneously Image
Colorization and Super-Resolution
- URL: http://arxiv.org/abs/2201.04364v1
- Date: Wed, 12 Jan 2022 08:59:12 GMT
- Title: SCSNet: An Efficient Paradigm for Learning Simultaneously Image
Colorization and Super-Resolution
- Authors: Jiangning Zhang, Chao Xu, Jian Li, Yue Han, Yabiao Wang, Ying Tai and
Yong Liu
- Abstract summary: We present an efficient paradigm to perform Simultaneously Image Colorization and Super-resolution (SCS).
The proposed method includes a colorization branch for learning color information, which employs the proposed plug-and-play \emph{Pyramid Valve Cross Attention} (PVCAttn) module.
Our SCSNet supports both automatic and referential modes, which makes it more flexible for practical applications.
- Score: 39.77987463287673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the practical application of restoring low-resolution gray-scale images,
we generally need to run three separate processes: image colorization,
super-resolution, and down-sampling for the target device. However,
this pipeline is redundant and inefficient because the processes run independently,
even though some inner features could be shared. Therefore, we present an efficient
paradigm to perform {S}imultaneously Image {C}olorization and
{S}uper-resolution (SCS) and propose an end-to-end SCSNet to achieve this goal.
The proposed method consists of two parts: a colorization branch for learning
color information, which employs the proposed plug-and-play \emph{Pyramid Valve
Cross Attention} (PVCAttn) module to aggregate feature maps between source and
reference images; and a super-resolution branch for integrating color and texture
information to predict target images, which uses the designed \emph{Continuous
Pixel Mapping} (CPM) module to predict high-resolution images at continuous
magnification. Furthermore, our SCSNet supports both automatic and referential
modes, which is more flexible for practical applications. Extensive experiments
demonstrate the superiority of our method in generating authentic images over
state-of-the-art methods, e.g., decreasing FID on average by 1.8$\downarrow$ and
5.1$\downarrow$ compared with the current best scores for automatic and
referential modes, respectively, while using fewer parameters (more than
$\times$2$\downarrow$) and faster running speed (more than
$\times$3$\uparrow$).
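The reference-guided aggregation idea behind PVCAttn can be illustrated as a generic cross-attention step: gray-scale source features query color reference features and aggregate them. This is a minimal sketch of plain dot-product cross-attention, not the authors' PVCAttn module (which additionally uses pyramid features and valve gating); all names here are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(source, reference):
    """Aggregate reference features into source features.

    source:    (N_s, d) features from the gray-scale input
    reference: (N_r, d) features from the color reference
    returns:   (N_s, d) color-aware features, one per source position
    """
    d = source.shape[-1]
    # similarity between every source position and every reference position
    attn = softmax(source @ reference.T / np.sqrt(d), axis=-1)  # (N_s, N_r)
    # each output row is a convex combination of reference rows
    return attn @ reference
```

Because each output row is a convex combination of reference rows, every output value stays within the range of the corresponding reference feature channel.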
Related papers
- Pixel Adapter: A Graph-Based Post-Processing Approach for Scene Text
Image Super-Resolution [22.60056946339325]
We propose the Pixel Adapter Module (PAM) based on graph attention to address pixel distortion caused by upsampling.
The PAM effectively captures local structural information by allowing each pixel to interact with its neighbors and update features.
We demonstrate that our proposed method generates high-quality super-resolution images, surpassing existing methods in recognition accuracy.
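The pixel-neighbor interaction that PAM performs with graph attention can be sketched with plain dot-product attention over each pixel's 4-neighborhood (plus itself). This is a hypothetical, unlearned sketch of the idea, not the paper's module, which uses learned graph-attention weights.

```python
import numpy as np

def neighbor_attention(feat):
    """feat: (H, W, d) feature map. Each pixel attends to itself and its
    4-neighbors with dot-product similarity weights; returns (H, W, d)."""
    H, W, d = feat.shape
    out = np.zeros_like(feat)
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    for i in range(H):
        for j in range(W):
            # gather in-bounds neighbors (including the pixel itself)
            nbrs = np.stack([feat[i + di, j + dj] for di, dj in offsets
                             if 0 <= i + di < H and 0 <= j + dj < W])
            logits = nbrs @ feat[i, j] / np.sqrt(d)   # (k,)
            w = np.exp(logits - logits.max())
            w /= w.sum()                              # softmax weights
            out[i, j] = w @ nbrs                      # weighted aggregation
    return out
```

Since the weights sum to one, a constant feature map passes through unchanged; the update only redistributes information between dissimilar neighbors.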
arXiv Detail & Related papers (2023-09-16T08:12:12Z)
- Raising The Limit Of Image Rescaling Using Auxiliary Encoding [7.9700865143145485]
Recently, image rescaling models like IRN utilize the bidirectional nature of invertible neural networks (INNs) to push the performance limit of image upscaling.
We propose auxiliary encoding modules to further push the limit of image rescaling performance.
arXiv Detail & Related papers (2023-03-12T20:49:07Z)
- Spatially-Adaptive Feature Modulation for Efficient Image
Super-Resolution [90.16462805389943]
We develop a spatially-adaptive feature modulation (SAFM) mechanism upon a vision transformer (ViT)-like block.
The proposed method is $3\times$ smaller than state-of-the-art efficient SR methods.
arXiv Detail & Related papers (2023-02-27T14:19:31Z)
- Adaptive Local Implicit Image Function for Arbitrary-scale
Super-resolution [61.95533972380704]
The local implicit image function (LIIF) represents an image as a continuous function whose pixel values are predicted from the corresponding coordinates given as inputs.
LIIF can be adopted for arbitrary-scale image super-resolution tasks, resulting in a single effective and efficient model for various up-scaling factors.
We propose a novel adaptive local implicit image function (A-LIIF) to alleviate this problem.
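The coordinate-as-input view used by LIIF can be sketched as querying a latent feature grid at arbitrary normalized coordinates: each query decodes a value from the nearest latent vector plus its relative offset inside that cell. A minimal sketch assuming an arbitrary decoder callable; real LIIF uses a learned MLP with local ensembling and cell decoding, which this omits.

```python
import numpy as np

def liif_query(latent, coords, decoder):
    """latent:  (H, W, d) feature grid from an encoder
    coords:  (N, 2) query points (x, y) in [0, 1]^2
    decoder: any callable mapping a (d + 2,) vector to a (c,) output
    returns: (N, c) decoded values, one per query coordinate"""
    H, W, d = latent.shape
    out = []
    for x, y in coords:
        # index of the latent cell containing the query
        i = min(int(y * H), H - 1)
        j = min(int(x * W), W - 1)
        # relative offset of the query from the cell center
        dy = y * H - (i + 0.5)
        dx = x * W - (j + 0.5)
        out.append(decoder(np.concatenate([latent[i, j], [dx, dy]])))
    return np.stack(out)
```

Because queries can be placed at any density, one model serves all up-scaling factors: querying a $2H \times 2W$ grid of coordinates yields a $2\times$ upscale, a $3H \times 3W$ grid a $3\times$ upscale, and so on.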
arXiv Detail & Related papers (2022-08-07T11:23:23Z)
- Curvature-guided dynamic scale networks for Multi-view Stereo [10.667165962654996]
This paper focuses on learning a robust feature extraction network to enhance the performance of matching costs without heavy computation.
We present a dynamic scale feature extraction network, namely, CDSFNet.
It is composed of multiple novel convolution layers, each of which can select a proper patch scale for each pixel guided by the normal curvature of the image surface.
arXiv Detail & Related papers (2021-12-11T14:41:05Z)
- Saliency Enhancement using Superpixel Similarity [77.34726150561087]
Salient Object Detection (SOD) has several applications in image analysis.
Deep-learning-based SOD methods are among the most effective, but they may miss foreground parts with similar colors.
We introduce a post-processing method named \textit{Saliency Enhancement over Superpixel Similarity} (SESS).
We demonstrate that SESS can consistently and considerably improve the results of three deep-learning-based SOD methods on five image datasets.
arXiv Detail & Related papers (2021-12-01T17:22:54Z)
- Universal Deep Network for Steganalysis of Color Image based on Channel
Representation [41.86330101334733]
We design a universal color image steganalysis network (called UCNet) in spatial and JPEG domains.
The proposed method includes preprocessing, convolutional, and classification modules.
We conduct extensive experiments on ALASKA II to demonstrate that the proposed method can achieve state-of-the-art results.
arXiv Detail & Related papers (2021-11-24T02:22:13Z)
- Spatial-Separated Curve Rendering Network for Efficient and
High-Resolution Image Harmonization [59.19214040221055]
We propose a novel spatial-separated curve rendering network (S$^2$CRNet) for efficient and high-resolution image harmonization.
The proposed method reduces more than 90% parameters compared with previous methods.
Our method works smoothly on higher-resolution images in real time and is more than 10$\times$ faster than existing methods.
arXiv Detail & Related papers (2021-09-13T07:20:16Z)
- Self-Supervised Representation Learning for RGB-D Salient Object
Detection [93.17479956795862]
We use Self-Supervised Representation Learning to design two pretext tasks: the cross-modal auto-encoder and the depth-contour estimation.
Our pretext tasks require only a few unlabeled RGB-D datasets for pre-training, which makes the network capture rich semantic contexts.
For the inherent problem of cross-modal fusion in RGB-D SOD, we propose a multi-path fusion module.
arXiv Detail & Related papers (2021-01-29T09:16:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.