SAIR: Learning Semantic-aware Implicit Representation
- URL: http://arxiv.org/abs/2310.09285v1
- Date: Fri, 13 Oct 2023 17:52:16 GMT
- Title: SAIR: Learning Semantic-aware Implicit Representation
- Authors: Canyu Zhang, Xiaoguang Li, Qing Guo, Song Wang
- Abstract summary: Implicit representation of an image can map arbitrary coordinates in the continuous domain to their corresponding color values.
Existing implicit representation approaches only focus on building continuous appearance mapping.
We learn a semantic-aware implicit representation (SAIR); that is, we make the implicit representation of each pixel rely on both its appearance and semantic information.
- Score: 23.842761556556216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Implicit representation of an image can map arbitrary coordinates in the
continuous domain to their corresponding color values, presenting a powerful
capability for image reconstruction. Nevertheless, existing implicit
representation approaches focus only on building a continuous appearance mapping,
ignoring the continuity of semantic information across pixels. As a result, they
can hardly achieve the desired reconstruction when the semantic information within
input images is corrupted, for example, when a large region is missing. To address
this issue, we propose to learn a semantic-aware
implicit representation (SAIR), that is, we make the implicit representation of
each pixel rely on both its appearance and semantic information (e.g., which
object the pixel belongs to). To this end, we propose a framework with two
modules: (1) building a semantic implicit representation (SIR) for a corrupted
image whose large regions are missing. Given an arbitrary coordinate in the
continuous domain, we can obtain its respective text-aligned embedding indicating
the object the pixel belongs to. (2) building an appearance implicit representation
(AIR) based on the SIR. Given an arbitrary coordinate in the continuous domain,
we can reconstruct its color whether or not the pixel is missing in the input.
We validate the novel semantic-aware implicit representation method on the
image inpainting task, and the extensive experiments demonstrate that our
method surpasses state-of-the-art approaches by a significant margin.
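The abstract only outlines the two-module design, so the following is a minimal sketch of how an SIR and an AIR could be chained, assuming simple MLP decoders over coordinates and encoder features. All class names, dimensions, and layer choices here are illustrative assumptions, not the paper's actual architecture.

```python
# A minimal sketch, not the paper's implementation: SIR maps a continuous
# coordinate (plus local encoder features) to a text-aligned semantic
# embedding, and AIR decodes coordinate + features + embedding to RGB.
import torch
import torch.nn as nn

class SIR(nn.Module):
    """Semantic implicit representation: coordinate -> semantic embedding."""
    def __init__(self, feat_dim=256, embed_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 + feat_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, coord, local_feat):
        # coord: (N, 2) continuous coordinates in [-1, 1]
        # local_feat: (N, feat_dim) encoder features sampled near each coordinate
        return self.mlp(torch.cat([coord, local_feat], dim=-1))

class AIR(nn.Module):
    """Appearance implicit representation: (coordinate, embedding) -> RGB."""
    def __init__(self, feat_dim=256, embed_dim=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 + feat_dim + embed_dim, 256), nn.ReLU(),
            nn.Linear(256, 3),
        )

    def forward(self, coord, local_feat, sem_embed):
        return self.mlp(torch.cat([coord, local_feat, sem_embed], dim=-1))

# Colors can be queried at any coordinate, including inside a corrupted region.
sir, air = SIR(), AIR()
coords = torch.rand(1024, 2) * 2 - 1           # arbitrary continuous coordinates
feats = torch.randn(1024, 256)                 # stand-in for masked-image encoder features
rgb = air(coords, feats, sir(coords, feats))   # (1024, 3) reconstructed colors
```

The key point the sketch illustrates is the dependency order: the appearance decoder conditions on the semantic embedding, so color prediction inside a missing region is guided by what object the coordinate belongs to.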
Related papers
- Improving Text-guided Object Inpainting with Semantic Pre-inpainting [95.17396565347936]
We decompose the typical single-stage object inpainting into two cascaded processes: semantic pre-inpainting and high-fidelity object generation.
To achieve this, we cascade a Transformer-based semantic inpainter and an object inpainting diffusion model, leading to a novel CAscaded Transformer-Diffusion framework.
arXiv Detail & Related papers (2024-09-12T17:55:37Z)
- GP-NeRF: Generalized Perception NeRF for Context-Aware 3D Scene Understanding [101.32590239809113]
Generalized Perception NeRF (GP-NeRF) is a novel pipeline that makes the widely used segmentation model and NeRF work compatibly under a unified framework.
We propose two self-distillation mechanisms, i.e., the Semantic Distill Loss and the Depth-Guided Semantic Distill Loss, to enhance the discrimination and quality of the semantic field.
arXiv Detail & Related papers (2023-11-20T15:59:41Z)
- SuperInpaint: Learning Detail-Enhanced Attentional Implicit Representation for Super-resolutional Image Inpainting [26.309834304515544]
We introduce a challenging image restoration task, referred to as SuperInpaint.
This task aims to reconstruct missing regions in low-resolution images and generate completed images with arbitrarily higher resolutions.
We propose the detail-enhanced attentional implicit representation that can achieve SuperInpaint with a single model.
arXiv Detail & Related papers (2023-07-26T20:28:58Z)
- Single Image Super-Resolution via a Dual Interactive Implicit Neural Network [5.331665215168209]
We introduce a novel implicit neural network for the task of single image super-resolution at arbitrary scale factors.
We demonstrate the efficacy and flexibility of our approach against the state of the art on publicly available benchmark datasets.
arXiv Detail & Related papers (2022-10-23T02:05:19Z)
- Boosting Image Outpainting with Semantic Layout Prediction [18.819765707811904]
We train a GAN to extend regions in the semantic segmentation domain instead of the image domain.
Another GAN model is trained to synthesize real images based on the extended semantic layouts.
Our approach can handle semantic clues more easily and hence works better in complex scenarios.
arXiv Detail & Related papers (2021-10-18T13:09:31Z)
- Dense Semantic Contrast for Self-Supervised Visual Representation Learning [12.636783522731392]
We present Dense Semantic Contrast (DSC) for modeling semantic category decision boundaries at a dense level.
We propose a dense cross-image semantic contrastive learning framework for multi-granularity representation learning.
Experimental results show that our DSC model outperforms state-of-the-art methods when transferring to downstream dense prediction tasks.
arXiv Detail & Related papers (2021-09-16T07:04:05Z)
- BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing, we treat pixel-level segmentation as squeezing the object boundary.
Our method yields large gains on COCO and Cityscapes for both instance and semantic segmentation, and outperforms the previous state-of-the-art PointRend in both accuracy and speed under the same setting.
arXiv Detail & Related papers (2021-05-25T04:58:51Z)
- AINet: Association Implantation for Superpixel Segmentation [82.21559299694555]
We propose a novel Association Implantation (AI) module to enable the network to explicitly capture the relations between a pixel and its surrounding grids.
Our method not only achieves state-of-the-art performance but also maintains satisfactory inference efficiency.
arXiv Detail & Related papers (2021-01-26T10:40:13Z)
- Learning Continuous Image Representation with Local Implicit Image Function [21.27344998709831]
We propose the LIIF representation, which takes an image coordinate and the 2D deep features around that coordinate as inputs and predicts the RGB value at the coordinate as output.
To generate a continuous representation for images, we train an encoder with the LIIF representation via a self-supervised super-resolution task.
The learned continuous representation can be rendered at arbitrary resolution, even extrapolating to 30x higher resolution (see the sketch after this list).
arXiv Detail & Related papers (2020-12-16T18:56:50Z)
- Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning [116.91819311885166]
We propose a hierarchical semantic alignment strategy via expanding the views generated by a single image to cross-samples and multi-level representation.
Our method, termed CsMl, can integrate multi-level visual representations across samples in a robust way.
arXiv Detail & Related papers (2020-12-04T17:26:24Z)
- Cross-domain Correspondence Learning for Exemplar-based Image Translation [59.35767271091425]
We present a framework for exemplar-based image translation, which synthesizes a photo-realistic image from the input in a distinct domain.
The output's style (e.g., color, texture) is consistent with the semantically corresponding objects in the exemplar.
We show that our method is significantly superior to state-of-the-art methods in terms of image quality.
arXiv Detail & Related papers (2020-04-12T09:10:57Z)
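The LIIF entry above describes the coordinate-to-RGB mechanism that appearance-level implicit representations such as SAIR's build on. Below is a minimal sketch under assumed shapes; the grid_sample-based feature lookup and the MLP decoder sizes are illustrative choices, not the published architecture.

```python
# A minimal sketch of a LIIF-style local implicit function, under assumed
# shapes: bilinearly sample encoder features at a query coordinate, then
# decode (features, coordinate) to an RGB value with a small MLP.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalImplicitDecoder(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 2, 256), nn.ReLU(),
            nn.Linear(256, 3),
        )

    def forward(self, feat_map, coord):
        # feat_map: (1, C, H, W) encoder feature map; coord: (N, 2) in [-1, 1]
        grid = coord.view(1, 1, -1, 2)                              # layout for grid_sample
        feat = F.grid_sample(feat_map, grid, align_corners=False)   # (1, C, 1, N)
        feat = feat.view(feat_map.shape[1], -1).t()                 # (N, C)
        return self.mlp(torch.cat([feat, coord], dim=-1))           # (N, 3) RGB

# Querying a denser coordinate grid than the input resolution yields
# arbitrary-scale super-resolution from the same learned representation.
decoder = LocalImplicitDecoder()
rgb = decoder(torch.randn(1, 64, 32, 32), torch.rand(4096, 2) * 2 - 1)
```

Because the decoder is a function of continuous coordinates rather than a fixed pixel grid, the same trained model can be queried at any output resolution.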
This list is automatically generated from the titles and abstracts of the papers on this site.