Contour-guided Image Completion with Perceptual Grouping
- URL: http://arxiv.org/abs/2111.11322v1
- Date: Mon, 22 Nov 2021 16:26:25 GMT
- Title: Contour-guided Image Completion with Perceptual Grouping
- Authors: Morteza Rezanejad, Sidharth Gupta, Chandra Gummaluru, Ryan Marten,
John Wilder, Michael Gruninger, Dirk B. Walther
- Abstract summary: This paper implements a modernized model of the Stochastic Completion Fields (SCF) algorithm.
We show how the SCF algorithm mimics results in human perception.
We use the SCF completed contours as guides for inpainting, and show that our guides improve the performance of state-of-the-art models.
- Score: 7.588025965572449
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Humans are excellent at perceiving illusory outlines. We are readily able to
complete contours, shapes, scenes, and even unseen objects when provided with
images that contain broken fragments of a connected appearance. In vision
science, this ability is largely explained by perceptual grouping: a
foundational set of processes in human vision that describes how separated
elements can be grouped. In this paper, we revisit an algorithm called
Stochastic Completion Fields (SCFs) that mechanizes a set of such processes --
good continuity, closure, and proximity -- through contour completion. This
paper implements a modernized model of the SCF algorithm, and uses it in an
image editing framework where we propose novel methods to complete fragmented
contours. We show how the SCF algorithm plausibly mimics results in human
perception. We use the SCF completed contours as guides for inpainting, and
show that our guides improve the performance of state-of-the-art models.
Additionally, we show that the SCF aids in finding edges in high-noise
environments. Overall, our described algorithms resemble an important mechanism
in the human visual system, and offer a novel framework that modern computer
vision models can benefit from.
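The abstract does not spell out how a completion field is computed. As a rough illustration only, the sketch below follows the classic random-walk formulation of stochastic completion fields rather than the paper's modernized model: particles leave a source keypoint at unit speed, their orientation diffuses, and they decay over time. A second walk runs backwards from the sink, and the product of the two fields marks likely completion paths. All names and parameter values here (`walk_field`, `completion_field`, `sigma`, `decay`, the grid size) are assumptions made for this example, not taken from the paper.

```python
import numpy as np

def walk_field(start_xy, start_theta, size=32, n_bins=16, n_particles=2000,
               n_steps=40, sigma=0.15, decay=0.95, seed=0):
    """Monte Carlo estimate of where particles leaving one keypoint travel.

    Returns an array of shape (size, size, n_bins) holding the average decayed
    particle weight observed at each (row, col, orientation bin).
    """
    rng = np.random.default_rng(seed)
    field = np.zeros((size, size, n_bins))
    x = np.full(n_particles, float(start_xy[0]))
    y = np.full(n_particles, float(start_xy[1]))
    theta = np.full(n_particles, float(start_theta))
    weight = 1.0
    for _ in range(n_steps):
        # Orientation performs a random walk (favours smooth continuation).
        theta = theta + rng.normal(0.0, sigma, n_particles)
        # Unit-speed step along the current orientation.
        x = x + np.cos(theta)
        y = y + np.sin(theta)
        # Decay makes long paths less likely (favours proximity).
        weight *= decay
        col = np.rint(x).astype(int)
        row = np.rint(y).astype(int)
        inside = (col >= 0) & (col < size) & (row >= 0) & (row < size)
        frac = (theta % (2 * np.pi)) / (2 * np.pi)
        bins = np.floor(frac * n_bins).astype(int) % n_bins
        # add.at accumulates correctly when several particles share a cell.
        np.add.at(field, (row[inside], col[inside], bins[inside]), weight)
    return field / n_particles

def completion_field(source, sink, n_bins=16, seed=0, **kwargs):
    """Product of a source field and a sink field, summed over orientation.

    source and sink are (x, y, theta) tuples; n_bins must be even so that a
    rotation by pi is an exact shift of n_bins // 2 orientation bins.
    """
    sx, sy, s_theta = source
    kx, ky, k_theta = sink
    src = walk_field((sx, sy), s_theta, n_bins=n_bins, seed=seed, **kwargs)
    # Walk backwards from the sink, then rotate orientations by pi so they
    # describe forward motion into the sink.
    back = walk_field((kx, ky), k_theta + np.pi, n_bins=n_bins,
                      seed=seed + 1, **kwargs)
    snk = np.roll(back, n_bins // 2, axis=2)
    return (src * snk).sum(axis=2)

# Two collinear fragments facing each other across a gap on row 16.
field = completion_field(source=(4, 16, 0.0), sink=(27, 16, 0.0))
```

In this toy setup the field should concentrate along the straight path joining the two fragments. Lowering `sigma` tightens the path, and lowering `decay` penalises longer gaps more strongly.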
Related papers
- "Principal Components" Enable A New Language of Images [79.45806370905775]
We introduce a novel visual tokenization framework that embeds a provable PCA-like structure into the latent token space.
Our approach achieves state-of-the-art reconstruction performance and enables better interpretability to align with the human vision system.
arXiv Detail & Related papers (2025-03-11T17:59:41Z)
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- Deep Medial Voxels: Learned Medial Axis Approximations for Anatomical Shape Modeling [5.584193645582203]
We introduce deep medial voxels, a semi-implicit representation that faithfully approximates the topological skeleton from imaging volumes.
Our reconstruction technique shows potential for both visualization and computer simulations.
arXiv Detail & Related papers (2024-03-18T13:47:18Z)
- Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis [65.7968515029306]
We propose a novel Coarse-to-Fine Latent Diffusion (CFLD) method for Pose-Guided Person Image Synthesis (PGPIS).
A perception-refined decoder is designed to progressively refine a set of learnable queries and extract semantic understanding of person images as a coarse-grained prompt.
arXiv Detail & Related papers (2024-02-28T06:07:07Z)
- UpFusion: Novel View Diffusion from Unposed Sparse View Observations [66.36092764694502]
UpFusion can perform novel view synthesis and infer 3D representations for an object given a sparse set of reference images.
We show that this mechanism allows generating high-fidelity novel views while improving the synthesis quality given additional (unposed) images.
arXiv Detail & Related papers (2023-12-11T18:59:55Z)
- Neural Congealing: Aligning Images to a Joint Semantic Atlas [14.348512536556413]
We present a zero-shot self-supervised framework for aligning semantically-common content across a set of images.
Our approach harnesses the power of pre-trained DINO-ViT features to learn.
We show that our method performs favorably compared to a state-of-the-art method that requires extensive training on large-scale datasets.
arXiv Detail & Related papers (2023-02-08T09:26:22Z)
- Semantic keypoint-based pose estimation from single RGB frames [64.80395521735463]
We present an approach to estimating the continuous 6-DoF pose of an object from a single RGB image.
The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model.
We show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios.
arXiv Detail & Related papers (2022-04-12T15:03:51Z)
- Cycle-Consistent Counterfactuals by Latent Transformations [5.254093731341154]
Cycle-Consistent Counterfactuals by Latent Transformations (C3LT) learns a latent transformation that automatically generates visuals by steering in the latent space of generative models.
C3LT can be easily plugged into any state-of-the-art pretrained generative network.
In addition to several established metrics for evaluating CF explanations, we introduce a novel metric tailored to assess the quality of the generated CF examples.
arXiv Detail & Related papers (2022-03-28T20:10:09Z)
- ViCE: Self-Supervised Visual Concept Embeddings as Contextual and Pixel Appearance Invariant Semantic Representations [77.3590853897664]
This work presents a self-supervised method to learn dense semantically rich visual embeddings for images inspired by methods for learning word embeddings in NLP.
arXiv Detail & Related papers (2021-11-24T12:27:30Z)
- Image Matching across Wide Baselines: From Paper to Practice [80.9424750998559]
We introduce a comprehensive benchmark for local features and robust estimation algorithms.
Our pipeline's modular structure allows easy integration, configuration, and combination of different methods.
We show that with proper settings, classical solutions may still outperform the perceived state of the art.
arXiv Detail & Related papers (2020-03-03T15:20:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.