Cut-and-Paste Object Insertion by Enabling Deep Image Prior for Reshading
- URL: http://arxiv.org/abs/2010.05907v2
- Date: Tue, 13 Sep 2022 17:58:43 GMT
- Title: Cut-and-Paste Object Insertion by Enabling Deep Image Prior for Reshading
- Authors: Anand Bhattad and David A. Forsyth
- Abstract summary: We show how to insert an object from one image to another and get realistic results even in the hard case where the shading of the inserted object clashes with the shading of the scene.
We introduce a method that corrects shading inconsistencies of the inserted object without requiring a geometric and physical model.
- Score: 12.710258955529705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show how to insert an object from one image to another and get realistic
results in the hard case, where the shading of the inserted object clashes with
the shading of the scene. Rendering objects using an illumination model of the
scene doesn't work, because doing so requires a geometric and material model of
the object, which is hard to recover from a single image. In this paper, we
introduce a method that corrects shading inconsistencies of the inserted object
without requiring a geometric and physical model or an environment map. Our
method uses a deep image prior (DIP), trained to produce reshaded renderings of
inserted objects via consistent image decomposition inferential losses. The
resulting image from DIP aims to have (a) an albedo similar to the
cut-and-paste albedo, (b) a similar shading field to that of the target scene,
and (c) a shading that is consistent with the cut-and-paste surface normals.
The result is a simple procedure that produces convincing shading of the
inserted object. We show the efficacy of our method both qualitatively and
quantitatively on several objects with complex surface properties, and
quantitatively on a dataset of spherical lampshades. Our method significantly
outperforms an Image Harmonization (IH) baseline on all these objects, and it
also outperforms the cut-and-paste and IH baselines in a user study with over
100 users.
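To make the three constraints above concrete, here is a minimal sketch of how they could be combined into a single objective. It is an illustrative assumption rather than the authors' code: the `decompose` and `shade_from_normals` callables stand in for the pretrained decomposition and normal-to-shading networks the paper relies on, and the loss weights are placeholders.

```python
# Hedged sketch of the three inferential losses described in the abstract.
# Assumes precomputed tensors: albedo_cp (cut-and-paste albedo),
# shading_scene (target-scene shading field), normals_cp (cut-and-paste
# surface normals), and a binary mask of the inserted object.
import torch.nn.functional as F

def reshading_losses(dip_output, decompose, shade_from_normals,
                     albedo_cp, shading_scene, normals_cp, mask):
    """dip_output: (1, 3, H, W) image produced by the deep image prior."""
    albedo_pred, shading_pred = decompose(dip_output)

    # (a) albedo of the reshaded object should match the cut-and-paste albedo
    loss_albedo = F.l1_loss(albedo_pred * mask, albedo_cp * mask)

    # (b) shading should agree with the shading field of the target scene
    loss_scene_shading = F.l1_loss(shading_pred * mask, shading_scene * mask)

    # (c) shading should be explainable by the cut-and-paste surface normals
    shading_from_geom = shade_from_normals(normals_cp)
    loss_normal_consistency = F.l1_loss(shading_pred * mask,
                                        shading_from_geom * mask)

    # Relative weights are illustrative, not the paper's values.
    return loss_albedo + loss_scene_shading + 0.5 * loss_normal_consistency
```

Under this reading, the deep image prior is optimized so that its output minimizes this combined loss on the inserted-object region, which is what drives the reshading.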
Related papers
- Floating No More: Object-Ground Reconstruction from a Single Image [33.34421517827975]
We introduce ORG (Object Reconstruction with Ground), a novel task aimed at reconstructing 3D object geometry in conjunction with the ground surface.
Our method uses two compact pixel-level representations to depict the relationship between camera, object, and ground.
arXiv Detail & Related papers (2024-07-26T17:59:56Z)
- Diff-DOPE: Differentiable Deep Object Pose Estimation [29.703385848843414]
We introduce Diff-DOPE, a 6-DoF pose refiner that takes as input an image, a 3D textured model of an object, and an initial pose of the object.
The method uses differentiable rendering to update the object pose to minimize the visual error between the image and the projection of the model.
We show that this simple yet effective idea achieves state-of-the-art results on pose estimation datasets.
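The core loop that sentence describes (render, compare, backpropagate into the pose) can be sketched as follows. This is a hedged illustration rather than the Diff-DOPE implementation: `differentiable_render` is a hypothetical stand-in for whatever differentiable renderer is used, and the axis-angle parameterization, Adam optimizer, and L1 image loss are illustrative choices.

```python
# Sketch of 6-DoF pose refinement by differentiable rendering (assumptions,
# not the paper's code): gradient steps on the pose reduce the visual error
# between the observed image and the projection of the textured model.
import torch

def refine_pose(image, differentiable_render, mesh,
                rot_init, trans_init, steps=200, lr=1e-2):
    """image: observed RGB (3, H, W); rot_init: axis-angle (3,); trans_init: (3,)."""
    rot = rot_init.detach().clone().requires_grad_(True)
    trans = trans_init.detach().clone().requires_grad_(True)
    optim = torch.optim.Adam([rot, trans], lr=lr)

    for _ in range(steps):
        optim.zero_grad()
        rendered = differentiable_render(mesh, rot, trans)   # (3, H, W)
        loss = (rendered - image).abs().mean()               # visual error
        loss.backward()           # gradients flow through the renderer
        optim.step()

    return rot.detach(), trans.detach()
```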
arXiv Detail & Related papers (2023-09-30T18:52:57Z)
- ObjectStitch: Generative Object Compositing [43.206123360578665]
We propose a self-supervised framework for object compositing using conditional diffusion models.
Our framework can transform the viewpoint, geometry, color and shadow of the generated object while requiring no manual labeling.
Our method outperforms relevant baselines in both the realism and the faithfulness of the synthesized images, as judged in a user study on various real-world images.
arXiv Detail & Related papers (2022-12-02T02:15:13Z)
- Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation [44.8872454995923]
We present a novel approach for scalable 6D pose estimation, by self-supervised learning on synthetic data of multiple objects using a single autoencoder.
We test our method on two multi-object benchmarks with real data, T-LESS and NOCS REAL275, and show it outperforms existing RGB-based methods in terms of pose estimation accuracy and generalization.
arXiv Detail & Related papers (2021-07-27T01:55:30Z)
- Sparse Pose Trajectory Completion [87.31270669154452]
We propose a method to learn pose trajectory completion, even using a dataset where objects appear only in sparsely sampled views.
This is achieved with a cross-modal pose trajectory transfer mechanism.
Our method is evaluated on the Pix3D and ShapeNet datasets.
arXiv Detail & Related papers (2021-05-01T00:07:21Z)
- Holistic 3D Scene Understanding from a Single Image with Implicit Representation [112.40630836979273]
We present a new pipeline for holistic 3D scene understanding from a single image.
We propose an image-based local structured implicit network to improve the object shape estimation.
We also refine 3D object pose and scene layout via a novel implicit scene graph neural network.
arXiv Detail & Related papers (2021-03-11T02:52:46Z)
- ShaRF: Shape-conditioned Radiance Fields from a Single View [54.39347002226309]
We present a method for estimating neural scene representations of objects given only a single image.
The core of our method is the estimation of a geometric scaffold for the object.
We demonstrate in several experiments the effectiveness of our approach in both synthetic and real images.
arXiv Detail & Related papers (2021-02-17T16:40:28Z)
- Continuous Surface Embeddings [76.86259029442624]
We focus on the task of learning and representing dense correspondences in deformable object categories.
We propose a new, learnable image-based representation of dense correspondences.
We demonstrate that the proposed approach performs on par with or better than state-of-the-art methods for dense pose estimation for humans.
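As a hedged sketch of how such a representation can be used (an assumption, not the paper's code): a network predicts a per-pixel embedding, each mesh vertex carries a learned embedding, and a pixel's corresponding vertex is the one with the most similar embedding.

```python
# Illustrative lookup for a continuous surface embedding: match per-pixel
# embeddings against per-vertex embeddings by cosine similarity.
import torch
import torch.nn.functional as F

def pixel_to_vertex(pixel_embed, vertex_embed):
    """pixel_embed: (H, W, D); vertex_embed: (V, D). Returns (H, W) vertex ids."""
    H, W, D = pixel_embed.shape
    pixels = F.normalize(pixel_embed.reshape(-1, D), dim=1)   # (H*W, D)
    verts = F.normalize(vertex_embed, dim=1)                  # (V, D)
    sim = pixels @ verts.t()                                  # (H*W, V)
    return sim.argmax(dim=1).reshape(H, W)
```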
arXiv Detail & Related papers (2020-11-24T22:52:15Z)
- Category Level Object Pose Estimation via Neural Analysis-by-Synthesis [64.14028598360741]
In this paper we combine a gradient-based fitting procedure with a parametric neural image synthesis module.
The image synthesis network is designed to efficiently span the pose configuration space.
We experimentally show that the method can recover the orientation of objects with high accuracy from 2D images alone.
arXiv Detail & Related papers (2020-08-18T20:30:47Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach improves pose estimation accuracy.
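A minimal sketch of one way such a photometric-consistency term could be written is given below. It is an assumption, not the paper's training code: `warp_with_pose` is a hypothetical helper that re-renders frame t into frame t+1 using the predicted poses and returns a validity mask.

```python
# Hedged sketch of a photometric consistency loss between neighbouring frames:
# unannotated frames are supervised by how well they re-render their neighbours.
import torch.nn.functional as F

def photometric_consistency_loss(frame_t, frame_t1, warp_with_pose,
                                 pred_pose_t, pred_pose_t1):
    """frame_t, frame_t1: (1, 3, H, W) RGB frames; poses predicted by the network."""
    warped_t, valid = warp_with_pose(frame_t, pred_pose_t, pred_pose_t1)
    return F.l1_loss(warped_t * valid, frame_t1 * valid)
```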
arXiv Detail & Related papers (2020-04-28T12:03:14Z)