Learning Object Placement via Dual-path Graph Completion
- URL: http://arxiv.org/abs/2207.11464v1
- Date: Sat, 23 Jul 2022 08:39:39 GMT
- Title: Learning Object Placement via Dual-path Graph Completion
- Authors: Siyuan Zhou and Liu Liu and Li Niu and Liqing Zhang
- Abstract summary: Object placement aims to place a foreground object over a background image with a suitable location and size.
In this work, we treat object placement as a graph completion problem and propose a novel graph completion module (GCM).
The foreground object is encoded as a special node that should be inserted at a reasonable place in this graph.
- Score: 28.346027247882354
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object placement aims to place a foreground object over a background image
with a suitable location and size. In this work, we treat object placement as a
graph completion problem and propose a novel graph completion module (GCM). The
background scene is represented by a graph with multiple nodes at different
spatial locations with various receptive fields. The foreground object is
encoded as a special node that should be inserted at a reasonable place in this
graph. We also design a dual-path framework upon the structure of GCM to fully
exploit annotated composite images. With extensive experiments on the OPA dataset,
our method proves to significantly outperform existing methods in generating
plausible object placement without loss of diversity.
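
The graph view described in the abstract can be illustrated with a toy sketch. This is not the paper's GCM: the average pooling, the dot-product attention, and every name below (`build_background_nodes`, `attention_over_nodes`, `propose_placement`, `cell_sizes`) are assumptions made for illustration only. It shows background nodes at several spatial locations and receptive-field sizes, and a foreground feature scored against them to suggest a location and size.

```python
import math


def build_background_nodes(feature_map, cell_sizes=(1, 2)):
    """Average-pool a 2D grid of feature vectors into nodes at several scales.

    Each node records its center (x, y), its cell size (a stand-in for the
    receptive field), and its pooled feature vector. Hypothetical helper.
    """
    height, width = len(feature_map), len(feature_map[0])
    dim = len(feature_map[0][0])
    nodes = []
    for size in cell_sizes:
        for top in range(0, height - size + 1, size):
            for left in range(0, width - size + 1, size):
                pooled = [0.0] * dim
                for y in range(top, top + size):
                    for x in range(left, left + size):
                        for k in range(dim):
                            pooled[k] += feature_map[y][x][k]
                pooled = [v / (size * size) for v in pooled]
                center = (left + size / 2, top + size / 2)
                nodes.append({"center": center, "size": size, "feature": pooled})
    return nodes


def attention_over_nodes(foreground_feature, nodes):
    """Softmax over scaled dot products between the foreground and each node."""
    scale = math.sqrt(len(foreground_feature))
    logits = [
        sum(f * b for f, b in zip(foreground_feature, node["feature"])) / scale
        for node in nodes
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def propose_placement(foreground_feature, nodes):
    """Return the center and size of the node with the highest attention weight."""
    weights = attention_over_nodes(foreground_feature, nodes)
    best = max(range(len(nodes)), key=weights.__getitem__)
    return nodes[best]["center"], nodes[best]["size"], weights


# Toy 2x2 background with 2-dimensional features (made-up values).
feature_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
nodes = build_background_nodes(feature_map)
center, size, weights = propose_placement([1.0, 0.0], nodes)
print(center, size)
```

A learned model would predict location and size from the completed graph rather than pick the single best-matching node; the argmax here only keeps the sketch short.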
Related papers
- Open-Vocabulary Octree-Graph for 3D Scene Understanding [54.11828083068082]
Octree-Graph is a novel scene representation for open-vocabulary 3D scene understanding.
An adaptive-octree structure is developed that stores semantics and depicts the occupancy of an object adjustably according to its shape.
arXiv Detail & Related papers (2024-11-25T10:14:10Z)
- Multiview Scene Graph [7.460438046915524]
A proper scene representation is central to the pursuit of spatial intelligence.
We propose to build Multiview Scene Graphs (MSG) from unposed images.
MSG represents a scene topologically with interconnected place and object nodes.
arXiv Detail & Related papers (2024-10-15T02:04:05Z)
- SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs [81.2396059480232]
SceneGraphLoc learns a fixed-sized embedding for each node (i.e., representing an object instance) in the scene graph.
When images are leveraged, SceneGraphLoc achieves performance close to that of state-of-the-art techniques depending on large image databases.
arXiv Detail & Related papers (2024-03-30T20:25:16Z)
- Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing [17.63475613154152]
This paper presents a framework for jointly grounding objects that follow certain semantic relationship constraints in a scene graph.
A scene graph is an efficient and structured way to represent all the objects and their semantic relationships in the image.
arXiv Detail & Related papers (2022-11-03T16:46:46Z)
- Leveraging commonsense for object localisation in partial scenes [36.47035776975184]
We propose a novel scene representation to facilitate the geometric reasoning, Directed Spatial Commonsense Graph (D-SCG).
We estimate the unknown position of the target object using a Graph Neural Network that implements a novel attentional message passing mechanism.
We evaluate our method using Partial ScanNet, improving the state-of-the-art by 5.9% in terms of the localisation accuracy at an 8x faster training speed.
arXiv Detail & Related papers (2022-11-01T16:17:07Z)
- Segmentation-grounded Scene Graph Generation [47.34166260639392]
We propose a framework for pixel-level segmentation-grounded scene graph generation.
Our framework is agnostic to the underlying scene graph generation method.
It is learned in a multi-task manner with both target and auxiliary datasets.
arXiv Detail & Related papers (2021-04-29T08:54:08Z)
- Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping.
In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN).
The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
- Scene Graph to Image Generation with Contextualized Object Layout Refinement [92.85331019618332]
We propose a novel method to generate images from scene graphs.
Our approach improves the layout coverage by almost 20 points and drops object overlap to negligible amounts.
arXiv Detail & Related papers (2020-09-23T06:27:54Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric, that is better suited for multi-object images.
arXiv Detail & Related papers (2020-03-16T21:40:09Z)