Related papers: Learning Object Placement via Dual-path Graph Completion

URL: http://arxiv.org/abs/2207.11464v1
Date: Sat, 23 Jul 2022 08:39:39 GMT
Title: Learning Object Placement via Dual-path Graph Completion
Authors: Siyuan Zhou and Liu Liu and Li Niu and Liqing Zhang
Abstract summary: Object placement aims to place a foreground object over a background image with a suitable location and size. In this work, we treat object placement as a graph completion problem and propose a novel graph completion module (GCM). The foreground object is encoded as a special node that should be inserted at a reasonable place in this graph.
Score: 28.346027247882354
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Object placement aims to place a foreground object over a background image with a suitable location and size. In this work, we treat object placement as a graph completion problem and propose a novel graph completion module (GCM). The background scene is represented by a graph with multiple nodes at different spatial locations with various receptive fields. The foreground object is encoded as a special node that should be inserted at a reasonable place in this graph. We also design a dual-path framework upon the structure of GCM to fully exploit annotated composite images. With extensive experiments on the OPA dataset, our method proves to significantly outperform existing methods in generating plausible object placement without loss of diversity.

Related papers

GeoRDF2Vec: Learning Location-Aware Entity Representations in Knowledge Graphs [1.6658912537684454]
We introduce a variant of RDF2Vec that incorporates geometric information to learn location-aware embeddings of entities. Our approach expands different nodes by flooding the graph from geographic nodes, ensuring that each reachable node is considered.
arXiv Detail & Related papers (2025-04-23T21:17:31Z)
Open-Vocabulary Octree-Graph for 3D Scene Understanding [54.11828083068082]
Octree-Graph is a novel scene representation for open-vocabulary 3D scene understanding. An adaptive-octree structure is developed that stores semantics and depicts the occupancy of an object adjustably according to its shape.
arXiv Detail & Related papers (2024-11-25T10:14:10Z)
Multiview Scene Graph [7.460438046915524]
A proper scene representation is central to the pursuit of spatial intelligence. We propose to build Multiview Scene Graphs (MSG) from unposed images. MSG represents a scene topologically with interconnected place and object nodes.
arXiv Detail & Related papers (2024-10-15T02:04:05Z)
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs [81.2396059480232]
SceneGraphLoc learns a fixed-sized embedding for each node (i.e., representing an object instance) in the scene graph. When images are leveraged, SceneGraphLoc achieves performance close to that of state-of-the-art techniques depending on large image databases.
arXiv Detail & Related papers (2024-03-30T20:25:16Z)
Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing [17.63475613154152]
This paper presents a framework for jointly grounding objects that follow certain semantic relationship constraints in a scene graph. A scene graph is an efficient and structured way to represent all the objects and their semantic relationships in the image.
arXiv Detail & Related papers (2022-11-03T16:46:46Z)
Leveraging commonsense for object localisation in partial scenes [36.47035776975184]
We propose a novel scene representation to facilitate the geometric reasoning, Directed Spatial Commonsense Graph (D-SCG). We estimate the unknown position of the target object using a Graph Neural Network that implements a novel attentional message passing mechanism. We evaluate our method using Partial ScanNet, improving the state-of-the-art by 5.9% in terms of the localisation accuracy at an 8x faster training speed.
arXiv Detail & Related papers (2022-11-01T16:17:07Z)
Segmentation-grounded Scene Graph Generation [47.34166260639392]
We propose a framework for pixel-level segmentation-grounded scene graph generation. Our framework is agnostic to the underlying scene graph generation method. It is learned in a multi-task manner with both target and auxiliary datasets.
arXiv Detail & Related papers (2021-04-29T08:54:08Z)
Learning Spatial Context with Graph Neural Network for Multi-Person Pose Grouping [71.59494156155309]
Bottom-up approaches for image-based multi-person pose estimation consist of two stages: keypoint detection and grouping. In this work, we formulate the grouping task as a graph partitioning problem, where we learn the affinity matrix with a Graph Neural Network (GNN). The learned geometry-based affinity is further fused with appearance-based affinity to achieve robust keypoint association.
arXiv Detail & Related papers (2021-04-06T09:21:14Z)
Scene Graph to Image Generation with Contextualized Object Layout Refinement [92.85331019618332]
We propose a novel method to generate images from scene graphs. Our approach improves the layout coverage by almost 20 points and drops object overlap to negligible amounts.
arXiv Detail & Related papers (2020-09-23T06:27:54Z)
Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects. Our method learns representations of the spatial relationships between objects in the scene, which lead to our model's improved layout-fidelity. We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric, that is better suited for multi-object images.
arXiv Detail & Related papers (2020-03-16T21:40:09Z)

This list is automatically generated from the titles and abstracts of the papers on this site.