RackLay: Multi-Layer Layout Estimation for Warehouse Racks
- URL: http://arxiv.org/abs/2103.09174v2
- Date: Wed, 17 Mar 2021 10:58:36 GMT
- Title: RackLay: Multi-Layer Layout Estimation for Warehouse Racks
- Authors: Meher Shashwat Nigam, Avinash Prabhu, Anurag Sahu, Puru Gupta, Tanvi Karandikar, N. Sai Shankar, Ravi Kiran Sarvadevabhatla, K. Madhava Krishna
- Abstract summary: We present RackLay, a deep neural network for real-time shelf layout estimation from a single image.
RackLay estimates the top-view and front-view layout for each shelf in the considered rack populated with objects.
We also show that fusing the top-view and front-view enables 3D reasoning applications such as metric free space estimation for the considered rack.
- Score: 17.937062635570268
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a monocular colour image of a warehouse rack, we aim to predict the
bird's-eye view layout for each shelf in the rack, which we term multi-layer
layout prediction. To this end, we present RackLay, a deep neural network for
real-time shelf layout estimation from a single image. Unlike previous layout
estimation methods, which provide a single layout for the dominant ground plane
alone, RackLay estimates the top-view and front-view layout for each shelf in
the considered rack populated with objects. RackLay's architecture and its
variants are versatile and estimate accurate layouts for diverse scenes
characterized by a varying number of visible shelves in an image, a large range in
shelf occupancy factor, and varied background clutter. Given the extreme paucity
of datasets in this space and the difficulty involved in acquiring real data
from warehouses, we additionally release a flexible synthetic dataset
generation pipeline, WareSynth, which allows users to control the generation
process and tailor the dataset to the application at hand. The
ablations across architectural variants and comparison with strong prior
baselines vindicate the efficacy of RackLay as an apt architecture for the
novel problem of multi-layered layout estimation. We also show that fusing the
top-view and front-view enables 3D reasoning applications such as metric free
space estimation for the considered rack.
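As a rough illustration of how a top-view and a front-view layout could be fused into a metric free-space figure, the sketch below treats each layout as a binary occupancy grid for one shelf. This is not RackLay's published procedure: the function name `free_volume_m3`, the grid conventions, the cell size, and the box-shaped approximation of objects are all assumptions made for this example.

```python
def free_volume_m3(top_view, front_view, cell_size_m):
    """Estimate the free volume of one shelf from two occupancy grids.

    Hypothetical conventions (assumptions, not taken from the paper):
      top_view   -- depth x width grid, 1 = occupied, 0 = free
      front_view -- height x width grid, 1 = occupied, 0 = free
      cell_size_m -- edge length of one grid cell in metres
    """
    width = len(top_view[0])
    if len(front_view[0]) != width:
        raise ValueError("top and front views must share the same width")
    depth = len(top_view)
    height = len(front_view)

    occupied_cells = 0
    for x in range(width):
        # Occupied extent along depth and height at this width position.
        occ_depth = sum(row[x] for row in top_view)
        occ_height = sum(row[x] for row in front_view)
        # Box approximation: the object fills depth x height at column x.
        occupied_cells += occ_depth * occ_height

    total_cells = width * depth * height
    return (total_cells - occupied_cells) * cell_size_m ** 3


# Toy shelf: 3 cells wide, 2 deep, 2 high, with one object in the first column.
top = [[1, 0, 0],
       [1, 0, 0]]
front = [[0, 0, 0],
         [1, 0, 0]]
print(free_volume_m3(top, front, 0.5))  # 1.25
```

The box approximation overestimates occupied volume for irregular objects, so the result is best read as a conservative lower bound on free space under these assumptions.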
Related papers
- Self-training Room Layout Estimation via Geometry-aware Ray-casting [27.906107629563852]
We introduce a geometry-aware self-training framework for room layout estimation models on unseen scenes with unlabeled data.
Our approach utilizes a ray-casting formulation to aggregate multiple estimates from different viewing positions.
arXiv Detail & Related papers (2024-07-21T03:25:55Z)
- 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z)
- iBARLE: imBalance-Aware Room Layout Estimation [54.819085005591894]
Room layout estimation predicts layouts from a single panorama.
There are significant imbalances in real-world datasets including the dimensions of layout complexity, camera locations, and variation in scene appearance.
We propose the imBalance-Aware Room Layout Estimation (iBARLE) framework to address these issues.
iBARLE consists of (1) an Appearance Variation Generation (AVG) module, (2) a Complex Structure Mix-up (CSMix) module, which enhances generalizability w.r.t. room structure, and (3) a gradient-based layout objective function.
arXiv Detail & Related papers (2023-08-29T06:20:36Z)
- SGAligner : 3D Scene Alignment with Scene Graphs [84.01002998166145]
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z)
- Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation [147.81509219686419]
We propose a diagnostic benchmark for layout-guided image generation that examines four categories of spatial control skills: number, position, size, and shape.
Next, we propose IterInpaint, a new baseline that generates foreground and background regions step-by-step via inpainting.
We show comprehensive ablation studies on IterInpaint, including training task ratio, crop&paste vs. repaint, and generation order.
arXiv Detail & Related papers (2023-04-13T16:58:33Z)
- MVRackLay: Monocular Multi-View Layout Estimation for Warehouse Racks and Shelves [8.845291721126825]
MVRackLay estimates multi-layered layouts, wherein each layer corresponds to the layout of a shelf within a rack.
With minimal effort, such an output is transformed into a 3D rendering of all racks, shelves and objects on the shelves.
MVRackLay shows superior performance vis-a-vis its single-view counterpart, RackLay, in layout accuracy, quantified in terms of the mean IoU and mAP metrics.
arXiv Detail & Related papers (2022-11-30T10:32:04Z)
- Self-supervised 360$^{\circ}$ Room Layout Estimation [20.062713286961326]
We present the first self-supervised method to train panoramic room layout estimation models without any labeled data.
Our approach also shows promising solutions in data-scarce scenarios and active learning, which would have an immediate value in real estate virtual tour software.
arXiv Detail & Related papers (2022-03-30T04:58:07Z)
- RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View [7.427006214471801]
We present a new approach to estimate the layout of a room from a single image.
Our approach learns an additional ranking function to estimate the final layout instead of using optimization.
Our approach shows state-of-the-art results on standard datasets with mostly cuboidal layouts and also performs well on a dataset containing rooms with non-cuboidal layouts.
arXiv Detail & Related papers (2021-10-01T20:42:49Z)
- LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering [59.63979143021241]
We formulate the task of 360 layout estimation as a problem of predicting depth on the horizon line of a panorama.
We propose the Differentiable Depth Rendering procedure to make the conversion from layout to depth prediction differentiable.
Our method achieves state-of-the-art performance on numerous 360 layout benchmark datasets.
arXiv Detail & Related papers (2021-04-01T15:48:41Z)
- SSLayout360: Semi-Supervised Indoor Layout Estimation from 360-Degree Panorama [0.0]
We propose the first approach to learn representations of room corners and boundaries by using a combination of labeled and unlabeled data.
Our approach can advance layout estimation of complex indoor scenes using as few as 20 labeled examples.
arXiv Detail & Related papers (2021-03-25T09:19:13Z)
- Shelf-Supervised Mesh Prediction in the Wild [54.01373263260449]
We propose a learning-based approach to infer the 3D shape and pose of an object from a single image.
We first infer a volumetric representation in a canonical frame, along with the camera pose.
The coarse volumetric prediction is then converted to a mesh-based representation, which is further refined in the predicted camera frame.
arXiv Detail & Related papers (2021-02-11T18:57:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.