Related papers: Counting Stacked Objects from Multi-View Images

Counting Stacked Objects from Multi-View Images

URL: http://arxiv.org/abs/2411.19149v1
Date: Thu, 28 Nov 2024 13:51:16 GMT
Title: Counting Stacked Objects from Multi-View Images
Authors: Corentin Dumery, Noa Etté, Jingyi Xu, Aoxiang Fan, Ren Li, Hieu Le, Pascal Fua,
Abstract summary: We propose a novel 3D counting approach that decomposes the task into two complementary subproblems.<n>By combining geometric reconstruction and deep learning-based depth analysis, our method can accurately count identical objects within containers.<n>We validate our 3D Counting pipeline on diverse real-world and large-scale synthetic datasets.
Score: 57.68870743111393
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Visual object counting is a fundamental computer vision task underpinning numerous real-world applications, from cell counting in biomedicine to traffic and wildlife monitoring. However, existing methods struggle to handle the challenge of stacked 3D objects in which most objects are hidden by those above them. To address this important yet underexplored problem, we propose a novel 3D counting approach that decomposes the task into two complementary subproblems - estimating the 3D geometry of the object stack and the occupancy ratio from multi-view images. By combining geometric reconstruction and deep learning-based depth analysis, our method can accurately count identical objects within containers, even when they are irregularly stacked. We validate our 3D Counting pipeline on diverse real-world and large-scale synthetic datasets, which we will release publicly to facilitate further research.

Related papers

Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations [112.29763628638112]
Object-X is a versatile multi-modal 3D representation framework.<n>It can encoding rich object embeddings and decoding them back into geometric and visual reconstructions.<n>It supports a range of downstream tasks, including scene alignment, single-image 3D object reconstruction, and localization.
arXiv Detail & Related papers (2025-06-05T09:14:42Z)
Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection [54.78470057491049]
Occupancy has emerged as a promising alternative for 3D scene perception. We introduce object-centric occupancy as a supplement to object bboxes. We show that our occupancy features significantly enhance the detection results of state-of-the-art 3D object detectors.
arXiv Detail & Related papers (2024-12-06T16:12:38Z)
Zero-Shot Multi-Object Scene Completion [59.325611678171974]
We present a 3D scene completion method that recovers the complete geometry of multiple unseen objects in complex scenes from a single RGB-D image. Our method outperforms the current state-of-the-art on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-03-21T17:59:59Z)
Monocular 3D Object Detection using Multi-Stage Approaches with Attention and Slicing aided hyper inference [0.0]
3D object detection is vital as it would enable us to capture objects' sizes, orientation, and position in the world. We would be able to use this 3D detection in real-world applications such as Augmented Reality (AR), self-driving cars, and robotics.
arXiv Detail & Related papers (2022-12-22T15:36:07Z)
CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework. Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene. In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection [17.526914782562528]
We propose Graph-DETR3D to automatically aggregate multi-view imagery information through graph structure learning (GSL) Our best model achieves 49.5 NDS on the nuScenes test leaderboard, achieving new state-of-the-art in comparison with various published image-view 3D object detectors.
arXiv Detail & Related papers (2022-04-25T12:10:34Z)
Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection. Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised. Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z)
MonoGRNet: A General Framework for Monocular 3D Object Detection [23.59839921644492]
We propose MonoGRNet for the amodal 3D object detection from a monocular image via geometric reasoning. MonoGRNet decomposes the monocular 3D object detection task into four sub-tasks including 2D object detection, instance-level depth estimation, projected 3D center estimation and local corner regression. Experiments are conducted on KITTI, Cityscapes and MS COCO datasets.
arXiv Detail & Related papers (2021-04-18T10:07:52Z)
Counting from Sky: A Large-scale Dataset for Remote Sensing Object Counting and A Benchmark Method [52.182698295053264]
We are interested in counting dense objects from remote sensing images. Compared with object counting in a natural scene, this task is challenging in the following factors: large scale variation, complex cluttered background, and orientation arbitrariness. To address these issues, we first construct a large-scale object counting dataset with remote sensing images, which contains four important geographic objects. We then benchmark the dataset by designing a novel neural network that can generate a density map of an input image.
arXiv Detail & Related papers (2020-08-28T03:47:49Z)
Counting dense objects in remote sensing images [52.182698295053264]
Estimating number of interested objects from a given image is a challenging yet important task. In this paper, we are interested in counting dense objects from remote sensing images. To address these issues, we first construct a large-scale object counting dataset based on remote sensing images. We then benchmark the dataset by designing a novel neural network which can generate density map of an input image.
arXiv Detail & Related papers (2020-02-14T09:13:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.