Counting Stacked Objects from Multi-View Images
- URL: http://arxiv.org/abs/2411.19149v1
- Date: Thu, 28 Nov 2024 13:51:16 GMT
- Title: Counting Stacked Objects from Multi-View Images
- Authors: Corentin Dumery, Noa Etté, Jingyi Xu, Aoxiang Fan, Ren Li, Hieu Le, Pascal Fua,
- Abstract summary: We propose a novel 3D counting approach that decomposes the task into two complementary subproblems.
By combining geometric reconstruction and deep learning-based depth analysis, our method can accurately count identical objects within containers.
We validate our 3D Counting pipeline on diverse real-world and large-scale synthetic datasets.
- Score: 57.68870743111393
- License:
- Abstract: Visual object counting is a fundamental computer vision task underpinning numerous real-world applications, from cell counting in biomedicine to traffic and wildlife monitoring. However, existing methods struggle to handle the challenge of stacked 3D objects in which most objects are hidden by those above them. To address this important yet underexplored problem, we propose a novel 3D counting approach that decomposes the task into two complementary subproblems - estimating the 3D geometry of the object stack and the occupancy ratio from multi-view images. By combining geometric reconstruction and deep learning-based depth analysis, our method can accurately count identical objects within containers, even when they are irregularly stacked. We validate our 3D Counting pipeline on diverse real-world and large-scale synthetic datasets, which we will release publicly to facilitate further research.
Related papers
- Monocular 3D Object Detection using Multi-Stage Approaches with
Attention and Slicing aided hyper inference [0.0]
3D object detection is vital as it would enable us to capture objects' sizes, orientation, and position in the world.
We would be able to use this 3D detection in real-world applications such as Augmented Reality (AR), self-driving cars, and robotics.
arXiv Detail & Related papers (2022-12-22T15:36:07Z) - CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z) - Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object
Detection [17.526914782562528]
We propose Graph-DETR3D to automatically aggregate multi-view imagery information through graph structure learning (GSL)
Our best model achieves 49.5 NDS on the nuScenes test leaderboard, achieving new state-of-the-art in comparison with various published image-view 3D object detectors.
arXiv Detail & Related papers (2022-04-25T12:10:34Z) - Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection [70.71934539556916]
We learn geometry-guided depth estimation with projective modeling to advance monocular 3D object detection.
Specifically, a principled geometry formula with projective modeling of 2D and 3D depth predictions in the monocular 3D object detection network is devised.
Our method remarkably improves the detection performance of the state-of-the-art monocular-based method without extra data by 2.80% on the moderate test setting.
arXiv Detail & Related papers (2021-07-29T12:30:39Z) - MonoGRNet: A General Framework for Monocular 3D Object Detection [23.59839921644492]
We propose MonoGRNet for the amodal 3D object detection from a monocular image via geometric reasoning.
MonoGRNet decomposes the monocular 3D object detection task into four sub-tasks including 2D object detection, instance-level depth estimation, projected 3D center estimation and local corner regression.
Experiments are conducted on KITTI, Cityscapes and MS COCO datasets.
arXiv Detail & Related papers (2021-04-18T10:07:52Z) - Counting from Sky: A Large-scale Dataset for Remote Sensing Object
Counting and A Benchmark Method [52.182698295053264]
We are interested in counting dense objects from remote sensing images. Compared with object counting in a natural scene, this task is challenging in the following factors: large scale variation, complex cluttered background, and orientation arbitrariness.
To address these issues, we first construct a large-scale object counting dataset with remote sensing images, which contains four important geographic objects.
We then benchmark the dataset by designing a novel neural network that can generate a density map of an input image.
arXiv Detail & Related papers (2020-08-28T03:47:49Z) - Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z) - Counting dense objects in remote sensing images [52.182698295053264]
Estimating number of interested objects from a given image is a challenging yet important task.
In this paper, we are interested in counting dense objects from remote sensing images.
To address these issues, we first construct a large-scale object counting dataset based on remote sensing images.
We then benchmark the dataset by designing a novel neural network which can generate density map of an input image.
arXiv Detail & Related papers (2020-02-14T09:13:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.