Related papers: Split, Merge, and Refine: Fitting Tight Bounding Boxes via Over-Segmentation and Iterative Search

URL: http://arxiv.org/abs/2304.04336v3
Date: Fri, 1 Dec 2023 14:07:01 GMT
Title: Split, Merge, and Refine: Fitting Tight Bounding Boxes via Over-Segmentation and Iterative Search
Authors: Chanhyeok Park, Minhyuk Sung
Abstract summary: We propose a novel framework for finding a set of tight bounding boxes of a 3D shape via over-segmentation and iterative merging and refinement. By thoughtful evaluation, we demonstrate full coverage, tightness, and an adequate number of bounding boxes of our method without requiring any training data or supervision.
Score: 15.29167642670379
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Achieving tight bounding boxes of a shape while guaranteeing complete boundness is an essential task for efficient geometric operations and unsupervised semantic part detection. But previous methods fail to achieve both full coverage and tightness. Neural-network-based methods are not suitable for these goals due to the non-differentiability of the objective, while classic iterative search methods suffer from their sensitivity to the initialization. We propose a novel framework for finding a set of tight bounding boxes of a 3D shape via over-segmentation and iterative merging and refinement. Our result shows that utilizing effective search methods with appropriate objectives is the key to producing bounding boxes with both properties. We employ an existing pre-segmentation to split the shape and obtain over-segmentation. Then, we apply hierarchical merging with our novel tightness-aware merging and stopping criteria. To overcome the sensitivity to the initialization, we also define actions to refine the bounding box parameters in a Markov Decision Process (MDP) setup with a soft reward function promoting a wider exploration. Lastly, we further improve the refinement step with Monte Carlo Tree Search (MCTS) based multi-action space exploration. By thoughtful evaluation on diverse 3D shapes, we demonstrate full coverage, tightness, and an adequate number of bounding boxes of our method without requiring any training data or supervision. It thus can be applied to various downstream tasks in computer vision and graphics.
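The split-then-merge idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes axis-aligned boxes, uses box volume as a stand-in for tightness, and uses a hypothetical merge criterion (the ratio of the merged box volume to the sum of the two separate volumes) with a hypothetical stopping threshold `max_ratio`. All function names are illustrative, and the MDP and MCTS refinement steps are not modeled.

```python
from itertools import combinations


def box_corners(lo, hi):
    """Return the 8 corner points of an axis-aligned box (helper for demos)."""
    return [(x, y, z)
            for x in (lo[0], hi[0])
            for y in (lo[1], hi[1])
            for z in (lo[2], hi[2])]


def aabb(points):
    """Axis-aligned bounding box (min corner, max corner) of 3D points."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def volume(box):
    """Volume of an axis-aligned box given as (min corner, max corner)."""
    lo, hi = box
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])


def merge_cost(a, b):
    """Assumed tightness proxy: merged volume over the sum of separate volumes.

    Values near 1 mean merging wastes little space; large values mean the
    merged box would contain a lot of empty volume.
    """
    separate = volume(aabb(a)) + volume(aabb(b))
    if separate <= 0:
        return float("inf")  # degenerate (flat) segments are never merged here
    return volume(aabb(a + b)) / separate


def greedy_merge(segments, max_ratio=1.1):
    """Greedily merge the cheapest pair until no pair is tight enough.

    `segments` is an over-segmentation: a list of point lists. Each returned
    box is the bounding box of all points assigned to it, so every input
    point stays covered.
    """
    segments = [list(s) for s in segments]
    while len(segments) > 1:
        (i, j), cost = min(
            (((i, j), merge_cost(segments[i], segments[j]))
             for i, j in combinations(range(len(segments)), 2)),
            key=lambda pair_cost: pair_cost[1],
        )
        if cost > max_ratio:  # assumed stopping criterion
            break
        merged = segments[i] + segments[j]
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)
    return [aabb(s) for s in segments]
```

For example, two stacked halves of a unit cube plus a distant cube should collapse to two boxes: the halves merge at a ratio of 1.0, while merging the result with the distant cube would enclose mostly empty space, so the loop stops.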

Related papers

P2Object: Single Point Supervised Object Detection and Instance Segmentation [58.778288785355]
We introduce Point-to-Box Network (P2BNet), which constructs balanced instance-level proposal bags. P2MNet can generate more precise bounding boxes and generalize to segmentation tasks. Our method largely surpasses the previous methods in terms of the mean average precision on COCO, VOC, and Cityscapes.
arXiv Detail & Related papers (2025-04-10T14:51:08Z)
A Deep Learning Framework for Boundary-Aware Semantic Segmentation [9.680285420002516]
This study proposes a Mask2Former-based semantic segmentation algorithm incorporating a boundary enhancement feature bridging module (BEFBM). The proposed approach achieves significant improvements in metrics such as mIOU, mDICE, and mRecall. Visual analysis confirms the model's advantages in fine-grained regions.
arXiv Detail & Related papers (2025-03-28T00:00:08Z)
Lidar Panoptic Segmentation and Tracking without Bells and Whistles [48.078270195629415]
We propose a detection-centric network for lidar segmentation and tracking. One of the core components of our network is the object instance detection branch. We evaluate our method on several 3D/4D LPS benchmarks and observe that our model establishes a new state-of-the-art among open-sourced models.
arXiv Detail & Related papers (2023-10-19T04:44:43Z)
3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds. Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
Unsupervised Space Partitioning for Nearest Neighbor Search [6.516813715425121]
We propose an end-to-end learning framework that couples the partitioning and learning-to-search steps using a custom loss function. A key advantage of our proposed solution is that it does not require any expensive pre-processing of the dataset. We show that our method beats the state-of-the-art space partitioning method and the ubiquitous K-means clustering method.
arXiv Detail & Related papers (2022-06-16T11:17:03Z)
PointInst3D: Segmenting 3D Instances by Points [136.7261709896713]
We propose a fully-convolutional 3D point cloud instance segmentation method that works in a per-point prediction fashion. We find the key to its success is assigning a suitable target to each sampled point. Our approach achieves promising results on both ScanNet and S3DIS benchmarks.
arXiv Detail & Related papers (2022-04-25T02:41:46Z)
Object-Guided Instance Segmentation With Auxiliary Feature Refinement for Biological Images [58.914034295184685]
Instance segmentation is of great importance for many biological applications, such as study of neural cell interactions, plant phenotyping, and quantitatively measuring how cells react to drug treatment. Box-based instance segmentation methods capture objects via bounding boxes and then perform individual segmentation within each bounding box region. Our method first detects the center points of the objects, from which the bounding box parameters are then predicted. The segmentation branch reuses the object features as guidance to separate target object from the neighboring ones within the same bounding box region.
arXiv Detail & Related papers (2021-06-14T04:35:36Z)
Learning Salient Boundary Feature for Anchor-free Temporal Action Localization [81.55295042558409]
Temporal action localization is an important yet challenging task in video understanding. We propose the first purely anchor-free temporal localization method. Our model includes (i) an end-to-end trainable basic predictor, (ii) a saliency-based refinement module, and (iii) several consistency constraints.
arXiv Detail & Related papers (2021-03-24T12:28:32Z)
Shape-aware Semi-supervised 3D Semantic Segmentation for Medical Images [24.216869988183092]
We propose a shape-aware semi-supervised segmentation strategy to leverage abundant unlabeled data and to enforce a geometric shape constraint on the segmentation output. We develop a multi-task deep network that jointly predicts semantic segmentation and signed distance map (SDM) of object surfaces. Experiments show that our method outperforms current state-of-the-art approaches with improved shape estimation.
arXiv Detail & Related papers (2020-07-21T11:44:52Z)
Dive Deeper Into Box for Object Detection [49.923586776690115]
We propose a box reorganization method (DDBNet), which can dive deeper into the box for more accurate localization. Experimental results show that our method is effective, leading to state-of-the-art performance for object detection.
arXiv Detail & Related papers (2020-07-15T07:49:05Z)
1st Place Solutions for OpenImage2019 -- Object Detection and Instance Segmentation [116.25081559037872]
This article introduces the solutions of the two champion teams, 'MMfruit' for the detection track and 'MMfruitSeg' for the segmentation track, in OpenImage Challenge 2019. It is commonly known that for an object detector, the shared feature at the end of the backbone is not appropriate for both classification and regression. We propose the Decoupling Head (DH) to disentangle the object classification and regression via the self-learned optimal feature extraction.
arXiv Detail & Related papers (2020-03-17T06:45:07Z)
Towards Bounding-Box Free Panoptic Segmentation [16.4548904544277]
We introduce a new Bounding-Box Free Network (BBFNet) for panoptic segmentation. BBFNet predicts coarse watershed levels and uses them to detect large instance candidates where boundaries are well defined. For smaller instances, whose boundaries are less reliable, BBFNet also predicts instance centers by means of Hough voting followed by mean-shift to reliably detect small objects.
arXiv Detail & Related papers (2020-02-18T16:34:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.