Fast and Accurate Unknown Object Instance Segmentation through Error-Informed Refinement
- URL: http://arxiv.org/abs/2306.16132v2
- Date: Tue, 30 Apr 2024 14:37:59 GMT
- Title: Fast and Accurate Unknown Object Instance Segmentation through Error-Informed Refinement
- Authors: Seunghyeok Back, Sangbeom Lee, Kangmin Kim, Joosoon Lee, Sungho Shin, Jemo Maeng, Kyoobin Lee
- Abstract summary: INSTA-BEER is a fast and accurate model-agnostic refinement method that enhances the performance of unknown object instance segmentation.
We introduce the quad-metric boundary error, which quantifies pixel-wise true positives, true negatives, false positives, and false negatives at the boundaries of object instances.
In comprehensive evaluations conducted on three widely used benchmark datasets, INSTA-BEER outperformed state-of-the-art models in both accuracy and inference time.
- Score: 7.297340899783621
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate perception of unknown objects is essential for autonomous robots, particularly when manipulating novel items in unstructured environments. However, existing unknown object instance segmentation (UOIS) methods often have over-segmentation and under-segmentation problems, resulting in inaccurate instance boundaries and failures in subsequent robotic tasks such as grasping and placement. To address this challenge, this article introduces INSTA-BEER, a fast and accurate model-agnostic refinement method that enhances the UOIS performance. The model adopts an error-informed refinement approach, which first predicts pixel-wise errors in the initial segmentation and then refines the segmentation guided by these error estimates. We introduce the quad-metric boundary error, which quantifies pixel-wise true positives, true negatives, false positives, and false negatives at the boundaries of object instances, effectively capturing both fine-grained and instance-level segmentation errors. Additionally, the Error Guidance Fusion (EGF) module explicitly integrates error information into the refinement process, further improving segmentation quality. In comprehensive evaluations conducted on three widely used benchmark datasets, INSTA-BEER outperformed state-of-the-art models in both accuracy and inference time. Moreover, a real-world robotic experiment demonstrated the practical applicability of our method in improving the performance of target object grasping tasks in cluttered environments.
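To make the quad-metric boundary error described in the abstract more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that a boundary pixel is any pixel with a 4-connected neighbour carrying a different instance label, and that the four categories come from comparing the predicted boundary map with the ground-truth boundary map pixel by pixel. The names `boundary_map` and `quad_boundary_counts` are hypothetical and were chosen for illustration only.

```python
def boundary_map(labels):
    """Mark pixels whose 4-neighbourhood contains a different instance label.

    `labels` is a 2D list of integer instance IDs (assumed representation).
    """
    h, w = len(labels), len(labels[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and labels[ny][nx] != labels[y][x]:
                    out[y][x] = True
                    break
    return out


def quad_boundary_counts(pred_labels, gt_labels):
    """Count boundary true/false positives/negatives (illustrative definition)."""
    pred_b = boundary_map(pred_labels)
    gt_b = boundary_map(gt_labels)
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for pred_row, gt_row in zip(pred_b, gt_b):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                counts["tp"] += 1  # boundary predicted and present
            elif p:
                counts["fp"] += 1  # boundary predicted but absent
            elif g:
                counts["fn"] += 1  # boundary missed by the prediction
            else:
                counts["tn"] += 1  # correctly predicted as non-boundary
    return counts


# Ground truth: two instances side by side.
gt = [[1, 1, 2, 2] for _ in range(4)]
# Under-segmented prediction: both objects merged into one instance.
under = [[1, 1, 1, 1] for _ in range(4)]

under_counts = quad_boundary_counts(under, gt)
exact_counts = quad_boundary_counts(gt, gt)
```

In this toy case the merged prediction has no boundary pixels, so the whole ground-truth boundary between the two objects is counted as false negatives. This illustrates how a boundary-level error map can expose an instance-level mistake such as under-segmentation.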
Related papers
- Test Time Training for Industrial Anomaly Segmentation [15.973768095014906]
Anomaly Detection and Segmentation (AD&S) is crucial for industrial quality control.
This paper proposes a test time training strategy to improve the segmentation performance.
We demonstrate the effectiveness of our approach over baselines through extensive experimentation and evaluation on MVTec AD and MVTec 3D-AD.
arXiv Detail & Related papers (2024-04-04T18:31:24Z) - Weakly-Supervised Cross-Domain Segmentation of Electron Microscopy with Sparse Point Annotation [1.124958340749622]
We introduce a multitask learning framework to leverage correlations among the counting, detection, and segmentation tasks.
We develop a cross-position cut-and-paste for label augmentation and an entropy-based pseudo-label selection.
The proposed model significantly outperforms UDA methods and achieves performance comparable to its supervised counterpart.
arXiv Detail & Related papers (2024-03-31T12:22:23Z) - RISeg: Robot Interactive Object Segmentation via Body Frame-Invariant
Features [6.358423536732677]
We introduce a novel approach to correct inaccurate segmentation by using robot interaction and a designed body frame-invariant feature.
We demonstrate the effectiveness of our proposed interactive perception pipeline in accurately segmenting cluttered scenes by achieving an average object segmentation accuracy rate of 80.7%.
arXiv Detail & Related papers (2024-03-04T05:03:24Z) - For A More Comprehensive Evaluation of 6DoF Object Pose Tracking [22.696375341994035]
We contribute a unified benchmark to address the above problems.
For more accurate annotation of YCBV, we propose a multi-view multi-object global pose refinement method.
In experiments, we validate the precision and reliability of the proposed global pose refinement method with a realistic semi-synthesized dataset.
arXiv Detail & Related papers (2023-09-14T15:35:08Z) - Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
arXiv Detail & Related papers (2023-05-03T05:57:29Z) - Absolute Wrong Makes Better: Boosting Weakly Supervised Object Detection via Negative Deterministic Information [54.35679298764169]
Weakly supervised object detection (WSOD) is a challenging task, in which image-level labels are used to train an object detector.
This paper focuses on identifying and fully exploiting the deterministic information in WSOD.
We propose a negative deterministic information (NDI) based method for improving WSOD, namely NDI-WSOD.
arXiv Detail & Related papers (2022-04-21T12:55:27Z) - Unseen Object Instance Segmentation with Fully Test-time RGB-D Embeddings Adaptation [14.258456366985444]
A popular recent solution is to leverage RGB-D features learned from large-scale synthetic data and apply the model to unseen real-world scenarios.
We re-emphasize the adaptation process across Sim2Real domains in this paper.
We propose a framework to conduct the Fully Test-time RGB-D Embeddings Adaptation (FTEA) based on parameters of the BatchNorm layer.
arXiv Detail & Related papers (2022-04-21T02:35:20Z) - SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation [111.61261419566908]
Deep neural networks (DNNs) are usually trained on a closed set of semantic classes.
They are ill-equipped to handle previously-unseen objects.
Detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
arXiv Detail & Related papers (2021-04-30T07:58:19Z) - Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View [110.83289076967895]
We present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification in order to preserve the information of small objects of interest during the domain adaptation process.
The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework at KITTI 3D Object Detection Benchmark.
arXiv Detail & Related papers (2021-04-22T12:47:37Z) - Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z) - Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.