COCO-OLAC: A Benchmark for Occluded Panoptic Segmentation and Image Understanding
- URL: http://arxiv.org/abs/2409.12760v2
- Date: Sun, 12 Jan 2025 11:44:09 GMT
- Title: COCO-OLAC: A Benchmark for Occluded Panoptic Segmentation and Image Understanding
- Authors: Wenbo Wei, Jun Wang, Abhir Bhalerao
- Abstract summary: This paper proposes a new large-scale dataset named COCO-OLAC (COCO Occlusion Labels for All Computer Vision Tasks).
COCO-OLAC is derived from the COCO dataset by manually labelling images into three perceived occlusion levels.
We demonstrate that the proposed approach boosts the performance of the baseline model and achieves SOTA performance on the proposed COCO-OLAC dataset.
- Score: 8.261771972240778
- Abstract: To help address the occlusion problem in panoptic segmentation and image understanding, this paper proposes a new large-scale dataset named COCO-OLAC (COCO Occlusion Labels for All Computer Vision Tasks), which is derived from the COCO dataset by manually labelling images into three perceived occlusion levels. Using COCO-OLAC, we systematically assess and quantify the impact of occlusion on panoptic segmentation on samples having different levels of occlusion. Comparative experiments with SOTA panoptic models demonstrate that the presence of occlusion significantly affects performance, with higher occlusion levels resulting in notably poorer performance. Additionally, we propose a straightforward yet effective method as an initial attempt to leverage the occlusion annotation using contrastive learning to render a model that learns a more robust representation capturing different severities of occlusion. Experimental results demonstrate that the proposed approach boosts the performance of the baseline model and achieves SOTA performance on the proposed COCO-OLAC dataset.
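The abstract describes the occlusion-aware contrastive objective only at a high level, so the snippet below is a minimal PyTorch sketch of one plausible reading: a SupCon-style loss that treats images sharing the same perceived occlusion level as positives. The function name `occlusion_contrastive_loss`, the temperature value, and the use of image-level embeddings are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch (not the paper's implementation): supervised-contrastive
# loss driven by COCO-OLAC-style occlusion-level labels.
import torch
import torch.nn.functional as F


def occlusion_contrastive_loss(embeddings: torch.Tensor,
                               occlusion_levels: torch.Tensor,
                               temperature: float = 0.1) -> torch.Tensor:
    """SupCon-style loss: images with the same occlusion level are positives.

    embeddings: (B, D) image-level features from the segmentation backbone.
    occlusion_levels: (B,) integer labels, e.g. 0 = low, 1 = medium, 2 = high.
    """
    z = F.normalize(embeddings, dim=1)            # unit-length features
    sim = z @ z.t() / temperature                 # (B, B) pairwise similarities
    not_self = ~torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    # Positives: other samples in the batch with the same occlusion level.
    pos_mask = (occlusion_levels.unsqueeze(0) == occlusion_levels.unsqueeze(1)) & not_self

    # Log-probability of each pair under a softmax over all non-self pairs.
    exp_sim = torch.exp(sim) * not_self
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True) + 1e-12)

    # Average the log-probability over positive pairs, for anchors that have any.
    pos_counts = pos_mask.sum(dim=1)
    has_pos = pos_counts > 0
    mean_log_prob_pos = (pos_mask * log_prob).sum(dim=1)[has_pos] / pos_counts[has_pos]
    return -mean_log_prob_pos.mean()


# Example usage with random features for a batch of 8 images:
feats = torch.randn(8, 128)
levels = torch.randint(0, 3, (8,))
loss = occlusion_contrastive_loss(feats, levels)
```

In practice such a term would typically be added to the panoptic segmentation loss with a small weight, so the occlusion labels shape the learned representation without overriding the segmentation objective.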
Related papers
- CMU-Flownet: Exploring Point Cloud Scene Flow Estimation in Occluded Scenario [10.852258389804984]
Occlusions hinder point cloud frame alignment in LiDAR data, a challenge inadequately addressed by scene flow models.
We introduce the Correlation Matrix Upsampling Flownet (CMU-Flownet), incorporating an occlusion estimation module within its cost volume layer.
CMU-Flownet establishes state-of-the-art performance on the occluded FlyingThings3D and KITTI datasets.
arXiv Detail & Related papers (2024-04-16T13:47:21Z) - Discrepancy-based Active Learning for Weakly Supervised Bleeding Segmentation in Wireless Capsule Endoscopy Images [36.39723547760312]
This paper proposes a new Discrepancy-basEd Active Learning approach to bridge the gap between CAMs and ground truths with a few annotations.
Specifically, to reduce annotation effort, we design a novel discrepancy decoder and a CAMPUS criterion to replace the noisy CAMs with accurate model predictions and a few human labels.
Our method outperforms state-of-the-art active learning methods and reaches performance comparable to models trained on fully annotated datasets while labeling only 10% of the training data.
arXiv Detail & Related papers (2023-08-09T15:04:17Z) - Linking data separation, visual separation, and classifier performance using pseudo-labeling by contrastive learning [125.99533416395765]
We argue that the performance of the final classifier depends on the data separation present in the latent space and visual separation present in the projection.
We demonstrate our results on the classification of five challenging real-world image datasets of human intestinal parasites using only 1% supervised samples.
arXiv Detail & Related papers (2023-02-06T10:01:38Z) - Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective [51.70661197256033]
We propose ARCO, a semi-supervised contrastive learning framework with stratified group theory for medical image segmentation.
We first propose building ARCO through the concept of variance-reduced estimation and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks.
We experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, with different label settings.
arXiv Detail & Related papers (2023-02-03T13:50:25Z) - RCPS: Rectified Contrastive Pseudo Supervision for Semi-Supervised Medical Image Segmentation [26.933651788004475]
We propose a novel semi-supervised segmentation method named Rectified Contrastive Pseudo Supervision (RCPS).
RCPS combines a rectified pseudo supervision and voxel-level contrastive learning to improve the effectiveness of semi-supervised segmentation.
Experimental results reveal that the proposed method yields better segmentation performance compared with the state-of-the-art methods in semi-supervised medical image segmentation.
arXiv Detail & Related papers (2023-01-13T12:03:58Z) - Holistic Guidance for Occluded Person Re-Identification [7.662745552551165]
In real-world video surveillance applications, person re-identification (ReID) suffers from the effects of occlusions and detection errors.
We introduce a novel Holistic Guidance (HG) method that relies only on person identity labels.
Our proposed student-teacher framework is trained to address the problem by matching the distributions of between- and within-class distances (DCDs) of occluded samples with those of holistic (non-occluded) samples.
In addition to this, a joint generative-discriminative backbone is trained with a denoising autoencoder, allowing the system to
arXiv Detail & Related papers (2021-04-13T21:50:29Z) - Towards Unbiased COVID-19 Lesion Localisation and Segmentation via Weakly Supervised Learning [66.36706284671291]
We propose a data-driven framework supervised by only image-level labels to support unbiased lesion localisation.
The framework can explicitly separate potential lesions from original images, with the help of a generative adversarial network and a lesion-specific decoder.
arXiv Detail & Related papers (2021-03-01T06:05:49Z) - Gradient-Induced Co-Saliency Detection [81.54194063218216]
Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images.
In this paper, inspired by human behavior, we propose a gradient-induced co-saliency detection method.
arXiv Detail & Related papers (2020-04-28T08:40:55Z) - Peeking into occluded joints: A novel framework for crowd pose estimation [88.56203133287865]
OPEC-Net is an Image-Guided Progressive GCN module that estimates invisible joints from an inference perspective.
OCPose is the most complex Occluded Pose dataset with respect to average IoU between adjacent instances.
arXiv Detail & Related papers (2020-03-23T19:32:40Z) - Towards High Performance Human Keypoint Detection [87.1034745775229]
We find that context information plays an important role in reasoning human body configuration and invisible keypoints.
Inspired by this, we propose a cascaded context mixer (CCM) which efficiently integrates spatial and channel context information.
To maximize CCM's representation capability, we develop a hard-negative person detection mining strategy and a joint-training strategy.
We present several sub-pixel refinement techniques for postprocessing keypoint predictions to improve detection accuracy.
arXiv Detail & Related papers (2020-02-03T02:24:51Z)