Related papers: Instance Segmentation under Occlusions via Location-aware Copy-Paste Data Augmentation

Instance Segmentation under Occlusions via Location-aware Copy-Paste Data Augmentation

URL: http://arxiv.org/abs/2310.17949v2
Date: Tue, 21 Nov 2023 05:55:10 GMT
Title: Instance Segmentation under Occlusions via Location-aware Copy-Paste Data Augmentation
Authors: Son Nguyen, Mikel Lainsa, Hung Dao, Daeyoung Kim, Giang Nguyen
Abstract summary: MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context. This challenge demands the application of robust data augmentation techniques and wisely-chosen deep learning architectures. Our work (ranked 1st in the competition) first proposes a novel data augmentation technique, capable of generating more training samples with wider distribution.
Score: 8.335108002480068
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Occlusion is a long-standing problem in computer vision, particularly in instance segmentation. ACM MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context and a specialized evaluation metric for occlusion scenarios. Given the modest size of the dataset and the highly deformable nature of the objects to be segmented, this challenge demands the application of robust data augmentation techniques and wisely-chosen deep learning architectures. Our work (ranked 1st in the competition) first proposes a novel data augmentation technique, capable of generating more training samples with wider distribution. Then, we adopt a new architecture - Hybrid Task Cascade (HTC) framework with CBNetV2 as backbone and MaskIoU head to improve segmentation performance. Furthermore, we employ a Stochastic Weight Averaging (SWA) training strategy to improve the model's generalization. As a result, we achieve a remarkable occlusion score (OM) of 0.533 on the challenge dataset, securing the top-1 position on the leaderboard. Source code is available at this https://github.com/nguyendinhson-kaist/MMSports23-Seg-AutoID.

Related papers

Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance (UOIS) is crucial for autonomous robots operating in unstructured environments. We propose UOIS-SAM, a data-efficient solution for the UOIS task. UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
AMAES: Augmented Masked Autoencoder Pretraining on Public Brain MRI Data for 3D-Native Segmentation [2.0749231618270803]
This study investigates the impact of self-supervised pretraining of 3D semantic segmentation models on a large-scale, domain-specific dataset. We introduce BRAINS-45K, a dataset of 44,756 brain MRI volumes from public sources, the largest public dataset available.
arXiv Detail & Related papers (2024-08-01T15:27:48Z)
LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation [2.0901574458380403]
We propose a new lightweight but efficient model, namely LiteNeXt, for medical image segmentation. LiteNeXt is trained from scratch with small amount of parameters (0.71M) and Giga Floating Point Operations Per Second (0.42).
arXiv Detail & Related papers (2024-04-04T01:59:19Z)
A Lightweight Clustering Framework for Unsupervised Semantic Segmentation [28.907274978550493]
Unsupervised semantic segmentation aims to categorize each pixel in an image into a corresponding class without the use of annotated data. We propose a lightweight clustering framework for unsupervised semantic segmentation. Our framework achieves state-of-the-art results on PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2023-11-30T15:33:42Z)
The Second-place Solution for CVPR VISION 23 Challenge Track 1 -- Data Effificient Defect Detection [3.4853769431047907]
The Vision Challenge Track 1 for Data-Effificient Defect Detection requires competitors to instance segment 14 industrial inspection datasets in a data-defificient setting. This report introduces the technical details of the team Aoi-overfifitting-Team for this challenge.
arXiv Detail & Related papers (2023-06-25T03:37:02Z)
Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage. We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets. By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains. We reconcile the generalization and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images. A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation [94.16816278191477]
We present a framework for semi-adaptive and domain-supervised semantic segmentation. It is enhanced by self-supervised monocular depth estimation trained only on unlabeled image sequences. We validate the proposed model on the Cityscapes dataset.
arXiv Detail & Related papers (2021-08-28T01:33:38Z)
1st Place Solutions for OpenImage2019 -- Object Detection and Instance Segmentation [116.25081559037872]
This article introduces the solutions of the two champion teams, MMfruit' for the detection track and MMfruitSeg' for the segmentation track, in OpenImage Challenge 2019. It is commonly known that for an object detector, the shared feature at the end of the backbone is not appropriate for both classification and regression. We propose the Decoupling Head (DH) to disentangle the object classification and regression via the self-learned optimal feature extraction.
arXiv Detail & Related papers (2020-03-17T06:45:07Z)
Learning Fast and Robust Target Models for Video Object Segmentation [83.3382606349118]
Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time. Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting. We propose a novel VOS architecture consisting of two network components.
arXiv Detail & Related papers (2020-02-27T21:58:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.