Task-Specific Data Augmentation and Inference Processing for VIPriors
Instance Segmentation Challenge
- URL: http://arxiv.org/abs/2211.11282v1
- Date: Mon, 21 Nov 2022 09:15:30 GMT
- Title: Task-Specific Data Augmentation and Inference Processing for VIPriors
Instance Segmentation Challenge
- Authors: Bo Yan, Xingran Zhao, Yadong Li, Hongbin Wang
- Abstract summary: We develop a task-specific data augmentation strategy and a task-specific inference processing strategy.
We demonstrate the applicability of the proposed method on the VIPriors Instance Segmentation Challenge.
Experimental results show that the proposed method achieves a competitive result on the test set of the 2022 VIPriors Instance Segmentation Challenge.
- Score: 9.43662534739698
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instance segmentation is widely applied in image editing, image analysis,
autonomous driving, etc. However, insufficient data is a common problem in
practical applications. The Visual Inductive Priors (VIPriors) Instance
Segmentation Challenge focuses on this problem. The VIPriors for Data-Efficient
Computer Vision Challenges ask competitors to train models from scratch in a
data-deficient setting, though some visual inductive priors may be
used. To address the VIPriors instance segmentation problem, we
designed a Task-Specific Data Augmentation (TS-DA) strategy and a Task-Specific Inference
Processing (TS-IP) strategy. The main purpose of the task-specific data augmentation
strategy is to tackle the data-deficient problem, and to make the most
of visual inductive priors we designed the task-specific inference processing
strategy. We demonstrate the applicability of the proposed method on the VIPriors
Instance Segmentation Challenge. The segmentation model used is a Hybrid Task
Cascade based detector on a Swin-Base based CBNetV2 backbone. Experimental
results show that the proposed method achieves a competitive result on
the test set of the 2022 VIPriors Instance Segmentation Challenge, with 0.531
AP@0.50:0.95.
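The abstract does not detail TS-DA, but copy-paste style augmentation is a common baseline for data-deficient instance segmentation. The following is a minimal sketch of that generic idea only; the function name, parameters, and the copy-paste choice itself are illustrative assumptions, not the paper's method:

```python
import numpy as np

def paste_instance(target, source, mask, top_left):
    """Generic copy-paste augmentation step (illustrative, not the paper's TS-DA).

    Pixels of `source` where `mask` is True are pasted into a copy of
    `target` at offset `top_left`; `target` and `source` are (H, W, C)
    arrays and `mask` is an (h, w) boolean array matching `source`.
    """
    out = target.copy()
    y, x = top_left
    h, w = mask.shape
    # boolean indexing on the slice writes through to `out`
    out[y:y + h, x:x + w][mask] = source[mask]
    return out
```

Pasting annotated instances into new backgrounds this way multiplies the effective number of training samples without collecting new data.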
Related papers
- MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation [8.727456619750983]
The strategic integration of a visual prior into the training dataset emerges as a potential solution to enhance congruity with the testing data distribution.
Our empirical evaluations underscore the efficacy of MISS, demonstrating commendable performance in scenarios characterized by limited data availability and memory constraints.
arXiv Detail & Related papers (2024-03-18T08:52:23Z)
- Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes [7.765333471208582]
In Visual Inductive Priors challenge (VIPriors2023), participants must train a model capable of precisely locating individuals on a basketball court.
We propose a memory-efficiency instance framework based on visual inductive prior flow propagation.
Experiments demonstrate our model's promising performance even under limited data and memory constraints.
arXiv Detail & Related papers (2024-03-18T08:44:40Z)
- Instance Segmentation under Occlusions via Location-aware Copy-Paste Data Augmentation [8.335108002480068]
MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context.
This challenge demands the application of robust data augmentation techniques and wisely-chosen deep learning architectures.
Our work (ranked 1st in the competition) first proposes a novel data augmentation technique, capable of generating more training samples with wider distribution.
arXiv Detail & Related papers (2023-10-27T07:44:25Z)
- The Second-place Solution for CVPR VISION 23 Challenge Track 1 -- Data Efficient Defect Detection [3.4853769431047907]
The Vision Challenge Track 1 for Data-Efficient Defect Detection requires competitors to instance segment 14 industrial inspection datasets in a data-deficient setting.
This report introduces the technical details of the team Aoi-overfitting-Team for this challenge.
arXiv Detail & Related papers (2023-06-25T03:37:02Z)
- Causal Scene BERT: Improving object detection by searching for challenging groups of data [125.40669814080047]
Computer vision applications rely on learning-based perception modules parameterized with neural networks for tasks like object detection.
These modules frequently have low expected error overall but high error on atypical groups of data due to biases inherent in the training process.
Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes.
arXiv Detail & Related papers (2022-02-08T05:14:16Z)
- The Second Place Solution for ICCV2021 VIPriors Instance Segmentation Challenge [6.087398773657721]
The Visual Inductive Priors (VIPriors) for Data-Efficient Computer Vision challenges ask competitors to train models from scratch in a data-deficient setting.
We introduce the technical details of our submission to the ICCV 2021 VIPriors instance segmentation challenge.
Our approach achieves 40.2% AP@0.50:0.95 on the test set of the ICCV 2021 VIPriors instance segmentation challenge.
arXiv Detail & Related papers (2021-12-02T09:23:02Z)
- DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference.
It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity.
Experiments and ablation studies demonstrate that DANCE achieves an "all-win" in efficient segmentation.
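The data-slimming idea above can be caricatured in a few lines: score each image's spatial complexity and downsample the simple ones. This is only a crude stand-in for DANCE's learned, loss-guided slimming; the complexity proxy (mean gradient magnitude) and the threshold are assumptions:

```python
import numpy as np

def spatial_complexity(img):
    # crude proxy for spatial complexity: mean absolute intensity
    # gradient over both image axes
    f = img.astype(float)
    return np.abs(np.diff(f, axis=0)).mean() + np.abs(np.diff(f, axis=1)).mean()

def adaptive_downsample(img, threshold=10.0):
    # keep detailed images at full resolution, halve the simple ones
    return img if spatial_complexity(img) >= threshold else img[::2, ::2]
```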
arXiv Detail & Related papers (2021-07-16T04:58:58Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
arXiv Detail & Related papers (2021-06-06T15:02:11Z)
- Unsupervised Semantic Segmentation by Contrasting Object Mask Proposals [78.12377360145078]
We introduce a novel two-step framework that adopts a predetermined prior in a contrastive optimization objective to learn pixel embeddings.
This marks a large deviation from existing works that relied on proxy tasks or end-to-end clustering.
In particular, when fine-tuning the learned representations using just 1% of labeled examples on PASCAL, we outperform supervised ImageNet pre-training by 7.1% mIoU.
arXiv Detail & Related papers (2021-02-11T18:54:47Z)
- The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation [93.17367076148348]
We investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
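As an illustration of the class-balancing idea (not the paper's exact bi-level scheme), a sampler can first pick a class uniformly and only then pick an instance within that class, so tail classes are seen as often as head classes; the names here are illustrative:

```python
import random

def class_balanced_sample(labels, n, rng=random):
    # first level: choose a class uniformly at random;
    # second level: choose an instance of that class --
    # tail classes get the same exposure as head classes
    by_class = {}
    for idx, c in enumerate(labels):
        by_class.setdefault(c, []).append(idx)
    classes = list(by_class)
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(n)]
```

On a long-tailed label set this draws roughly equal numbers of samples per class regardless of class frequency.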
arXiv Detail & Related papers (2020-07-23T12:49:07Z)
- Naive-Student: Leveraging Semi-Supervised Learning in Video Sequences for Urban Scene Segmentation [57.68890534164427]
In this work, we ask if we may leverage semi-supervised learning in unlabeled video sequences and extra images to improve the performance on urban scene segmentation.
We simply predict pseudo-labels for the unlabeled data and train subsequent models with both human-annotated and pseudo-labeled data.
Our Naive-Student model, trained with such simple yet effective iterative semi-supervised learning, attains state-of-the-art results at all three Cityscapes benchmarks.
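The iterative self-training loop described above boils down to: train on labeled data, pseudo-label the unlabeled pool, retrain on the union, repeat. A toy version with a one-dimensional nearest-centroid "model" (every name here is illustrative; the paper uses full segmentation networks and video sequences):

```python
from statistics import mean

def fit(xs, ys):
    # "model" = per-class centroid of 1-D features
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}

def predict(model, xs):
    # assign each point to the nearest class centroid
    return [min(model, key=lambda c: abs(x - model[c])) for x in xs]

def naive_student(x_lab, y_lab, x_unlab, rounds=2):
    # iterate: pseudo-label the unlabeled pool, retrain on the union
    model = fit(x_lab, y_lab)
    for _ in range(rounds):
        pseudo = predict(model, x_unlab)
        model = fit(x_lab + x_unlab, y_lab + pseudo)
    return model
```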
arXiv Detail & Related papers (2020-05-20T18:00:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.