Data-Efficient Instance Segmentation with a Single GPU
- URL: http://arxiv.org/abs/2110.00242v1
- Date: Fri, 1 Oct 2021 07:36:20 GMT
- Title: Data-Efficient Instance Segmentation with a Single GPU
- Authors: Pengyu Chen, Wanhua Li, Jiwen Lu
- Abstract summary: We introduce the data-efficient instance segmentation method we used in the 2021 VIPriors Instance Segmentation Challenge.
Our solution is a modified version of Swin Transformer, built on mmdetection, a powerful detection toolbox.
Our method achieved an AP@0.50:0.95 (medium) of 0.592, which ranks second among all contestants.
- Score: 88.31338435907304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Not everyone is wealthy enough to have hundreds of GPUs or TPUs, so we have to
find a way out. In this paper, we introduce the data-efficient instance
segmentation method we used in the 2021 VIPriors Instance Segmentation
Challenge. Our solution is a modified version of Swin Transformer, built on
mmdetection, a powerful detection toolbox. To address the shortage of training data,
we use data augmentation, including random flip and multiscale training, to
train our model. During inference, multiscale fusion is used to boost
performance. We use only a single GPU throughout the training and testing
stages. In the end, our team, THU_IVG_2018, achieved 0.366
AP@0.50:0.95 on the test set, which is competitive with other top-ranking
methods while using only one GPU. Moreover, our method achieved an
AP@0.50:0.95 (medium) of 0.592, which ranks second among all contestants.
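The training-time augmentations described above (random flip plus multiscale training) can be sketched as an mmdetection-style pipeline config. This is an illustrative sketch following mmdetection 2.x conventions, not the authors' actual configuration; the scale range and normalization values are assumptions.

```python
# Sketch of an mmdetection-style training pipeline with the augmentations
# mentioned in the abstract: multiscale training and random flip.
# Field names follow mmdetection 2.x; exact keys may differ across versions.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    # Multiscale training: the image scale is sampled from a range each iteration.
    dict(type='Resize',
         img_scale=[(1333, 640), (1333, 800)],
         multiscale_mode='range',
         keep_ratio=True),
    # Random horizontal flip with probability 0.5.
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize',
         mean=[123.675, 116.28, 103.53],
         std=[58.395, 57.12, 57.375],
         to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
```

At test time, the analogous trick (the "multiscale fusion" the abstract mentions) resizes each image to several scales, runs inference on each, and merges the detections, typically via mmdetection's `MultiScaleFlipAug` test pipeline.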
Related papers
- The Second-place Solution for CVPR VISION 23 Challenge Track 1 -- Data
Efficient Defect Detection [3.4853769431047907]
The Vision Challenge Track 1 for Data-Efficient Defect Detection requires competitors to instance segment 14 industrial inspection datasets in a data-deficient setting.
This report introduces the technical details of the team Aoi-overfitting-Team for this challenge.
arXiv Detail & Related papers (2023-06-25T03:37:02Z) - DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP
Training [13.953918004371493]
DisCo-CLIP is a memory-efficient CLIP training approach.
DisCo-CLIP can enable contrastive training of a ViT-B/32 model with a batch size of 32K or 196K.
arXiv Detail & Related papers (2023-04-17T17:58:21Z) - Cramming: Training a Language Model on a Single GPU in One Day [64.18297923419627]
Recent trends in language modeling have focused on increasing performance through scaling.
We investigate the downstream performance achievable with a transformer-based language model trained completely from scratch with masked language modeling for a single day on a single consumer GPU.
We provide evidence that even in this constrained setting, performance closely follows scaling laws observed in large-compute settings.
arXiv Detail & Related papers (2022-12-28T18:59:28Z) - Learning Tracking Representations via Dual-Branch Fully Transformer
Networks [82.21771581817937]
We present a Siamese-like Dual-branch network based on solely Transformers for tracking.
We extract a feature vector for each patch based on its matching results with others within an attention window.
The method achieves better or comparable results as the best-performing methods.
arXiv Detail & Related papers (2021-12-05T13:44:33Z) - Adaptive Elastic Training for Sparse Deep Learning on Heterogeneous
Multi-GPU Servers [65.60007071024629]
We show experimentally that Adaptive SGD outperforms four state-of-the-art solutions in time-to-accuracy.
arXiv Detail & Related papers (2021-10-13T20:58:15Z) - Instance Segmentation Challenge Track Technical Report, VIPriors
Workshop at ICCV 2021: Task-Specific Copy-Paste Data Augmentation Method for
Instance Segmentation [0.0]
Copy-Paste has proven to be a very effective data augmentation for instance segmentation.
We applied additional data augmentation techniques including RandAugment and GridMask.
We reached 0.477 AP@0.50:0.95 with the test set by adding the validation set to the training data.
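The Copy-Paste augmentation summarized above can be illustrated with a minimal sketch: pixels of one instance, selected by its binary mask, are pasted onto another image. The `copy_paste` helper below is hypothetical, written only to show the core idea, not the challenge team's code.

```python
import numpy as np

def copy_paste(src_img, src_mask, dst_img):
    """Minimal Copy-Paste augmentation sketch (hypothetical helper):
    overwrite destination pixels wherever the source instance mask is set."""
    assert src_img.shape == dst_img.shape
    out = dst_img.copy()
    m = src_mask.astype(bool)
    out[m] = src_img[m]   # paste the masked instance pixels
    return out, m         # augmented image and the pasted instance's mask

# Toy usage: paste a 2x2 "instance" from a white source onto a black destination.
src = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
dst = np.zeros((4, 4, 3), dtype=np.uint8)
pasted, pasted_mask = copy_paste(src, mask, dst)
```

A full implementation would also translate or rescale the pasted instance, update the destination image's annotations, and occlude any masks the new instance covers.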
arXiv Detail & Related papers (2021-10-01T15:03:53Z) - Efficient Large-Scale Language Model Training on GPU Clusters [19.00915720435389]
Large language models have led to state-of-the-art accuracies across a range of tasks.
Memory capacity is limited, making it impossible to fit large models on a single GPU.
The number of compute operations required to train these models can result in unrealistically long training times.
arXiv Detail & Related papers (2021-04-09T16:43:11Z) - Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU [87.48110331544885]
We propose a novel training methodology to train and scale the existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
arXiv Detail & Related papers (2020-12-14T13:12:38Z) - Human-Centered Unsupervised Segmentation Fusion [0.0]
We introduce a new segmentation fusion model that is based on K-Modes clustering.
Results obtained from publicly available datasets with human ground truth segmentations clearly show that our model outperforms the state-of-the-art on human segmentations.
arXiv Detail & Related papers (2020-07-22T12:18:31Z) - Kernel methods through the roof: handling billions of points efficiently [94.31450736250918]
Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used in large scale problems.
Recent advances have shown the benefits of a number of algorithmic ideas, for example combining optimization, numerical linear algebra and random projections.
Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware.
arXiv Detail & Related papers (2020-06-18T08:16:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed and is not responsible for any consequences of its use.