CV 3315 Is All You Need : Semantic Segmentation Competition
- URL: http://arxiv.org/abs/2206.12571v1
- Date: Sat, 25 Jun 2022 06:27:57 GMT
- Title: CV 3315 Is All You Need : Semantic Segmentation Competition
- Authors: Akide Liu, Zihan Wang
- Abstract summary: This competition focuses on Urban-Sense Segmentation based on the vehicle camera view.
The highly class-imbalanced Urban-Sense image dataset challenges the existing solutions.
Deep convolutional neural network-based semantic segmentation methods have become flexible solutions applicable to real-world applications.
- Score: 14.818852884385015
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This competition focuses on Urban-Sense Segmentation based on the vehicle
camera view. The highly class-imbalanced Urban-Sense image dataset challenges the
existing solutions and further studies. Deep convolutional neural network-based
semantic segmentation methods, such as encoder-decoder architectures and
multi-scale and pyramid-based approaches, have become flexible solutions applicable
to real-world applications. In this competition, we mainly review the
literature and conduct experiments on transformer-driven methods, especially
SegFormer, to achieve an optimal trade-off between performance and efficiency.
For example, SegFormer-B0 achieved 74.6% mIoU with the smallest FLOPS, 15.6G,
and the largest model, SegFormer-B5, achieved 80.2% mIoU. According to multiple
factors, including individual case failure analysis, individual class
performance, training pressure, and efficiency estimation, the final candidate
model for the competition is SegFormer-B2, with 50.6 GFLOPS and 78.5% mIoU
evaluated on the testing set. Check out our code implementation at
https://vmv.re/cv3315.
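The mIoU figures quoted above are per-class intersection-over-union scores averaged over all classes. As a minimal sketch of how this metric is typically computed from integer label maps (the exact evaluation code of the competition is not shown here; the `mean_iou` helper below is hypothetical):

```python
import numpy as np

def mean_iou(pred, gt, num_classes, ignore_index=255):
    """Compute per-class IoU and mIoU from integer label maps.

    pred, gt: integer arrays of the same shape (class index per pixel).
    ignore_index: label value excluded from evaluation (255 is a common
    convention for void pixels in urban-scene datasets, an assumption here).
    """
    mask = gt != ignore_index
    pred, gt = pred[mask], gt[mask]
    # Confusion matrix: rows = ground truth, columns = prediction.
    cm = np.bincount(num_classes * gt + pred,
                     minlength=num_classes ** 2).reshape(num_classes,
                                                         num_classes)
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    # Classes absent from both prediction and ground truth are excluded
    # from the mean via NaN.
    iou = np.where(union > 0, tp / np.maximum(union, 1), np.nan)
    return iou, np.nanmean(iou)

# Toy example with 3 classes on a 2x3 label map.
pred = np.array([[0, 0, 1], [1, 1, 2]])
gt   = np.array([[0, 1, 1], [1, 1, 2]])
per_class, miou = mean_iou(pred, gt, num_classes=3)
# per_class -> [0.5, 0.75, 1.0], miou -> 0.75
```

A reported score such as "78.5% mIoU" corresponds to this average taken over the full testing set, so a single dominant class cannot mask poor performance on rare classes, which is why the metric suits the class-imbalanced setting described above.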
Related papers
- Hybrid-Segmentor: A Hybrid Approach to Automated Fine-Grained Crack Segmentation in Civil Infrastructure [52.2025114590481]
We introduce Hybrid-Segmentor, an encoder-decoder based approach that is capable of extracting both fine-grained local and global crack features.
This allows the model to improve its generalization capabilities in distinguishing various types of shapes, surfaces, and sizes of cracks.
The proposed model outperforms existing benchmark models across 5 quantitative metrics (accuracy 0.971, precision 0.804, recall 0.744, F1-score 0.770, and IoU score 0.630), achieving state-of-the-art status.
arXiv Detail & Related papers (2024-09-04T16:47:16Z) - Technical Report of 2023 ABO Fine-grained Semantic Segmentation
Competition [0.3626013617212667]
We describe the technical details of our submission to the 2023 ABO Fine-grained Semantic Competition by Team "Zeyu_Dong".
The task is to predict the semantic labels for the convex gradient of five categories, which consist of high-quality, standardized 3D models of real products available for purchase online.
The appropriate method helps us rank 3rd place in the Dev phase of the 2023 ICCV 3DVeComm Workshop Challenge.
arXiv Detail & Related papers (2023-09-30T16:32:22Z) - 3rd Place Solution for PVUW2023 VSS Track: A Large Model for Semantic
Segmentation on VSPW [68.56017675820897]
In this paper, we introduce 3rd place solution for PVUW2023 VSS track.
We have explored various image-level visual backbones and segmentation heads to tackle the problem of video semantic segmentation.
arXiv Detail & Related papers (2023-06-04T07:50:38Z) - The Second Place Solution for ICCV2021 VIPriors Instance Segmentation
Challenge [6.087398773657721]
The Visual Inductive Priors (VIPriors) for Data-Efficient Computer Vision challenge asks competitors to train models from scratch in a data-deficient setting.
We introduce the technical details of our submission to the ICCV 2021 VIPriors instance segmentation challenge.
Our approach achieves 40.2% AP@0.50:0.95 on the test set of the ICCV 2021 VIPriors instance segmentation challenge.
arXiv Detail & Related papers (2021-12-02T09:23:02Z) - Dynamically pruning segformer for efficient semantic segmentation [8.29672153078638]
We seek to design a lightweight SegFormer for efficient semantic segmentation.
Based on the observation that neurons in SegFormer layers exhibit large variances across different images, we propose a dynamic gated linear layer.
We also introduce two-stage knowledge distillation to transfer the knowledge within the original teacher to the pruned student network.
arXiv Detail & Related papers (2021-11-18T03:34:28Z) - DANCE: DAta-Network Co-optimization for Efficient Segmentation Model
Training and Inference [85.02494022662505]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference.
It integrates automated data slimming which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity.
Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
arXiv Detail & Related papers (2021-07-16T04:58:58Z) - Deep Gaussian Processes for Few-Shot Segmentation [66.08463078545306]
Few-shot segmentation is a challenging task, requiring the extraction of a generalizable representation from only a few annotated samples.
We propose a few-shot learner formulation based on Gaussian process (GP) regression.
Our approach sets a new state-of-the-art for 5-shot segmentation, with mIoU scores of 68.1 and 49.8 on PASCAL-5i and COCO-20i, respectively.
arXiv Detail & Related papers (2021-03-30T17:56:32Z) - Scaling Semantic Segmentation Beyond 1K Classes on a Single GPU [87.48110331544885]
We propose a novel training methodology to train and scale the existing semantic segmentation models.
We demonstrate a clear benefit of our approach on a dataset with 1284 classes, bootstrapped from LVIS and COCO annotations, with three times better mIoU than the DeeplabV3+ model.
arXiv Detail & Related papers (2020-12-14T13:12:38Z) - Objectness-Aware Few-Shot Semantic Segmentation [31.13009111054977]
We show how to increase overall model capacity to achieve improved performance.
We introduce objectness, which is class-agnostic and so not prone to overfitting.
Given only one annotated example of an unseen category, experiments show that our method outperforms state-of-art methods with respect to mIoU.
arXiv Detail & Related papers (2020-04-06T19:12:08Z) - Learning Fast and Robust Target Models for Video Object Segmentation [83.3382606349118]
Video object segmentation (VOS) is a highly challenging problem since the initial mask, defining the target object, is only given at test-time.
Most previous approaches fine-tune segmentation networks on the first frame, resulting in impractical frame-rates and risk of overfitting.
We propose a novel VOS architecture consisting of two network components.
arXiv Detail & Related papers (2020-02-27T21:58:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.