The Second-place Solution for CVPR VISION 23 Challenge Track 1 -- Data
Efficient Defect Detection
- URL: http://arxiv.org/abs/2306.14116v1
- Date: Sun, 25 Jun 2023 03:37:02 GMT
- Title: The Second-place Solution for CVPR VISION 23 Challenge Track 1 -- Data
Efficient Defect Detection
- Authors: Xian Tao, Zhen Qu, Hengliang Luo, Jianwen Han, Yonghao He, Danfeng
Liu, Chengkan Lv, Fei Shen, Zhengtao Zhang
- Abstract summary: The Vision Challenge Track 1 for Data-Efficient Defect Detection requires competitors to instance segment 14 industrial inspection datasets in a data-deficient setting.
This report introduces the technical details of the team Aoi-overfitting-Team for this challenge.
- Score: 3.4853769431047907
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Vision Challenge Track 1 for Data-Efficient Defect Detection requires
competitors to instance segment 14 industrial inspection datasets in a
data-deficient setting. This report introduces the technical details of the
team Aoi-overfitting-Team for this challenge. Our method focuses on the key
problem of segmentation quality of defect masks in scenarios with limited
training samples. Based on the Hybrid Task Cascade (HTC) instance segmentation
algorithm, we connect the transformer backbone (Swin-B) through composite
connections inspired by CBNetv2 to enhance the baseline results. Additionally,
we propose two model ensemble methods to further enhance the segmentation
effect: one incorporates semantic segmentation into instance segmentation,
while the other employs multi-instance segmentation fusion algorithms. Finally,
using multi-scale training and test-time augmentation (TTA), we achieve an
average mAP@0.50:0.95 of more than 48.49% and an average mAR@0.50:0.95 of
66.71% on the test set of the Data Efficient Defect Detection Challenge. The
code is available at https://github.com/love6tao/Aoi-overfitting-team
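The "@0.50:0.95" suffix on the metrics above follows the COCO-style convention of sweeping ten IoU thresholds from 0.50 to 0.95 in steps of 0.05 and averaging over them. The sketch below is a minimal illustration of that sweep for a single predicted mask and a single ground-truth mask. It is not the challenge's evaluation code and not a full mAP computation (which also ranks predictions by confidence and builds per-class precision-recall curves). The function names `mask_iou` and `matched_fraction` are hypothetical.

```python
def mask_iou(pred, gt):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


# COCO-style IoU thresholds: 0.50, 0.55, ..., 0.95 (ten values).
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def matched_fraction(pred, gt):
    """Fraction of the ten thresholds at which pred counts as a match for gt.

    Illustrative only: a real mAP also accounts for confidence ranking,
    multiple instances per image, and per-class averaging.
    """
    iou = mask_iou(pred, gt)
    return sum(iou >= t for t in IOU_THRESHOLDS) / len(IOU_THRESHOLDS)


# Toy example: the ground-truth defect covers 6 pixels, the prediction
# covers 4 of them and nothing else.
gt = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

print(mask_iou(pred, gt))          # IoU = 4 / 6
print(matched_fraction(pred, gt))  # matches at 0.50, 0.55, 0.60, 0.65 only
```

In this toy case the prediction passes only the four loosest thresholds, which shows why the averaged metric rewards precise mask boundaries rather than rough localization.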
Related papers
- Segment Every Out-of-Distribution Object [24.495734304922244]
This paper introduces a method to convert anomaly Score To segmentation Mask, called S2M, a simple and effective framework for OoD detection in semantic segmentation.
By transforming anomaly scores into prompts for a promptable segmentation model, S2M eliminates the need for threshold selection.
arXiv Detail & Related papers (2023-11-27T18:20:03Z)
- Instance Segmentation under Occlusions via Location-aware Copy-Paste Data Augmentation [8.335108002480068]
MMSports 2023 DeepSportRadar has introduced a dataset that focuses on segmenting human subjects within a basketball context.
This challenge demands the application of robust data augmentation techniques and wisely-chosen deep learning architectures.
Our work (ranked 1st in the competition) first proposes a novel data augmentation technique, capable of generating more training samples with wider distribution.
arXiv Detail & Related papers (2023-10-27T07:44:25Z)
- You Only Look at Once for Real-time and Generic Multi-Task [20.61477620156465]
A-YOLOM is an adaptive, real-time, and lightweight multi-task model.
We develop an end-to-end multi-task model with a unified and streamlined segmentation structure.
We achieve competitive results on the BDD100k dataset.
arXiv Detail & Related papers (2023-10-02T21:09:43Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranks 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection [59.10314662986463]
Cross-domain few-shot object detection aims to adapt object detectors in the target domain with a few annotated target data.
The proposed method achieves state-of-the-art performance on multiple benchmarks.
arXiv Detail & Related papers (2022-09-22T10:23:40Z)
- Reliable Shot Identification for Complex Event Detection via Visual-Semantic Embedding [72.9370352430965]
We propose a visual-semantic guided loss method for event detection in videos.
Motivated by curriculum learning, we introduce a negative elastic regularization term to start training the classifier with instances of high reliability.
An alternative optimization algorithm is developed to solve the proposed challenging non-net regularization problem.
arXiv Detail & Related papers (2021-10-12T11:46:56Z)
- Instance Segmentation Challenge Track Technical Report, VIPriors Workshop at ICCV 2021: Task-Specific Copy-Paste Data Augmentation Method for Instance Segmentation [0.0]
Copy-Paste has proven to be a very effective data augmentation for instance segmentation.
We applied additional data augmentation techniques including RandAugment and GridMask.
We reached 0.477 AP@0.50:0.95 on the test set by adding the validation set to the training data.
arXiv Detail & Related papers (2021-10-01T15:03:53Z)
- Generative Partial Visual-Tactile Fused Object Clustering [81.17645983141773]
We propose a Generative Partial Visual-Tactile Fused (i.e., GPVTF) framework for object clustering.
A conditional cross-modal clustering generative adversarial network is then developed to synthesize one modality conditioning on the other modality.
To this end, two pseudo-label based KL-divergence losses are employed to update the corresponding modality-specific encoders.
arXiv Detail & Related papers (2020-12-28T02:37:03Z)
- The Devil is in Classification: A Simple Framework for Long-tail Object Detection and Instance Segmentation [93.17367076148348]
We investigate the performance drop of the state-of-the-art two-stage instance segmentation model Mask R-CNN on the recent long-tail LVIS dataset.
We unveil that a major cause is the inaccurate classification of object proposals.
We propose a simple calibration framework to more effectively alleviate classification head bias with a bi-level class balanced sampling approach.
arXiv Detail & Related papers (2020-07-23T12:49:07Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.