APRIL-GAN: A Zero-/Few-Shot Anomaly Classification and Segmentation
Method for CVPR 2023 VAND Workshop Challenge Tracks 1&2: 1st Place on
Zero-shot AD and 4th Place on Few-shot AD
- URL: http://arxiv.org/abs/2305.17382v3
- Date: Wed, 11 Oct 2023 07:02:45 GMT
- Title: APRIL-GAN: A Zero-/Few-Shot Anomaly Classification and Segmentation
Method for CVPR 2023 VAND Workshop Challenge Tracks 1&2: 1st Place on
Zero-shot AD and 4th Place on Few-shot AD
- Authors: Xuhai Chen, Yue Han, Jiangning Zhang
- Abstract summary: We present our solution for the Zero/Few-shot Track of the Visual Anomaly and Novelty Detection (VAND) 2023 Challenge.
Our method achieved first place in the zero-shot track, especially excelling in segmentation.
In the few-shot track, we secured the fourth position overall, with our classification F1 score of 0.8687 ranking first among all participating teams.
- Score: 21.493718012180643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this technical report, we briefly introduce our solution for the
Zero/Few-shot Track of the Visual Anomaly and Novelty Detection (VAND) 2023
Challenge. For industrial visual inspection, building a single model that can
be rapidly adapted to numerous categories with few or no normal reference
images is a promising research direction, primarily because of the vast
variety of product types. For the zero-shot track, we propose a solution based
on the CLIP model, adding extra linear layers that map image features into the
joint embedding space so that they can be compared with the text features to
generate anomaly maps. In addition, when reference images are available, we
utilize multiple memory banks to store their features and compare them with
the features of test images at test time. In this challenge, our method
achieved first place in the zero-shot track, excelling in particular at
segmentation with an F1 score improvement of 0.0489 over the second-ranked
participant. Furthermore, in the few-shot track, we secured fourth position
overall, with our classification F1 score of 0.8687 ranking first among all
participating teams.
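The two mechanisms described in the abstract can be sketched compactly. The snippet below is a minimal illustration, assuming CLIP patch features and text embeddings for "normal"/"anomalous" prompts have already been extracted; the layer shapes, prompt handling, and function names are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

# Zero-shot branch: a learned linear layer maps intermediate image (patch)
# features into CLIP's joint embedding space, where they can be compared
# with text embeddings to produce a per-patch anomaly map.
class PatchProjector(torch.nn.Module):
    def __init__(self, feat_dim: int, embed_dim: int):
        super().__init__()
        self.proj = torch.nn.Linear(feat_dim, embed_dim)

    def forward(self, patch_feats: torch.Tensor) -> torch.Tensor:
        # patch_feats: (B, N, feat_dim) -> (B, N, embed_dim), L2-normalized
        return F.normalize(self.proj(patch_feats), dim=-1)

def anomaly_map(patch_embeds: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
    """patch_embeds: (B, N, D); text_embeds: (2, D) for [normal, anomalous] prompts.
    Returns per-patch anomaly probabilities of shape (B, N)."""
    logits = 100.0 * patch_embeds @ text_embeds.t()  # (B, N, 2) cosine logits
    return logits.softmax(dim=-1)[..., 1]            # probability of "anomalous"

# Few-shot branch: features of the normal reference images are stored in a
# memory bank; a test patch is anomalous if it is far from every stored patch.
def memory_bank_scores(test_feats: torch.Tensor, bank: torch.Tensor) -> torch.Tensor:
    """test_feats: (N, D) patch features of a test image; bank: (M, D) stored features.
    Returns (N,) anomaly scores: distance to the nearest normal reference patch."""
    sim = F.normalize(test_feats, dim=-1) @ F.normalize(bank, dim=-1).t()  # (N, M)
    return 1.0 - sim.max(dim=1).values  # low best-match similarity -> high score
```

In the few-shot setting the report pairs such reference-based scores with the zero-shot maps; which CLIP layers feed the linear heads and how the text prompts are phrased are details of the original paper.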
Related papers
- Towards Zero-Shot Camera Trap Image Categorization [0.0]
This paper describes the search for an alternative approach to the automatic categorization of camera trap images.
We benchmark state-of-the-art classifiers using a single model for all images.
Next, we evaluate methods combining MegaDetector with one or more classifiers and Segment Anything to assess their impact on reducing location-specific overfitting.
Last, we propose and test two approaches using large language and foundation models, such as DINOv2, BioCLIP, BLIP, and ChatGPT, in a zero-shot scenario.
arXiv Detail & Related papers (2024-10-16T17:44:58Z)
- AIM 2024 Sparse Neural Rendering Challenge: Methods and Results [64.19942455360068]
This paper reviews the challenge on Sparse Neural Rendering that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2024.
The challenge aims at producing novel camera view synthesis of diverse scenes from sparse image observations.
Participants are asked to optimise objective fidelity to the ground-truth images as measured via the Peak Signal-to-Noise Ratio (PSNR) metric.
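Since PSNR is the ranking metric, it is worth recalling that it reduces to a single formula over the mean squared error; a minimal sketch, assuming images scaled to [0, 1]:

```python
import numpy as np

def psnr(pred: np.ndarray, gt: np.ndarray, max_val: float = 1.0) -> float:
    """Peak Signal-to-Noise Ratio: 10 * log10(MAX^2 / MSE), in dB."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(max_val**2 / mse))
```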
arXiv Detail & Related papers (2024-09-23T14:17:40Z)
- Few-Shot Anomaly Detection via Category-Agnostic Registration Learning [65.64252994254268]
Most existing anomaly detection methods require a dedicated model for each category.
This article proposes a novel few-shot AD (FSAD) framework.
It is the first FSAD method that requires no model fine-tuning for novel categories.
arXiv Detail & Related papers (2024-06-13T05:01:13Z)
- MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images [12.48347948647802]
We study zero-shot anomaly classification (AC) and segmentation (AS) in industrial vision.
We leverage a discriminative characteristic to design a novel zero-shot AC/AS method by Mutual Scoring (MuSc) of the unlabeled images.
We present an optimization approach named Re-scoring with Constrained Image-level Neighborhood (RsCIN) for image-level anomaly classification.
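A rough sketch of the mutual-scoring idea as summarized above: each unlabeled test image is scored against patch features pooled from the other unlabeled images, so normal patches find close matches while anomalies do not. The feature extraction and aggregation details here are assumptions, not MuSc's exact procedure.

```python
import torch
import torch.nn.functional as F

def mutual_scores(patch_feats: torch.Tensor) -> torch.Tensor:
    """patch_feats: (K, N, D) patch features for K unlabeled test images.
    Scores each image's patches against the patches of all other images."""
    K, N, _ = patch_feats.shape
    feats = F.normalize(patch_feats, dim=-1)
    scores = torch.empty(K, N)
    for i in range(K):
        others = torch.cat([feats[j] for j in range(K) if j != i])  # ((K-1)*N, D)
        sim = feats[i] @ others.t()                                  # (N, (K-1)*N)
        scores[i] = 1.0 - sim.max(dim=1).values  # no close match anywhere -> anomalous
    return scores  # per-patch AS scores; e.g. a max over patches gives an AC score
```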
arXiv Detail & Related papers (2024-01-30T05:16:52Z)
- Zero-Shot Anomaly Detection with Pre-trained Segmentation Models [2.9322869014189985]
This report outlines our submission to the zero-shot track of the Visual Anomaly and Novelty Detection (VAND) 2023 Challenge.
Building on the performance of the WINCLIP framework, we aim to enhance the system's localization capabilities by integrating zero-shot segmentation models.
Our pipeline requires no external data or information, allowing for it to be directly applied to new datasets.
arXiv Detail & Related papers (2023-06-15T16:43:07Z)
- Highly Accurate Dichotomous Image Segmentation [139.79513044546]
A new task called dichotomous image segmentation (DIS) aims to segment highly accurate objects from natural images.
We collect the first large-scale dataset, DIS5K, which contains 5,470 high-resolution (e.g., 2K, 4K or larger) images.
We also introduce a simple intermediate supervision baseline (IS-Net) using both feature-level and mask-level guidance for DIS model training.
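The intermediate supervision described here pairs a mask-level loss with feature-level guidance; a hedged sketch of such a combined objective (loss choices and weighting are illustrative, not IS-Net's exact formulation):

```python
import torch.nn.functional as F

def intermediate_supervision_loss(pred_mask, gt_mask, pred_feats, gt_feats, lam=1.0):
    """Mask-level BCE on the predicted logits plus feature-level MSE between the
    segmentation network's intermediate features and features encoded from the GT mask."""
    mask_loss = F.binary_cross_entropy_with_logits(pred_mask, gt_mask)
    feat_loss = sum(F.mse_loss(p, g) for p, g in zip(pred_feats, gt_feats))
    return mask_loss + lam * feat_loss
```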
arXiv Detail & Related papers (2022-03-06T20:09:19Z)
- Self-supervised Image-specific Prototype Exploration for Weakly Supervised Semantic Segmentation [72.33139350241044]
Weakly Supervised Semantic Segmentation (WSSS) based on image-level labels has attracted much attention due to low annotation costs.
We propose a Self-supervised Image-specific Prototype Exploration (SIPE) that consists of an Image-specific Prototype Exploration (IPE) and a General-Specific Consistency (GSC) loss.
Our SIPE achieves new state-of-the-art performance using only image-level labels.
arXiv Detail & Related papers (2022-03-06T09:01:03Z)
- A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model [61.58071099082296]
It is unclear how to make zero-shot recognition work well on broader vision problems, such as object detection and semantic segmentation.
In this paper, we target zero-shot semantic segmentation, building it on an off-the-shelf pre-trained vision-language model, i.e., CLIP.
Our experimental results show that this simple framework surpasses the previous state of the art by a large margin.
arXiv Detail & Related papers (2021-12-29T18:56:18Z)
- Rail-5k: a Real-World Dataset for Rail Surface Defects Detection [10.387206647221626]
This paper presents the Rail-5k dataset for benchmarking the performance of visual algorithms in a real-world application scenario.
We collected over 5k high-quality images from railways across China and, with help from railway experts, annotated 1,100 images covering the 13 most common types of rail defects.
arXiv Detail & Related papers (2021-06-28T01:53:52Z)
- SCNet: Enhancing Few-Shot Semantic Segmentation by Self-Contrastive Background Prototypes [56.387647750094466]
Few-shot semantic segmentation aims to segment novel-class objects in a query image with only a few annotated examples.
Most advanced solutions exploit a metric learning framework that performs segmentation by matching each pixel to a learned foreground prototype.
This framework suffers from biased classification due to incomplete construction of sample pairs with the foreground prototype only.
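The prototype-matching baseline being critiqued can be sketched in a few lines: every query pixel is labeled by its similarity to a single foreground prototype averaged from masked support features. Shapes and names below are illustrative:

```python
import torch
import torch.nn.functional as F

def prototype_segment(query_feats, support_feats, support_mask, tau: float = 0.1):
    """query_feats, support_feats: (D, H, W); support_mask: (H, W) in {0, 1}.
    Returns an (H, W) foreground probability map from one foreground prototype."""
    fg = (support_feats * support_mask).sum(dim=(1, 2)) / support_mask.sum().clamp(min=1)
    fg = F.normalize(fg, dim=0)                    # (D,) masked-average prototype
    sim = torch.einsum("d,dhw->hw", fg, F.normalize(query_feats, dim=0))
    return torch.sigmoid(sim / tau)  # no background prototype: the bias SCNet targets
```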
arXiv Detail & Related papers (2021-04-19T11:21:47Z)
- Revisiting the Sibling Head in Object Detector [24.784483589579896]
This paper provides the observation that the spatial misalignment between the two object functions in the sibling head can considerably hurt the training process.
The proposed task-aware spatial disentanglement (TSD) decouples classification and regression from the spatial dimension by generating two disentangled proposals for them.
Surprisingly, this simple design can boost all backbones and models on both MS COCO and Google OpenImage consistently by 3% mAP.
arXiv Detail & Related papers (2020-03-17T05:21:54Z)