WaferSegClassNet -- A Light-weight Network for Classification and
Segmentation of Semiconductor Wafer Defects
- URL: http://arxiv.org/abs/2207.00960v1
- Date: Sun, 3 Jul 2022 05:46:19 GMT
- Title: WaferSegClassNet -- A Light-weight Network for Classification and
Segmentation of Semiconductor Wafer Defects
- Authors: Subhrajit Nag, Dhruv Makwana, Sai Chandra Teja R, Sparsh Mittal, C
Krishna Mohan
- Abstract summary: We present WaferSegClassNet (WSCN), a novel network based on encoder-decoder architecture.
WSCN performs simultaneous classification and segmentation of both single and mixed-type wafer defects.
We are the first to show segmentation results on the MixedWM38 dataset.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As the integration density and design intricacy of semiconductor wafers
increase, the magnitude and complexity of defects in them are also on the rise.
Since the manual inspection of wafer defects is costly, an automated artificial
intelligence (AI) based computer-vision approach is highly desired. The
previous works on defect analysis have several limitations, such as low
accuracy and the need for separate models for classification and segmentation.
For analyzing mixed-type defects, some previous works require separately
training one model for each defect type, which is non-scalable. In this paper,
we present WaferSegClassNet (WSCN), a novel network based on encoder-decoder
architecture. WSCN performs simultaneous classification and segmentation of
both single and mixed-type wafer defects. WSCN uses a "shared encoder" for
classification and segmentation, which allows training WSCN end-to-end. We use
N-pair contrastive loss to first pretrain the encoder, and then use BCE-Dice
loss for segmentation and categorical cross-entropy loss for classification.
The N-pair contrastive loss helps in learning better embedding representations
in the latent space of wafer maps. WSCN has a model size of only 0.51 MB and
performs only 0.2M FLOPs. Thus, it is much lighter than other state-of-the-art
models. Also, it requires only 150 epochs for convergence, compared to 4,000
epochs needed by a previous work. We evaluate our model on the MixedWM38
dataset, which has 38,015 images. WSCN achieves an average classification
accuracy of 98.2% and a dice coefficient of 0.9999. We are the first to show
segmentation results on the MixedWM38 dataset. The source code can be obtained
from https://github.com/ckmvigil/WaferSegClassNet.
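The training recipe in the abstract combines a BCE-Dice segmentation loss with a categorical cross-entropy classification loss on the shared encoder's two heads. The following is an illustrative reconstruction of those loss terms in plain Python over flattened masks and probability vectors, not the authors' implementation; the actual loss weighting between the heads is not stated in the abstract.

```python
import math

def dice_coefficient(pred, target, eps=1e-7):
    """Dice coefficient between a predicted soft mask and a binary mask,
    both given as flat lists of per-pixel values."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

def bce_dice_loss(pred, target, eps=1e-7):
    """Segmentation loss: binary cross-entropy plus (1 - Dice)."""
    bce = -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
               for p, t in zip(pred, target)) / len(pred)
    return bce + (1.0 - dice_coefficient(pred, target, eps))

def categorical_cross_entropy(probs, onehot, eps=1e-7):
    """Classification loss over softmax probabilities and a one-hot label."""
    return -sum(y * math.log(p + eps) for p, y in zip(probs, onehot))
```

A perfect prediction drives both the BCE-Dice term and the cross-entropy term toward zero, which is why a dice coefficient of 0.9999 corresponds to near-perfect mask overlap.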
Related papers
- Representation Similarity: A Better Guidance of DNN Layer Sharing for Edge Computing without Training [3.792729116385123]
We propose a new model merging scheme by sharing representations at the edge, guided by representation similarity S.
We show that S correlates strongly with the merged model's accuracy, with a Pearson correlation coefficient |r| > 0.94, higher than for other metrics.
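The Pearson correlation coefficient used to relate S to merged-model accuracy is the standard formula; a minimal sketch (generic statistics, not the paper's code):

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

|r| close to 1 means the similarity metric ranks candidate merges almost exactly as their final accuracy would.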
arXiv Detail & Related papers (2024-10-15T03:35:54Z)
- SEMI-CenterNet: A Machine Learning Facilitated Approach for Semiconductor Defect Inspection [0.10555513406636088]
We have proposed SEMI-CenterNet (SEMI-CN), a customized CN architecture trained on SEM images of semiconductor wafer defects.
SEMI-CN is trained to output the center, class, size, and offset of a defect instance.
We train SEMI-CN on two datasets and benchmark two ResNet backbones for the framework.
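A CenterNet-style head predicts a heatmap peak per defect plus regressed size and sub-cell offset; a bounding box is then decoded from those three outputs. This is a generic sketch of that decoding step, not SEMI-CN's actual code:

```python
def decode_box(peak_xy, offset_xy, size_wh):
    """Turn a heatmap peak (integer cell), a sub-cell offset, and a
    regressed width/height into an (x1, y1, x2, y2) box."""
    cx = peak_xy[0] + offset_xy[0]
    cy = peak_xy[1] + offset_xy[1]
    w, h = size_wh
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```

The offset head exists because the heatmap is coarser than the input image, so the peak cell alone would quantize the center position.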
arXiv Detail & Related papers (2023-08-14T14:39:06Z)
- Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive method for fusing the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers, of the two models.
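Parameter averaging itself is simple; a minimal sketch with each model's layers stored as flat lists (a simplification of real state dicts):

```python
def average_models(models, weights=None):
    """Element-wise (optionally weighted) average of identically shaped models.
    Each model is a dict mapping layer name -> flat list of parameters."""
    n = len(models)
    weights = weights or [1.0 / n] * n
    return {name: [sum(w * m[name][i] for w, m in zip(weights, models))
                   for i in range(len(models[0][name]))]
            for name in models[0]}
```

Layer-wise analyses restrict this averaging to a chosen subset of layer names while keeping the remaining layers from one parent model.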
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
- Prompt Tuning for Parameter-efficient Medical Image Segmentation [79.09285179181225]
We propose and investigate several contributions to achieve a parameter-efficient but effective adaptation for semantic segmentation on two medical imaging datasets.
We pre-train this architecture with a dedicated dense self-supervision scheme based on assignments to online generated prototypes.
We demonstrate that the resulting neural network model is able to attenuate the gap between fully fine-tuned and parameter-efficiently adapted models.
arXiv Detail & Related papers (2022-11-16T21:55:05Z)
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
The experimental results on texture and histopathologic image datasets have shown that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence when compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- An Efficient End-to-End Deep Neural Network for Interstitial Lung Disease Recognition and Classification [0.5424799109837065]
This paper introduces an end-to-end deep convolutional neural network (CNN) for classifying ILD patterns.
The proposed model comprises four convolutional layers with different kernel sizes and the Rectified Linear Unit (ReLU) activation function.
A dataset consisting of 21,328 image patches from 128 CT scans with five classes is used to train and assess the proposed model.
arXiv Detail & Related papers (2022-04-21T06:36:10Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and fixed during training.
Our experimental results show that our method is able to achieve similar performances on image classification for balanced datasets.
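A simplex equiangular tight frame over K classes can be built as sqrt(K/(K-1)) (I - 11^T/K): its rows have unit norm and pairwise inner product -1/(K-1). A sketch of that construction (a full implementation would also map these rows into the network's feature dimension):

```python
import math

def simplex_etf(k):
    """Rows of a K x K simplex equiangular tight frame:
    unit norm, pairwise inner product -1/(K-1)."""
    scale = math.sqrt(k / (k - 1))
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]
```

Fixing the classifier to this maximally separated geometry removes its parameters from training while preserving equal angular margins between classes.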
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
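The roofline baseline used for comparison bounds a layer's execution time by whichever of compute or memory traffic dominates; a one-line sketch of that bound (illustrative, not ANNETTE's stacked model):

```python
def roofline_time(flops, bytes_moved, peak_flops, peak_bandwidth):
    """Roofline lower bound on execution time: a layer is either
    compute-bound or memory-bound, whichever takes longer."""
    return max(flops / peak_flops, bytes_moved / peak_bandwidth)
```

Refined roofline variants replace the peak values with empirically achievable ones, which is one reason learned stacked models can estimate real hardware more faithfully.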
arXiv Detail & Related papers (2021-05-07T11:39:05Z)
- SCA-Net: A Self-Correcting Two-Layer Autoencoder for Hyper-spectral Unmixing [3.918940900258555]
We show that a two-layer autoencoder (SCA-Net) achieves error metrics that are orders of magnitude ($10^{-5}$) apart from previously reported values.
We also show that SCA-Net, based upon a bi-orthogonal representation, performs a self-correction when the number of endmembers is over-specified.
arXiv Detail & Related papers (2021-02-10T19:37:52Z)
- A CNN-LSTM Quantifier for Single Access Point CSI Indoor Localization [9.601632184687787]
This paper proposes a combined network structure between a convolutional neural network (CNN) and a long short-term memory (LSTM) quantifier for WiFi fingerprinting indoor localization.
Using only a single WiFi router, our structure achieves an average localization error of 2.5 m, with 80% of the errors under 4 m.
arXiv Detail & Related papers (2020-05-13T16:54:31Z)
- Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [51.17232267143098]
We propose a novel system named Disp R-CNN for 3D object detection from stereo images.
We use a statistical shape model to generate dense disparity pseudo-ground-truth without the need for LiDAR point clouds.
Experiments on the KITTI dataset show that, even when LiDAR ground-truth is not available at training time, Disp R-CNN achieves competitive performance and outperforms previous state-of-the-art methods by 20% in terms of average precision.
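Disparity pseudo-ground-truth relates to depth through the standard rectified-stereo relation Z = f * B / d; a minimal sketch of that conversion (generic stereo geometry, not Disp R-CNN's pipeline):

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in meters from stereo disparity (pixels), focal length
    (pixels), and camera baseline (meters): Z = f * B / d."""
    return focal_px * baseline_m / disparity_px
```

This inverse relation is why small disparity errors on distant objects translate into large depth errors, making dense disparity supervision valuable.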
arXiv Detail & Related papers (2020-04-07T17:48:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.