Addressing Class Imbalance and Data Limitations in Advanced Node Semiconductor Defect Inspection: A Generative Approach for SEM Images
- URL: http://arxiv.org/abs/2407.10348v1
- Date: Sun, 14 Jul 2024 22:25:05 GMT
- Title: Addressing Class Imbalance and Data Limitations in Advanced Node Semiconductor Defect Inspection: A Generative Approach for SEM Images
- Authors: Bappaditya Dey, Vic De Ridder, Victor Blanco, Sandip Halder, Bartel Van Waeyenberge,
- Abstract summary: We propose a method for generating synthetic semiconductor SEM images using a diffusion model within a limited data regime.
In contrast to images generated through conventional simulation methods, SEM images generated through our proposed DL method closely resemble real SEM images, replicating their noise characteristics and surface roughness adaptively.
- Score: 0.10555513406636088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precision in identifying nanometer-scale device-killer defects is crucial in both semiconductor research and development as well as in production processes. The effectiveness of existing ML-based approaches in this context is largely limited by the scarcity of data, as the production of real semiconductor wafer data for training these models involves high financial and time costs. Moreover, the existing simulation methods fall short of replicating images with identical noise characteristics, surface roughness and stochastic variations at advanced nodes. We propose a method for generating synthetic semiconductor SEM images using a diffusion model within a limited data regime. In contrast to images generated through conventional simulation methods, SEM images generated through our proposed DL method closely resemble real SEM images, replicating their noise characteristics and surface roughness adaptively. Our main contributions, which are validated on three different real semiconductor datasets, are: i) proposing a patch-based generative framework utilizing DDPM to create SEM images with intended defect classes, addressing challenges related to class-imbalance and data insufficiency, ii) demonstrating generated synthetic images closely resemble real SEM images acquired from the tool, preserving all imaging conditions and metrology characteristics without any metadata supervision, iii) demonstrating a defect detector trained on generated defect dataset, either independently or combined with a limited real dataset, can achieve similar or improved performance on real wafer SEM images during validation/testing compared to exclusive training on a real defect dataset, iv) demonstrating the ability of the proposed approach to transfer defect types, critical dimensions, and imaging conditions from one specified CD/Pitch and metrology specifications to another, thereby highlighting its versatility.
Related papers
- Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes [2.8498944632323755]
We propose an end-to-end hybrid architecture for medical image segmentation.
We use Hamiltonian Variational Autoencoders (HVAE) and a discriminative regularization to improve the quality of generated images.
Our architecture operates on a slice-by-slice basis to segment 3D volumes, capitilizing on the richly augmented dataset.
arXiv Detail & Related papers (2024-06-17T15:42:08Z) - SeNM-VAE: Semi-Supervised Noise Modeling with Hierarchical Variational Autoencoder [13.453138169497903]
SeNM-VAE is a semi-supervised noise modeling method that leverages both paired and unpaired datasets to generate realistic degraded data.
We employ our method to generate paired training samples for real-world image denoising and super-resolution tasks.
Our approach excels in the quality of synthetic degraded images compared to other unpaired and paired noise modeling methods.
arXiv Detail & Related papers (2024-03-26T09:03:40Z) - Mitigating Data Consistency Induced Discrepancy in Cascaded Diffusion Models for Sparse-view CT Reconstruction [4.227116189483428]
This study introduces a novel Cascaded Diffusion with Discrepancy Mitigation framework.
It includes the low-quality image generation in latent space and the high-quality image generation in pixel space.
It minimizes computational costs by moving some inference steps from pixel space to latent space.
arXiv Detail & Related papers (2024-03-14T12:58:28Z) - Image Inpainting via Tractable Steering of Diffusion Models [54.13818673257381]
This paper proposes to exploit the ability of Tractable Probabilistic Models (TPMs) to exactly and efficiently compute the constrained posterior.
Specifically, this paper adopts a class of expressive TPMs termed Probabilistic Circuits (PCs)
We show that our approach can consistently improve the overall quality and semantic coherence of inpainted images with only 10% additional computational overhead.
arXiv Detail & Related papers (2023-11-28T21:14:02Z) - On quantifying and improving realism of images generated with diffusion [50.37578424163951]
We propose a metric, called Image Realism Score (IRS), computed from five statistical measures of a given image.
IRS is easily usable as a measure to classify a given image as real or fake.
We experimentally establish the model- and data-agnostic nature of the proposed IRS by successfully detecting fake images generated by Stable Diffusion Model (SDM), Dalle2, Midjourney and BigGAN.
Our efforts have also led to Gen-100 dataset, which provides 1,000 samples for 100 classes generated by four high-quality models.
arXiv Detail & Related papers (2023-09-26T08:32:55Z) - Gradpaint: Gradient-Guided Inpainting with Diffusion Models [71.47496445507862]
Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved remarkable results in conditional and unconditional image generation.
We present GradPaint, which steers the generation towards a globally coherent image.
We generalizes well to diffusion models trained on various datasets, improving upon current state-of-the-art supervised and unsupervised methods.
arXiv Detail & Related papers (2023-09-18T09:36:24Z) - ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic
Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment.
The scarcity of annotated data limits the effectiveness and generalization of existing methods.
We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z) - Data-iterative Optimization Score Model for Stable Ultra-Sparse-View CT
Reconstruction [2.2336243882030025]
We propose an iterative optimization data scoring model (DOSM) for sparse-view CT reconstruction.
DOSM integrates data consistency into its data consistency element, effectively balancing measurement data and generative model constraints.
We leverage conventional techniques to optimize DOSM updates.
arXiv Detail & Related papers (2023-08-28T09:23:18Z) - Automated Semiconductor Defect Inspection in Scanning Electron
Microscope Images: a Systematic Review [4.493547775253646]
Machine learning algorithms can be trained to accurately classify and locate defects in semiconductor samples.
Convolutional neural networks have proved to be particularly useful in this regard.
This systematic review provides a comprehensive overview of the state of automated semiconductor defect inspection on SEM images.
arXiv Detail & Related papers (2023-08-16T13:59:43Z) - Realistic Noise Synthesis with Diffusion Models [68.48859665320828]
Deep image denoising models often rely on large amount of training data for the high quality performance.
We propose a novel method that synthesizes realistic noise using diffusion models, namely Realistic Noise Synthesize Diffusor (RNSD)
RNSD can incorporate guided multiscale content, such as more realistic noise with spatial correlations can be generated at multiple frequencies.
arXiv Detail & Related papers (2023-05-23T12:56:01Z) - Data-driven generation of plausible tissue geometries for realistic
photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate"
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.