A Physics-Constrained, Design-Driven Methodology for Defect Dataset Generation in Optical Lithography
- URL: http://arxiv.org/abs/2512.09001v1
- Date: Tue, 09 Dec 2025 06:13:33 GMT
- Title: A Physics-Constrained, Design-Driven Methodology for Defect Dataset Generation in Optical Lithography
- Authors: Yuehua Hu, Jiyeong Kong, Dong-yeol Shin, Jaekyun Kim, Kyung-Tae Kang,
- Abstract summary: This study proposes a novel methodology for generating large-scale, physically valid defect datasets with pixel-level annotations. We constructed a comprehensive dataset of 3,530 optical micrographs containing 13,365 annotated defect instances.
- Score: 1.0610015128259989
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The efficacy of Artificial Intelligence (AI) in micro/nano manufacturing is fundamentally constrained by the scarcity of high-quality, physically grounded training data for defect inspection. Lithography defect data from the semiconductor industry are rarely accessible for research use, resulting in a shortage of publicly available datasets. To address this bottleneck in lithography, this study proposes a novel methodology for generating large-scale, physically valid defect datasets with pixel-level annotations. The framework begins with the ab initio synthesis of defect layouts using controllable, physics-constrained mathematical morphology operations (erosion and dilation) applied to the original design-level layout. These synthesized layouts, together with their defect-free counterparts, are fabricated into physical samples via high-fidelity digital micromirror device (DMD)-based lithography. Optical micrographs of the synthesized defect samples and their defect-free references are then compared to create consistent defect delineation annotations. Using this methodology, we constructed a comprehensive dataset of 3,530 optical micrographs containing 13,365 annotated defect instances across four classes: bridge, burr, pinch, and contamination. Each defect instance is annotated with a pixel-accurate segmentation mask, preserving full contour and geometry. The segmentation-based Mask R-CNN achieves AP@0.5 of 0.980, 0.965, and 0.971, compared with 0.740, 0.719, and 0.717 for Faster R-CNN on the bridge, burr, and pinch classes, representing a mean AP@0.5 improvement of approximately 34%. For the contamination class, Mask R-CNN achieves an AP@0.5 roughly 42% higher than Faster R-CNN. These consistent gains demonstrate that our proposed methodology to generate defect datasets with pixel-level annotations is feasible for robust AI-based Measurement/Inspection (MI) in semiconductor fabrication.
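The abstract does not say how the "approximately 34%" figure is computed. One reading that is consistent with the quoted numbers is the relative gain of the class-averaged AP@0.5 of Mask R-CNN over Faster R-CNN. The short Python sketch below checks that reading using only the values stated in the abstract; the variable names are illustrative, and the averaging choice is an assumption rather than the authors' stated procedure.

```python
# AP@0.5 values quoted in the abstract for the bridge, burr, and pinch classes.
mask_rcnn = {"bridge": 0.980, "burr": 0.965, "pinch": 0.971}
faster_rcnn = {"bridge": 0.740, "burr": 0.719, "pinch": 0.717}


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Class-averaged AP@0.5 for each detector.
mean_mask = mean(mask_rcnn.values())
mean_faster = mean(faster_rcnn.values())

# Assumed interpretation: relative gain of the class-averaged AP@0.5.
relative_gain = (mean_mask - mean_faster) / mean_faster

# Alternative view: per-class relative gains, then their average.
per_class_gain = {
    name: (mask_rcnn[name] - faster_rcnn[name]) / faster_rcnn[name]
    for name in mask_rcnn
}
mean_per_class_gain = mean(per_class_gain.values())

print(f"mean AP@0.5 (Mask R-CNN):   {mean_mask:.3f}")
print(f"mean AP@0.5 (Faster R-CNN): {mean_faster:.3f}")
print(f"relative gain of the means: {relative_gain:.1%}")
print(f"mean of per-class gains:    {mean_per_class_gain:.1%}")
```

Both ways of averaging round to about 34%, so the quoted figure appears to be a relative improvement, not a difference in absolute AP points (which would be about 0.25).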
Related papers
- HistoART: Histopathology Artifact Detection and Reporting Tool [37.31105955164019]
Whole Slide Imaging (WSI) is widely used to digitize tissue specimens for detailed, high-resolution examination.
WSI remains vulnerable to artifacts introduced during slide preparation and scanning.
We propose and compare three robust artifact detection approaches for WSIs.
arXiv Detail & Related papers (2025-06-23T17:22:19Z)
- Evaluating and Predicting Distorted Human Body Parts for Generated Images [44.49888268318722]
We propose ViT-HD, a Vision Transformer-based model tailored for detecting human body distortions in AI-generated images.
We construct the Human Distortion Benchmark with 500 human-centric prompts to evaluate four popular T2I models.
This work pioneers a systematic approach to evaluating anatomical accuracy in AI-generated humans, offering tools to advance the fidelity of T2I models.
arXiv Detail & Related papers (2025-03-02T09:34:44Z)
- MaskTerial: A Foundation Model for Automated 2D Material Flake Detection [48.73213960205105]
We present a deep learning model, called MaskTerial, that uses an instance segmentation network to reliably identify 2D material flakes.
The model is extensively pre-trained using a synthetic data generator that produces realistic microscopy images from unlabeled data.
We demonstrate significant improvements over existing techniques in the detection of low-contrast materials such as hexagonal boron nitride.
arXiv Detail & Related papers (2024-12-12T15:01:39Z)
- Addressing Class Imbalance and Data Limitations in Advanced Node Semiconductor Defect Inspection: A Generative Approach for SEM Images [0.10555513406636088]
We propose a method for generating synthetic semiconductor SEM images using a diffusion model within a limited data regime.
In contrast to images generated through conventional simulation methods, SEM images generated through our proposed DL method closely resemble real SEM images, replicating their noise characteristics and surface roughness adaptively.
arXiv Detail & Related papers (2024-07-14T22:25:05Z)
- Solving Energy-Independent Density for CT Metal Artifact Reduction via Neural Representation [46.57879724994237]
Reconstructing CT images from metal-corrupted measurements becomes a challenging nonlinear inverse problem.
Existing state-of-the-art (SOTA) metal artifact reduction (MAR) algorithms rely on supervised learning with numerous paired CT samples.
In this work, we propose Density neural representation (Diner), a novel unsupervised MAR method.
arXiv Detail & Related papers (2024-05-11T16:30:39Z)
- Generative Model-Driven Synthetic Training Image Generation: An Approach to Cognition in Rail Defect Detection [12.584718477246382]
This study proposes a VAE-based synthetic image generation technique for rail defects.
It is applied to create a synthetic dataset for the Canadian Pacific Railway.
500 synthetic samples are generated with a minimal reconstruction loss of 0.021.
arXiv Detail & Related papers (2023-12-31T04:34:58Z)
- ORTexME: Occlusion-Robust Human Shape and Pose via Temporal Average Texture and Mesh Encoding [35.49066795648395]
In 3D human shape and pose estimation from a monocular video, models trained with limited labeled data cannot generalize well to videos with occlusion.
We introduce ORTexME, an occlusion-robust temporal method that utilizes temporal information from the input video to better regularize the occluded body parts.
Our method achieves a significant improvement on the challenging multi-person 3DPW dataset, with a 1.8 P-MPJPE error reduction.
arXiv Detail & Related papers (2023-09-21T15:50:04Z)
- Automated Semiconductor Defect Inspection in Scanning Electron Microscope Images: a Systematic Review [4.493547775253646]
Machine learning algorithms can be trained to accurately classify and locate defects in semiconductor samples.
Convolutional neural networks have proved to be particularly useful in this regard.
This systematic review provides a comprehensive overview of the state of automated semiconductor defect inspection on SEM images.
arXiv Detail & Related papers (2023-08-16T13:59:43Z)
- Enhanced Sharp-GAN For Histopathology Image Synthesis [63.845552349914186]
Histopathology image synthesis aims to address the data shortage issue in training deep learning approaches for accurate cancer detection.
We propose a novel approach that enhances the quality of synthetic images by using nuclei topology and contour regularization.
The proposed approach outperforms Sharp-GAN in all four image quality metrics on two datasets.
arXiv Detail & Related papers (2023-01-24T17:54:01Z)
- An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design [55.62660894625669]
Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable.
Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks.
We propose a litho-aware data augmentation framework to resolve the dilemma of limited data and improve the machine learning model performance.
arXiv Detail & Related papers (2022-10-27T20:53:39Z)
- N-pad: Neighboring Pixel-based Industrial Anomaly Detection [0.0]
We present N-pad, a novel method for anomaly detection and segmentation in a one-class learning setting.
We have achieved state-of-the-art performance in MVTec-AD with AUROC of 99.37 for anomaly detection and 98.75 for anomaly segmentation.
arXiv Detail & Related papers (2022-10-17T06:22:16Z)
- Negligible effect of brain MRI data preprocessing for tumor segmentation [36.89606202543839]
We conduct experiments on three publicly available datasets and evaluate the effect of different preprocessing steps in deep neural networks.
Our results demonstrate that most popular standardization steps add no value to the network performance.
We suggest that image intensity normalization approaches do not contribute to model accuracy because of the reduction of signal variance with image standardization.
arXiv Detail & Related papers (2022-04-11T17:29:36Z)
- Performance, Successes and Limitations of Deep Learning Semantic Segmentation of Multiple Defects in Transmission Electron Micrographs [9.237363938772479]
We perform semantic segmentation of defect types in electron microscopy images of irradiated FeCrAl alloys using a deep learning Mask Regional Convolutional Neural Network (Mask R-CNN) model.
We conduct an in-depth analysis of key model performance statistics, with a focus on quantities such as predicted distributions of defect shapes, defect sizes, and defect areal densities.
Overall, we find that the current model is a fast, effective tool for automatically characterizing and quantifying multiple defect types in microscopy images.
arXiv Detail & Related papers (2021-10-15T17:57:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.