Machine Learning based CVD Virtual Metrology in Mass Produced
Semiconductor Process
- URL: http://arxiv.org/abs/2107.05071v1
- Date: Sun, 11 Jul 2021 15:32:31 GMT
- Title: Machine Learning based CVD Virtual Metrology in Mass Produced
Semiconductor Process
- Authors: Yunsong Xie, Ryan Stearrett
- Abstract summary: A cross-benchmark was carried out on three critical aspects: data imputing, feature selection, and regression algorithms.
The results reveal that linear feature selection and regression algorithms would extensively under-fit the VM data.
Data imputing is also necessary to achieve higher prediction accuracy, as data availability is only ~70% when optimal accuracy is obtained.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A cross-benchmark was carried out on three critical aspects (data
imputing, feature selection, and regression algorithms) for machine learning
based chemical vapor deposition (CVD) virtual metrology (VM). The results
reveal that linear feature selection and regression algorithms would
extensively under-fit the VM data. Data imputing is also necessary to achieve
higher prediction accuracy, as data availability is only ~70% when optimal
accuracy is obtained. This work suggests that a nonlinear feature selection and
regression algorithm combined with a nearest data imputing algorithm would
provide a prediction accuracy as high as 0.7. This would reduce CVD processing
variation by 70%, which is expected to reduce the frequency of physical
metrology and to yield more reliable mass-produced wafers with improved
quality.
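The nearest data imputing step and the accuracy figure quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `nearest_impute` and `r_squared`, the distance measure (mean squared difference over features that both rows report), and the example sensor values are all assumptions made for illustration. The sketch also assumes that the reported accuracy of 0.7 is a coefficient of determination (R²), which the abstract does not state explicitly. Under that reading, residual variance equals (1 - R²) times the original variance, so R² = 0.7 matches the 70% variation reduction the abstract mentions.

```python
import math


def nearest_impute(rows):
    """Fill None entries using the value from the nearest row that has it.

    Assumption: "nearest" means the smallest mean squared difference over
    the features that both rows actually report.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            best_dist, best_val = math.inf, None
            for k, other in enumerate(rows):
                # A donor row must be a different row and must report feature j.
                if k == i or other[j] is None:
                    continue
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = sum(shared) / len(shared)
                if dist < best_dist:
                    best_dist, best_val = dist, other[j]
            filled[i][j] = best_val  # stays None when no donor row exists
    return filled


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical sensor table: the second wafer is missing its second reading.
sensors = [[1.0, 10.0], [1.1, None], [5.0, 50.0]]
print(nearest_impute(sensors))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

In this toy table the second row is closest to the first on the shared feature, so it borrows that row's second reading. Predicting the mean for every sample gives an R² of zero, the baseline against which a score such as 0.7 would be read.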
Related papers
- Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference [55.150117654242706]
We show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU.
As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty.
arXiv Detail & Related papers (2024-11-01T21:11:48Z) - An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - A machine learning approach to the prediction of heat-transfer
coefficients in micro-channels [4.724825031148412]
The accurate prediction of the two-phase heat transfer coefficient (HTC) is key to the optimal design and operation of compact heat exchangers.
We use a multi-output Gaussian process regression (GPR) to estimate the HTC in microchannels as a function of the mass flow rate, heat flux, system pressure and channel diameter and length.
arXiv Detail & Related papers (2023-05-28T15:48:01Z) - Post-training Model Quantization Using GANs for Synthetic Data
Generation [57.40733249681334]
We investigate the use of synthetic data as a substitute for the calibration with real data for the quantization method.
We compare the performance of models quantized using data generated by StyleGAN2-ADA and our pre-trained DiStyleGAN, with quantization using real data and an alternative data generation method based on fractal images.
arXiv Detail & Related papers (2023-05-10T11:10:09Z) - Classification and Self-Supervised Regression of Arrhythmic ECG Signals
Using Convolutional Neural Networks [13.025714736073489]
We propose a deep neural network model capable of solving regression and classification tasks.
We tested the model on the MIT-BIH Arrhythmia database.
arXiv Detail & Related papers (2022-10-25T18:11:13Z) - Survival Prediction of Children Undergoing Hematopoietic Stem Cell
Transplantation Using Different Machine Learning Classifiers by Performing
Chi-squared Test and Hyper-parameter Optimization: A Retrospective Analysis [4.067706269490143]
An efficient survival classification model is presented in a comprehensive manner.
A synthetic dataset is generated by imputing the missing values, transforming the data using dummy variable encoding, and compressing the dataset from 59 features to the 11 most correlated features using Chi-squared feature selection.
Several supervised ML methods were trained in this regard, like Decision Tree, Random Forest, Logistic Regression, K-Nearest Neighbors, Gradient Boosting, Ada Boost, and XG Boost.
arXiv Detail & Related papers (2022-01-22T08:01:22Z) - StreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of KRR require that all the data is stored in the main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Flow based features and validation metric for machine learning
reconstruction of PIV data [0.0]
Reconstruction of the flow field from real data by a physics-oriented approach is a current challenge for fluid scientists in the AI community.
The present article applies a machine learning approach to study the contribution of different flow-based features.
A metric is proposed that reflects the mass conservation law as an important requirement for a physical flow reproduction.
arXiv Detail & Related papers (2021-05-27T20:05:41Z) - Predicting Training Time Without Training [120.92623395389255]
We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.
We leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model.
We are able to predict the time it takes to fine-tune a model to a given loss without having to perform any training.
arXiv Detail & Related papers (2020-08-28T04:29:54Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)