A Benchmark Time Series Dataset for Semiconductor Fabrication Manufacturing Constructed using Component-based Discrete-Event Simulation Models
- URL: http://arxiv.org/abs/2408.09307v1
- Date: Sat, 17 Aug 2024 23:05:47 GMT
- Title: A Benchmark Time Series Dataset for Semiconductor Fabrication Manufacturing Constructed using Component-based Discrete-Event Simulation Models
- Authors: Vamsi Krishna Pendyala, Hessam S. Sarjoughian, Bala Potineni, Edward J. Yellig
- Abstract summary: This research is based on a benchmark model of an Intel semiconductor fabrication factory.
The time series dataset is constructed using discrete-event time trajectories.
The dataset can also be utilized in the machine learning community for behavioral analysis.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Advancements in high-computing devices increase the necessity for improved and new understanding and development of smart manufacturing factories. Discrete-event models with simulators have been shown to be critical to architecting, designing, building, and operating the manufacturing of semiconductor chips. The diffusion, implantation, and lithography machines have intricate processes due to their feedforward and feedback connectivity. The dataset collected from simulations of the factory models holds the promise of generating valuable machine-learning models. As surrogate data-based models, their executions are highly efficient compared to the physics-based counterpart models. For the development of surrogate models, it is beneficial to have publicly available benchmark simulation models that are grounded in factory models that have concise structures and accurate behaviors. Hence, in this research, a dataset is devised and constructed based on a benchmark model of an Intel semiconductor fabrication factory. The model is formalized using the Parallel Discrete-Event System Specification and executed using the DEVS-Suite simulator. The time series dataset is constructed using discrete-event time trajectories. This dataset is further analyzed and used to develop baseline univariate and multivariate machine learning models. The dataset can also be utilized in the machine learning community for behavioral analysis based on formalized and scalable component-based discrete-event models and simulations.
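The abstract states that the time series dataset is constructed from discrete-event time trajectories. As a rough illustration of that idea only, the sketch below samples a piecewise-constant event trajectory on a regular time grid (a zero-order hold). This is not the authors' procedure and does not reflect the DEVS-Suite output format; the function name `resample_events`, the `(time, value)` event layout, and the queue-length example data are all hypothetical.

```python
from bisect import bisect_right


def resample_events(events, step, horizon):
    """Sample a piecewise-constant event trajectory on a regular grid.

    events:  list of (time, value) pairs sorted by time; each value is
             assumed to hold until the next event (zero-order hold).
    step:    spacing of the regular time grid.
    horizon: last time point to sample (inclusive when it lies on the grid).
    """
    times = [t for t, _ in events]
    values = [v for _, v in events]
    series = []
    n_steps = int(horizon // step) + 1
    for k in range(n_steps):
        t = k * step
        # Index of the latest event at or before t; -1 if none has occurred.
        i = bisect_right(times, t) - 1
        series.append((t, values[i] if i >= 0 else None))
    return series


# Hypothetical queue-length trajectory of a single machine.
events = [(0.0, 0), (1.5, 1), (2.2, 2), (4.0, 1)]
series = resample_events(events, step=1.0, horizon=5.0)
```

Grid points that fall before the first event are reported as `None` rather than guessed, so a downstream model can decide how to treat the warm-up period.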
Related papers
- DoMINO: A Decomposable Multi-scale Iterative Neural Operator for Modeling Large Scale Engineering Simulations [2.300471499347615]
DoMINO is a point cloud-based machine learning model that uses local geometric information to predict flow fields on discrete points.
DoMINO is validated for the automotive aerodynamics use case using the DrivAerML dataset.
arXiv Detail & Related papers (2025-01-23T03:28:10Z)
- GauSim: Registering Elastic Objects into Digital World by Gaussian Simulator [55.02281855589641]
GauSim is a novel neural network-based simulator designed to capture the dynamic behaviors of real-world elastic objects represented through Gaussian kernels.
We leverage continuum mechanics, modeling each kernel as a continuous piece of matter to account for realistic deformations without idealized assumptions.
GauSim incorporates explicit physics constraints, such as mass and momentum conservation, ensuring interpretable results and robust, physically plausible simulations.
arXiv Detail & Related papers (2024-12-23T18:58:17Z)
- Generative Modeling and Data Augmentation for Power System Production Simulation [0.0]
This paper proposes a generative model-assisted approach for load forecasting under small sample scenarios.
The expanded dataset significantly reduces forecasting errors compared to the original dataset.
The diffusion model outperforms the generative adversarial model by achieving about 200 times smaller errors.
arXiv Detail & Related papers (2024-12-10T12:38:47Z)
- Semantic Capability Model for the Simulation of Manufacturing Processes [38.69817856379812]
Simulations offer opportunities in the examination of manufacturing processes.
A combination of different simulations is necessary when the outputs of one simulation serve as the input parameters for another, resulting in a sequence of simulations.
An information model is introduced, which represents simulations, their capabilities to generate certain knowledge, and their respective quality criteria.
arXiv Detail & Related papers (2024-08-15T09:28:08Z)
- Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a new sandbox suite tailored for integrated data-model co-development.
This sandbox provides a feedback-driven experimental platform, enabling cost-effective and guided refinement of both data and models.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Device Modeling Bias in ReRAM-based Neural Network Simulations [1.5490932775843136]
Data-driven modeling approaches such as jump tables are promising to model memory devices for neural network simulations.
We study how various jump table device models impact the attained network performance estimates.
Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably.
arXiv Detail & Related papers (2022-11-29T04:45:06Z)
- Continual learning autoencoder training for a particle-in-cell simulation via streaming [52.77024349608834]
The upcoming exascale era will provide a new generation of physics simulations with high resolution.
These simulations will have a high resolution, which will impact the training of machine learning models since storing a large amount of simulation data on disk is nearly impossible.
This work presents an approach that trains a neural network concurrently with a running simulation, without storing data on disk.
arXiv Detail & Related papers (2022-11-09T09:55:14Z)
- VAE-LIME: Deep Generative Model Based Approach for Local Data-Driven Model Interpretability Applied to the Ironmaking Industry [70.10343492784465]
It is necessary to expose to the process engineer, not solely the model predictions, but also their interpretability.
Model-agnostic local interpretability solutions based on LIME have recently emerged to improve the original method.
We present in this paper a novel approach, VAE-LIME, for local interpretability of data-driven models forecasting the temperature of the hot metal produced by a blast furnace.
arXiv Detail & Related papers (2020-07-15T07:07:07Z)
- Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine learning inspired models and physics-based models.
We are using such models for real-time diagnosis applications.
arXiv Detail & Related papers (2020-03-04T00:44:57Z)
- Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)