A Benchmark Time Series Dataset for Semiconductor Fabrication Manufacturing Constructed using Component-based Discrete-Event Simulation Models
- URL: http://arxiv.org/abs/2408.09307v1
- Date: Sat, 17 Aug 2024 23:05:47 GMT
- Title: A Benchmark Time Series Dataset for Semiconductor Fabrication Manufacturing Constructed using Component-based Discrete-Event Simulation Models
- Authors: Vamsi Krishna Pendyala, Hessam S. Sarjoughian, Bala Potineni, Edward J. Yellig
- Abstract summary: This research is based on a benchmark model of an Intel semiconductor fabrication factory.
The time series dataset is constructed using discrete-event time trajectories.
The dataset can also be utilized in the machine learning community for behavioral analysis.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Advancements in high-performance computing devices increase the need for a new and improved understanding and development of smart manufacturing factories. Discrete-event models with simulators have been shown to be critical to architecting, designing, building, and operating the manufacturing of semiconductor chips. The diffusion, implantation, and lithography machines have intricate processes due to their feedforward and feedback connectivity. The dataset collected from simulations of the factory models holds the promise of generating valuable machine-learning models. As surrogate data-based models, their executions are highly efficient compared to their physics-based counterparts. For the development of surrogate models, it is beneficial to have publicly available benchmark simulation models that are grounded in factory models with concise structures and accurate behaviors. Hence, in this research, a dataset is devised and constructed based on a benchmark model of an Intel semiconductor fabrication factory. The model is formalized using the Parallel Discrete-Event System Specification and executed using the DEVS-Suite simulator. The time series dataset is constructed using discrete-event time trajectories. This dataset is further analyzed and used to develop baseline univariate and multivariate machine learning models. The dataset can also be utilized in the machine learning community for behavioral analysis based on formalized and scalable component-based discrete-event models and simulations.
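The abstract states that the time series is built from discrete-event time trajectories but does not say how. One common way to do this, shown here as a minimal sketch and not as the authors' procedure, is to treat each trajectory as piecewise constant and sample it on a fixed time grid (zero-order hold). The function name `resample_zero_order_hold`, the example trajectory, and the step size are all hypothetical.

```python
from bisect import bisect_right


def resample_zero_order_hold(events, step, t_end):
    """Sample a piecewise-constant event trajectory on a regular grid.

    events: list of (time, value) pairs sorted by time (assumed format).
    Returns a list of (t, value) for t = 0, step, 2*step, ..., up to t_end.
    The value at t is the value set by the latest event with time <= t,
    or None if no event has occurred yet.
    """
    times = [t for t, _ in events]
    samples = []
    n_steps = int(t_end // step)
    for k in range(n_steps + 1):
        t = k * step
        # Index of the last event whose time is <= t.
        i = bisect_right(times, t) - 1
        value = events[i][1] if i >= 0 else None
        samples.append((t, value))
    return samples


# Hypothetical trajectory: a machine's queue length changing at event times.
events = [(0.0, 0), (1.5, 1), (2.0, 2), (4.25, 1)]
series = resample_zero_order_hold(events, step=1.0, t_end=5.0)
```

Holding the last value keeps the state semantics of a discrete-event model, where nothing changes between events. Note that a coarse step can skip short-lived states, as happens here with the value set at time 1.5.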
Related papers
- Semantic Capability Model for the Simulation of Manufacturing Processes [38.69817856379812]
Simulations offer opportunities in the examination of manufacturing processes.
A combination of different simulations is necessary when the outputs of one simulation serve as the input parameters for another, resulting in a sequence of simulations.
An information model is introduced, which represents simulations, their capabilities to generate certain knowledge, and their respective quality criteria.
arXiv Detail & Related papers (2024-08-15T09:28:08Z)
- Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development [67.55944651679864]
We present a novel sandbox suite tailored for integrated data-model co-development.
This sandbox provides a comprehensive experimental platform, enabling rapid iteration and insight-driven refinement of both data and models.
We also uncover fruitful insights gleaned from exhaustive benchmarks, shedding light on the critical interplay between data quality, diversity, and model behavior.
arXiv Detail & Related papers (2024-07-16T14:40:07Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Visual Deformation Detection Using Soft Material Simulation for Pre-training of Condition Assessment Models [3.0477617036157136]
This paper proposes using Blender, an open-source simulation tool, to create synthetic datasets for machine learning (ML) models.
The process involves translating expert information into shape key parameters to simulate deformations, generating images for both deformed and non-deformed objects.
arXiv Detail & Related papers (2024-04-02T01:58:53Z)
- Device Modeling Bias in ReRAM-based Neural Network Simulations [1.5490932775843136]
Data-driven modeling approaches such as jump tables are promising to model memory devices for neural network simulations.
We study how various jump table device models impact the attained network performance estimates.
Results on a multi-layer perceptron trained on MNIST show that device models based on binning can behave unpredictably.
arXiv Detail & Related papers (2022-11-29T04:45:06Z)
- Continual learning autoencoder training for a particle-in-cell simulation via streaming [52.77024349608834]
The upcoming exascale era will provide a new generation of physics simulations with high resolution.
This high resolution will affect the training of machine learning models, since storing such a large amount of simulation data on disk is nearly impossible.
This work presents an approach that trains a neural network concurrently to a running simulation without data on a disk.
arXiv Detail & Related papers (2022-11-09T09:55:14Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Discovering Generative Models from Event Logs: Data-driven Simulation vs Deep Learning [0.6338178373376447]
A generative model is a statistical model that is able to generate new data instances from previously observed ones.
This paper empirically compares a data-driven simulation technique with multiple deep learning techniques, which construct models capable of generating execution traces with timestamped events.
arXiv Detail & Related papers (2020-09-08T08:04:06Z)
- VAE-LIME: Deep Generative Model Based Approach for Local Data-Driven Model Interpretability Applied to the Ironmaking Industry [70.10343492784465]
It is necessary to expose to the process engineer, not solely the model predictions, but also their interpretability.
Model-agnostic local interpretability solutions based on LIME have recently emerged to improve the original method.
We present in this paper a novel approach, VAE-LIME, for local interpretability of data-driven models forecasting the temperature of the hot metal produced by a blast furnace.
arXiv Detail & Related papers (2020-07-15T07:07:07Z)
- Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine learning inspired models and physics-based models.
We are using such models for real-time diagnosis applications.
arXiv Detail & Related papers (2020-03-04T00:44:57Z)
- Predicting Multidimensional Data via Tensor Learning [0.0]
We develop a model that retains the intrinsic multidimensional structure of the dataset.
To estimate the model parameters, an Alternating Least Squares algorithm is developed.
The proposed model is able to outperform benchmark models present in the forecasting literature.
arXiv Detail & Related papers (2020-02-11T11:57:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.