Maximizing information from chemical engineering data sets: Applications
to machine learning
- URL: http://arxiv.org/abs/2201.10035v1
- Date: Tue, 25 Jan 2022 01:25:45 GMT
- Title: Maximizing information from chemical engineering data sets: Applications
to machine learning
- Authors: Alexander Thebelt, Johannes Wiebe, Jan Kronqvist, Calvin Tsay, Ruth
Misener
- Abstract summary: We identify four characteristics of data arising in chemical engineering applications that make applying classical artificial intelligence approaches difficult.
For each of these data characteristics, we discuss applications where these data characteristics arise and show how current chemical engineering research is extending the fields of data science and machine learning to incorporate these challenges.
- Score: 61.442473332320176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-documented how artificial intelligence can have (and already is
having) a big impact on chemical engineering. But classical machine learning
approaches may be weak for many chemical engineering applications. This review
discusses how challenging data characteristics arise in chemical engineering
applications. We identify four characteristics of data arising in chemical
engineering applications that make applying classical artificial intelligence
approaches difficult: (1) high variance, low volume data, (2) low variance,
high volume data, (3) noisy/corrupt/missing data, and (4) restricted data with
physics-based limitations. For each of these four data characteristics, we
discuss applications where these data characteristics arise and show how
current chemical engineering research is extending the fields of data science
and machine learning to incorporate these challenges. Finally, we identify
several challenges for future research.
Related papers
- Obtaining physical layer data of latest generation networks for investigating adversary attacks [0.0]
Machine learning can be used to optimize the functions of latest generation data networks such as 5G and 6G.
Adversarial measures that manipulate the behaviour of intelligent machine learning models are becoming a major concern.
A simulation model is proposed that works in conjunction with machine learning applications.
arXiv Detail & Related papers (2024-05-02T06:03:27Z)
- An Autonomous Large Language Model Agent for Chemical Literature Data Mining [60.85177362167166]
We introduce an end-to-end AI agent framework capable of high-fidelity extraction from extensive chemical literature.
Our framework's efficacy is evaluated using accuracy, recall, and F1 score of reaction condition data.
arXiv Detail & Related papers (2024-02-20T13:21:46Z)
- Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis [57.70772230913099]
Chemist-X automates the reaction condition recommendation (RCR) task in chemical synthesis with retrieval-augmented generation (RAG) technology.
Chemist-X interrogates online molecular databases and distills critical data from the latest literature database.
Chemist-X considerably reduces chemists' workload and allows them to focus on more fundamental and creative problems.
arXiv Detail & Related papers (2023-11-16T01:21:33Z)
- How to Do Machine Learning with Small Data? -- A Review from an Industrial Perspective [1.443696537295348]
The authors focus on interpreting the general term "small data" and its role in engineering and industrial applications.
Small data is defined in terms of various characteristics compared to big data, and a machine learning formalism is introduced.
Five critical challenges of machine learning with small data in industrial applications are presented.
arXiv Detail & Related papers (2023-11-13T07:39:13Z)
- ChemVise: Maximizing Out-of-Distribution Chemical Detection with the Novel Application of Zero-Shot Learning [60.02503434201552]
This research proposes learning approximations of complex exposures from training sets of simple ones.
We demonstrate that applying this approach to synthetic sensor responses surprisingly improves the detection of out-of-distribution obscured chemical analytes.
arXiv Detail & Related papers (2023-02-09T20:19:57Z)
- Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
- A Review into Data Science and Its Approaches in Mechanical Engineering [0.0]
This article briefly introduces data science and reviews its methods.
In the introduction, different definitions of data science and its background in technology are reviewed.
Some studies in mechanical engineering that used data science methods are also reviewed.
arXiv Detail & Related papers (2020-12-30T23:05:29Z)
- Machine Learning in Nano-Scale Biomedical Engineering [77.75587007080894]
We review the existing research regarding the use of machine learning in nano-scale biomedical engineering.
The main challenges that can be formulated as ML problems are classified into three main categories.
For each of the presented methodologies, special emphasis is given to its principles, applications, and limitations.
arXiv Detail & Related papers (2020-08-05T15:45:54Z)
- Data science on industrial data -- Today's challenges in brown field applications [0.0]
This paper shows the state of the art and what to expect when working with stock machines in the field.
A major focus in this paper is on data collection which can be more cumbersome than most people might expect.
Data quality for machine learning applications is a challenge once leaving the laboratory.
arXiv Detail & Related papers (2020-06-10T10:05:16Z)