Toward autocorrection of chemical process flowsheets using large
language models
- URL: http://arxiv.org/abs/2312.02873v1
- Date: Tue, 5 Dec 2023 16:39:41 GMT
- Title: Toward autocorrection of chemical process flowsheets using large
language models
- Authors: Lukas Schulze Balhorn and Marc Caballero and Artur M. Schweidtmann
- Abstract summary: We propose a novel generative AI methodology for identifying errors in flowsheets and suggesting corrections to the user.
The input to the model is a potentially erroneous flowsheet, and the output is a set of suggestions for a corrected flowsheet.
The model achieves a top-1 accuracy of 80% and a top-5 accuracy of 84% on an independent test dataset of synthetically generated flowsheets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The process engineering domain widely uses Process Flow Diagrams (PFDs) and
Process and Instrumentation Diagrams (P&IDs) to represent process flows and
equipment configurations. However, the P&IDs and PFDs, hereafter called
flowsheets, can contain errors causing safety hazards, inefficient operation,
and unnecessary expenses. Correcting and verifying flowsheets is a tedious,
manual process. We propose a novel generative AI methodology for automatically
identifying errors in flowsheets and suggesting corrections to the user, i.e.,
autocorrecting flowsheets. Inspired by the breakthrough of Large Language
Models (LLMs) for grammatical autocorrection of human language, we investigate
LLMs for the autocorrection of flowsheets. The input to the model is a
potentially erroneous flowsheet, and the output is a set of suggestions for a
corrected flowsheet. We train our autocorrection model on a synthetic dataset
in a supervised manner. The model achieves a top-1 accuracy of 80% and a top-5
accuracy of 84% on an independent test dataset of synthetically generated
flowsheets. The results suggest that the model can learn to autocorrect the
synthetic flowsheets. We envision that flowsheet autocorrection will become a
useful tool for chemical engineers.
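The abstract frames autocorrection as mapping a potentially erroneous flowsheet string to a ranked list of candidate corrections, scored by top-1 and top-5 accuracy. A minimal sketch of that metric follows; the flowsheet strings and suggestion lists are invented for illustration (loosely styled after a text-based flowsheet notation) and are not the paper's actual data:

```python
def top_k_accuracy(ranked_suggestions, targets, k):
    """Fraction of test samples whose ground-truth flowsheet string
    appears among the model's k highest-ranked suggestions."""
    hits = sum(
        target in suggestions[:k]
        for suggestions, target in zip(ranked_suggestions, targets)
    )
    return hits / len(targets)

# Toy test set: three erroneous flowsheets, each with a ranked list of
# hypothetical model suggestions.
suggestions = [
    ["(raw)(hex)(r)(prod)", "(raw)(r)(prod)"],        # correct at rank 1
    ["(raw)(dist)(prod)", "(raw)(hex)(dist)(prod)"],  # correct at rank 2
    ["(raw)(mix)(prod)", "(raw)(split)(prod)"],       # correct not present
]
targets = [
    "(raw)(hex)(r)(prod)",
    "(raw)(hex)(dist)(prod)",
    "(raw)(hex)(mix)(prod)",
]

top1 = top_k_accuracy(suggestions, targets, k=1)  # 1/3: only sample 1 hits
top5 = top_k_accuracy(suggestions, targets, k=5)  # 2/3: samples 1 and 2 hit
```

A gap between top-1 and top-5 accuracy, as reported in the abstract (80% vs. 84%), indicates how often the correct flowsheet is suggested but not ranked first.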
Related papers
- Training Language Models to Self-Correct via Reinforcement Learning [98.35197671595343] (2024-09-19)
  Self-correction has been found to be largely ineffective in modern large language models (LLMs).
  We develop SCoRe, a multi-turn online reinforcement learning approach that significantly improves an LLM's self-correction ability using entirely self-generated data.
  SCoRe achieves state-of-the-art self-correction performance, improving the base models' self-correction by 15.6% and 9.1% on MATH and HumanEval, respectively.
- Machine learning surrogates for efficient hydrologic modeling: Insights from stochastic simulations of managed aquifer recharge [0.0] (2024-07-30)
  We apply this workflow to simulations of variably saturated groundwater flow at a prospective managed aquifer recharge site.
  ML surrogate models can achieve under 10% mean absolute percentage error and yield order-of-magnitude runtime savings.
- Small Language Models Need Strong Verifiers to Self-Correct Reasoning [69.94251699982388] (2024-04-26)
  Self-correction has emerged as a promising solution to boost the reasoning performance of large language models (LLMs).
  This work explores whether small (≤ 13B) language models (LMs) have the ability to self-correct on reasoning tasks with minimal input from stronger LMs.
- LM-Combiner: A Contextual Rewriting Model for Chinese Grammatical Error Correction [49.0746090186582] (2024-03-26)
  Over-correction is a critical problem in the Chinese grammatical error correction (CGEC) task.
  Recent work using model ensemble methods can effectively mitigate over-correction and improve the precision of the GEC system.
  We propose LM-Combiner, a rewriting model that can directly modify the over-corrections of GEC system outputs without a model ensemble.
- Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models [5.463333911506443] (2024-02-20)
  We aim to enhance the self-checking capabilities of large language models (LLMs) by constructing training data for checking tasks.
  We propose a specialized checking format called "Step CoT Check".
  Experiments demonstrate that fine-tuning with the "Step CoT Check" format significantly improves the self-checking and self-correction abilities of LLMs.
- Parameter-tuning-free data entry error unlearning with adaptive selective synaptic dampening [51.34904967046097] (2024-02-06)
  We introduce an extension to the selective synaptic dampening unlearning method that removes the need for parameter tuning.
  We demonstrate the performance of this extension, adaptive selective synaptic dampening (ASSD), on various ResNet18 and Vision Transformer unlearning tasks.
  The application of this approach is particularly compelling in industrial settings, such as supply chain management.
- The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation [93.01964988474755] (2023-08-14)
  AutoMQM is a prompting technique that asks large language models to identify and categorize errors in translations.
  We study the impact of labeled data through in-context learning and finetuning.
  We then evaluate AutoMQM with PaLM-2 models and find that it improves performance compared to just prompting for scores.
- Data augmentation for machine learning of chemical process flowsheets [0.0] (2023-02-07)
  We show that the proposed data augmentation improves the performance of artificial intelligence-based process design models.
  In our case study, flowsheet data augmentation improved the prediction uncertainty of the flowsheet autocompletion model by 14.7%.
- Learning from flowsheets: A generative transformer model for autocompletion of flowsheets [0.0] (2022-08-01)
  We represent flowsheets as strings using the text-based SFILES 2.0 notation.
  We learn the grammatical structure of the SFILES 2.0 language and common patterns in flowsheets using a transformer-based language model.
- AutoFlow: Learning a Better Training Set for Optical Flow [62.40293188964933] (2021-04-29)
  AutoFlow is a method to render training data for optical flow.
  AutoFlow achieves state-of-the-art accuracy in pre-training both PWC-Net and RAFT.
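The main paper's supervised setup implies a training corpus of (erroneous, correct) flowsheet pairs generated by corrupting known-good flowsheets. A rough sketch of such synthetic error injection, assuming a simple unit-sequence string format; the perturbation rules and unit names here are invented for illustration and are not the paper's actual procedure:

```python
import random

def inject_error(flowsheet, rng):
    """Corrupt one unit token in a flowsheet string such as
    "(raw)(hex)(r)(prod)", returning an (erroneous, correct) pair
    suitable as a supervised training example."""
    tokens = flowsheet[1:-1].split(")(")  # ["raw", "hex", "r", "prod"]
    corrupted = list(tokens)
    if len(corrupted) > 1 and rng.random() < 0.5:
        # Error type 1: a unit is missing from the flowsheet.
        del corrupted[rng.randrange(len(corrupted))]
    else:
        # Error type 2: a unit is replaced by a wrong unit type
        # (hypothetical replacement vocabulary).
        corrupted[rng.randrange(len(corrupted))] = rng.choice(["mix", "split", "v"])
    erroneous = "(" + ")(".join(corrupted) + ")"
    return erroneous, flowsheet

# Generate a few training pairs from one correct flowsheet.
rng = random.Random(7)
pairs = [inject_error("(raw)(hex)(r)(prod)", rng) for _ in range(3)]
```

The correct flowsheet is kept as the target, so a sequence-to-sequence model trained on such pairs learns to map corrupted strings back to valid ones.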
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.