Locally-Deployed Chain-of-Thought (CoT) Reasoning Model in Chemical Engineering: Starting from 30 Experimental Data
- URL: http://arxiv.org/abs/2502.12383v1
- Date: Mon, 17 Feb 2025 23:43:48 GMT
- Title: Locally-Deployed Chain-of-Thought (CoT) Reasoning Model in Chemical Engineering: Starting from 30 Experimental Data
- Authors: Tianhang Zhou, Yingchun Niu, Xingying Lan, Chunming Xu
- Abstract summary: This paper explores the application of the Chain-of-Thought (CoT) reasoning model in chemical engineering. Two CoT-building methods, Large Language Model-Chain of Thought (LLM-CoT) and Machine Learning-Large Language Model-Chain of Thought (ML-LLM-CoT), are studied.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In the field of chemical engineering, traditional data-processing and prediction methods face significant challenges, and machine-learning models and large language models (LLMs) each have their own limitations. This paper explores the application of the Chain-of-Thought (CoT) reasoning model in chemical engineering, starting from 30 experimental data points. By integrating traditional surrogate models such as Gaussian processes and random forests with powerful LLMs such as DeepSeek-R1, a hierarchical architecture is proposed. Two CoT-building methods are studied: Large Language Model-Chain of Thought (LLM-CoT) and Machine Learning-Large Language Model-Chain of Thought (ML-LLM-CoT). LLM-CoT combines the local models DeepSeek-r1:14b and Qwen2:7b served through Ollama; ML-LLM-CoT integrates a pre-trained Gaussian ML model with the LLM-based CoT framework. Our results show that ML-LLM-CoT is more efficient during construction: only 2 data points require rethinking, for 4 rethink iterations in total, whereas LLM-CoT has 5 points that need rethinking and 34 rethink iterations. In predicting the solubility of 20 structurally dissimilar molecules, the number of molecules with a prediction deviation above 100% is 7 for the Gaussian model, 6 for LLM-CoT, and 4 for ML-LLM-CoT. These results indicate that ML-LLM-CoT better controls the number of high-deviation molecules, improves the average deviation, and achieves a higher success rate in solubility judgment, providing a more reliable method for chemical engineering and molecular property prediction. This study moves beyond the limitations of traditional methods and offers new solutions for rapid property prediction and process optimization in chemical engineering.
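To make the hierarchical setup concrete, here is a minimal sketch of the ML-LLM-CoT idea: a Gaussian-process surrogate fitted on a small dataset feeds its prediction into the prompt of a locally served LLM. The toy descriptors, prompt wording, and Ollama endpoint usage are illustrative assumptions, not the paper's actual code; only the model tag deepseek-r1:14b comes from the abstract.

```python
# Minimal ML-LLM-CoT sketch under stated assumptions: a Gaussian-process
# surrogate trained on a tiny dataset supplies a numeric prior, which is
# injected into the prompt of a locally served LLM (Ollama REST API).
import json
import urllib.request

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Toy stand-in for ~30 experimental points: descriptor vector -> solubility.
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(30, 3))   # e.g. 3 molecular descriptors
y_train = X_train @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.05, 30)

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_train, y_train)

x_query = np.array([[0.4, 0.7, 0.1]])
mu, sigma = gp.predict(x_query, return_std=True)

# Embed the surrogate's estimate in the prompt so the LLM reasons around it.
prompt = (
    f"A Gaussian-process surrogate predicts solubility {mu[0]:.3f} "
    f"(std {sigma[0]:.3f}) for descriptors {x_query[0].tolist()}. "
    "Reason step by step about whether this estimate is chemically plausible, "
    "then give a final solubility estimate."
)
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({"model": "deepseek-r1:14b", "prompt": prompt,
                     "stream": False}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```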
Related papers
- Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math [135.1260782461186]
Chain-of-Thought (CoT) significantly enhances formal reasoning capabilities in Large Language Models (LLMs).
However, improving reasoning in Small Language Models (SLMs) remains challenging due to their limited model capacity.
We present a systematic training recipe for SLMs that consists of four steps: (1) large-scale mid-training on diverse distilled long-CoT data, (2) supervised fine-tuning on high-quality long-CoT data, (3) Rollout DPO leveraging a carefully curated preference dataset, and (4) Reinforcement Learning (RL) with Verifiable Reward.
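As a rough illustration of step (3), a Rollout-DPO-style preference dataset pairs each prompt with a preferred and a rejected rollout. The record below is a hypothetical example using common trl-style field names, not this paper's schema.

```python
# Hypothetical shape of one Rollout-DPO preference record: a prompt paired
# with a preferred (correct, well-formed CoT) rollout and a rejected one.
preference_record = {
    "prompt": "What is 17 * 24? Think step by step.",
    "chosen": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408. Answer: 408.",
    "rejected": "17 * 24 is roughly 17 * 25 = 425. Answer: 425.",
}
```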
arXiv Detail & Related papers (2025-04-30T00:04:35Z)
- Physics-Informed Inference Time Scaling via Simulation-Calibrated Scientific Machine Learning [5.728698570173857]
High-dimensional partial differential equations (PDEs) pose significant computational challenges across fields ranging from quantum chemistry to economics and finance.
Although scientific machine learning (SciML) techniques offer approximate solutions, they often suffer from bias and neglect crucial physical insights.
We propose Simulation-Calibrated Scientific Machine Learning (SCaSML), a framework that dynamically refines and debiases SciML predictions during inference by enforcing physical laws.
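A toy version of the calibration idea, under the assumption that a trusted simulator can be run at a few points: estimate the surrogate's bias there and subtract an interpolated bias at inference time. This sketches only the concept, not the SCaSML algorithm.

```python
# Toy simulation-calibrated debiasing (illustrative functions throughout).
import numpy as np

def surrogate(x):   # cheap but biased ML approximation (illustrative)
    return np.sin(x) + 0.3 * x

def simulator(x):   # expensive trusted solver (illustrative ground truth)
    return np.sin(x)

x_cal = np.linspace(0.0, 3.0, 5)                 # few calibration runs
bias = surrogate(x_cal) - simulator(x_cal)       # observed pointwise bias

x_new = np.linspace(0.0, 3.0, 50)
debiased = surrogate(x_new) - np.interp(x_new, x_cal, bias)
print(float(np.max(np.abs(debiased - simulator(x_new)))))  # ~0 for this toy
```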
arXiv Detail & Related papers (2025-04-22T18:01:45Z)
- ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model [11.166536730901102]
Goal-oriented de novo molecule design is a crucial yet challenging task in drug discovery.
We propose ChatMol, a novel approach that leverages Large Language Models for molecule design across diverse constraint settings.
Experimental results across single-property, substructure-property, and multi-property constrained tasks demonstrate that ChatMol consistently outperforms state-of-the-art baselines.
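In the same spirit (though not ChatMol's actual pipeline), one can prompt a local LLM for property-constrained SMILES and keep only chemically valid candidates. The prompt, constraint, and model tag below are illustrative; qwen2:7b is reused from the main paper above as a stand-in.

```python
# Hedged sketch of LLM-guided, constraint-aware molecule proposal: ask a
# local LLM for SMILES meeting a numeric property target, then validate.
import json
import urllib.request

from rdkit import Chem  # requires `pip install rdkit`

prompt = ("Propose three drug-like molecules with logP close to 2.0. "
          "Answer with one SMILES string per line and nothing else.")
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({"model": "qwen2:7b", "prompt": prompt,
                     "stream": False}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    text = json.loads(resp.read())["response"]

candidates = [line.strip() for line in text.splitlines() if line.strip()]
valid = [smi for smi in candidates if Chem.MolFromSmiles(smi) is not None]
print(valid)
```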
arXiv Detail & Related papers (2025-02-27T06:05:45Z)
- RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [95.32315448601241]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE). RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers. Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
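The rotation intuition can be demonstrated in a few lines: an orthogonal rotation spreads outlier energy across channels, shrinking uniform-quantization error. The sketch below shows only that effect, not RoSTE's QA-SFT training loop; the matrix size and bit-width are arbitrary demo choices.

```python
# Rotation-reduces-outliers demo: quantize in a randomly rotated basis and
# rotate back; the orthogonal transform preserves the layer's function.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))
W[:, 0] *= 25.0                                    # inject an outlier channel

Q, _ = np.linalg.qr(rng.normal(size=(256, 256)))   # random orthogonal rotation

def quantize(w, bits=4):
    """Symmetric uniform quantization to `bits` bits, then dequantize."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

err_plain = np.linalg.norm(W - quantize(W))
# Quantize in the rotated basis, then rotate back before measuring error.
err_rotated = np.linalg.norm(W - quantize(W @ Q) @ Q.T)
print(f"plain: {err_plain:.1f}  rotated: {err_rotated:.1f}")
```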
arXiv Detail & Related papers (2025-02-13T06:44:33Z)
- LLM2: Let Large Language Models Harness System 2 Reasoning [65.89293674479907]
Large language models (LLMs) have exhibited impressive capabilities across a myriad of tasks, yet they occasionally yield undesirable outputs. We introduce LLM2, a novel framework that combines an LLM with a process-based verifier. The LLM is responsible for generating plausible candidates, while the verifier provides timely process-based feedback to distinguish desirable from undesirable outputs.
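The generate-then-verify pattern is easy to sketch with toy stand-ins: a fixed candidate list for the generator and an arithmetic check for the verifier, where LLM2 uses an LLM and a learned process-based verifier.

```python
# Generic generate-then-verify pattern (toy stand-ins, not LLM2's models).
def generate_candidates(question):
    return ["19 + 23 = 42", "19 + 23 = 41", "19 + 23 = 52"]

def verify(candidate):
    lhs, rhs = candidate.split("=")
    a, b = (int(t) for t in lhs.split("+"))
    return 1.0 if a + b == int(rhs) else 0.0

best = max(generate_candidates("19 + 23?"), key=verify)
print(best)   # the verifier selects the arithmetically correct candidate
```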
arXiv Detail & Related papers (2024-12-29T06:32:36Z)
- Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning [13.082135438792475]
Chain of Self-Correction (CoSC) embeds self-correction as an inherent ability in Large Language Models. CoSC operates through a sequence of self-correction stages. Experiments show that CoSC significantly boosts performance on standard mathematical datasets.
arXiv Detail & Related papers (2024-10-14T17:16:44Z)
- Quantum Kernel Learning for Small Dataset Modeling in Semiconductor Fabrication: Application to Ohmic Contact [18.42230728589117]
We develop the first application of quantum machine learning (QML) to model semiconductor processes.
We report a quantum kernel-based regressor (SQKR) with a static 2-level ZZ feature map.
SQKR consistently outperformed six mainstream classical machine learning (CML) models across all evaluation metrics.
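A classical stand-in conveys the mechanics: regression against a precomputed Gram matrix, where a real SQKR would fill the matrix with fidelities of ZZ-feature-map circuits. The RBF kernel and toy data below are placeholders, not the paper's setup.

```python
# Kernel regression on a small dataset with a precomputed Gram matrix.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(size=(25, 4))            # small fabrication-style dataset
y = np.sin(X @ np.ones(4)) + rng.normal(0, 0.01, 25)

def kernel(A, B):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-2.0 * d2)             # placeholder for circuit fidelities

model = KernelRidge(alpha=1e-3, kernel="precomputed")
model.fit(kernel(X, X), y)
X_test = rng.uniform(size=(5, 4))
print(model.predict(kernel(X_test, X)))  # Gram rows vs. training points
```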
arXiv Detail & Related papers (2024-09-17T00:44:49Z)
- Regression with Large Language Models for Materials and Molecular Property Prediction [0.0]
We demonstrate the ability of large language models (LLMs) to perform material and molecular property regression tasks.
We benchmark the Large Language Model Meta AI (LLaMA) 3 on several molecular properties in the QM9 dataset and 24 materials properties.
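A minimal sketch of prompt-based regression, assuming a local model served by Ollama: show a few SMILES-value pairs, let the model complete the next one, and parse a float. The prompt format, the llama3:8b tag, and the example values are illustrative, not taken from the paper's benchmark.

```python
# LLM-as-regressor sketch: few-shot numeric prompt, then float parsing.
import json
import re
import urllib.request

prompt = (
    "Predict the HOMO-LUMO gap (eV).\n"
    "SMILES: C -> 10.67\n"
    "SMILES: CO -> 9.71\n"
    "SMILES: CCO -> "
)
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({"model": "llama3:8b", "prompt": prompt,
                     "stream": False}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    completion = json.loads(resp.read())["response"]

match = re.search(r"-?\d+\.?\d*", completion)   # first number in the output
print(float(match.group()) if match else "no numeric answer")
```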
arXiv Detail & Related papers (2024-09-09T21:26:32Z)
- Temperature Distribution Prediction in Laser Powder Bed Fusion using Transferable and Scalable Graph Neural Networks [0.0]
This study presents novel predictive models using Graph Neural Networks (GNNs) for simulating thermal dynamics in Laser Powder Bed Fusion (L-PBF) processes.
The proposed models capture the complexity of the heat-transfer process in L-PBF while significantly reducing computational costs.
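A minimal mesh-style GNN regressor of per-node temperature might look like the sketch below: an illustrative two-layer architecture, not the paper's transferable model.

```python
# Toy GNN temperature surrogate: nodes carry local features (e.g. position,
# laser distance), edges connect neighboring mesh nodes, output is one
# temperature per node. Requires torch and torch_geometric.
import torch
from torch_geometric.nn import GCNConv

class TempGNN(torch.nn.Module):
    def __init__(self, in_dim=3, hidden=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)
        self.conv2 = GCNConv(hidden, 1)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index).squeeze(-1)   # temperature per node

x = torch.rand(100, 3)                        # 100 mesh nodes, 3 features each
edge_index = torch.randint(0, 100, (2, 400))  # random toy connectivity
print(TempGNN()(x, edge_index).shape)         # torch.Size([100])
```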
arXiv Detail & Related papers (2024-07-18T18:14:47Z)
- Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models [79.46938238953916]
Fine-tuning large language models (LLMs) for diverse applications is crucial to meet complex demands.
Recent studies suggest decomposing a fine-tuned LLM into a base model and corresponding delta weights, which are then compressed using low-rank or low-bit approaches to reduce costs.
In this work, we observe that existing low-rank and low-bit compression methods can significantly harm the model performance for task-specific fine-tuned LLMs.
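The low-rank baseline being improved on is easy to sketch: store the base weights plus a truncated SVD of the fine-tuning delta. Notably, for the unstructured toy delta below, a rank-32 approximation keeps little of the signal, echoing the observation that naive low-rank compression can hurt; all shapes and the rank are arbitrary.

```python
# Generic low-rank delta compression (baseline, not Delta-CoMe itself).
import numpy as np

rng = np.random.default_rng(0)
W_base = rng.normal(size=(512, 512))
W_ft = W_base + rng.normal(scale=0.02, size=(512, 512))   # fine-tuned weights

delta = W_ft - W_base
U, s, Vt = np.linalg.svd(delta, full_matrices=False)
r = 32                                                    # kept rank
delta_lr = U[:, :r] * s[:r] @ Vt[:r]                      # rank-r approximation

rel_err = np.linalg.norm(delta - delta_lr) / np.linalg.norm(delta)
ratio = (U[:, :r].size + r + Vt[:r].size) / delta.size
print(f"relative error: {rel_err:.3f}, storage ratio: {ratio:.3f}")
```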
arXiv Detail & Related papers (2024-06-13T07:57:27Z)
- SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models [67.67135738642547]
Post-training quantization (PTQ) is a powerful compression technique investigated in large language models (LLMs).
Existing PTQ methods are not ideal in terms of accuracy and efficiency, especially at bit-widths below 4.
This paper presents a Salience-Driven Mixed-Precision Quantization scheme for LLMs, namely SliM-LLM.
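A toy version of the idea: rank weight groups by a salience proxy and assign more bits to the most salient ones. The proxy and the 2/4-bit split below are illustrative choices, not SliM-LLM's actual criterion.

```python
# Salience-driven mixed-precision demo: salient groups get 4 bits, others 2.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 128))                 # 8 groups of 128 weights each
act_norm = rng.uniform(0.5, 5.0, size=8)      # per-group activation scale proxy

salience = np.abs(W).mean(axis=1) * act_norm  # simple salience proxy
bits = np.where(salience >= np.median(salience), 4, 2)   # mixed precision

def quantize(w, b):
    scale = np.abs(w).max() / (2 ** (b - 1) - 1)
    return np.round(w / scale) * scale

W_hat = np.stack([quantize(W[i], bits[i]) for i in range(8)])
print("avg bits:", bits.mean(), " mse:", float(((W - W_hat) ** 2).mean()))
```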
arXiv Detail & Related papers (2024-05-23T16:21:48Z)
- LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit [55.73370804397226]
Quantization, a key compression technique, can effectively mitigate the memory and compute demands of large language models by compressing and accelerating them.
We present LLMC, a plug-and-play compression toolkit, to fairly and systematically explore the impact of quantization.
Powered by this versatile toolkit, our benchmark covers three key aspects: calibration data, algorithms (three strategies), and data formats.
arXiv Detail & Related papers (2024-05-09T11:49:05Z)
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [53.31402059062365]
BiLLM is a groundbreaking 1-bit post-training quantization scheme tailored for pretrained large language models.
It achieves, for the first time, high-accuracy inference (e.g. 8.41 perplexity on LLaMA2-70B) with only 1.08-bit weights across various LLM families.
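For orientation, the classic 1-bit baseline that such schemes refine represents each weight row as a sign pattern times one scale, which is the L2-optimal choice for plain binarization; BiLLM's residual and salience-aware refinements go beyond this sketch.

```python
# Classic row-wise weight binarization (starting point, not BiLLM itself).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

alpha = np.abs(W).mean(axis=1, keepdims=True)   # per-row scale
B = np.sign(W)                                  # 1-bit codes
W_hat = alpha * B

rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"relative reconstruction error: {rel_err:.3f}")
```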
arXiv Detail & Related papers (2024-02-06T09:26:34Z)
- CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for using pre-trained language models (PLMs) that address the efficiency issues of pre-training, fine-tuning, and inference.
We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch.
We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.