LLM4PLC: Harnessing Large Language Models for Verifiable Programming of
PLCs in Industrial Control Systems
- URL: http://arxiv.org/abs/2401.05443v1
- Date: Mon, 8 Jan 2024 23:52:42 GMT
- Title: LLM4PLC: Harnessing Large Language Models for Verifiable Programming of
PLCs in Industrial Control Systems
- Authors: Mohamad Fakih, Rahul Dharmaji, Yasamin Moghaddas, Gustavo Quiros
Araya, Oluwatosin Ogundare, and Mohammad Abdullah Al Faruque
- Abstract summary: Large Language Models (LLMs) fail to produce valid programs for Industrial Control Systems (ICS) operated by Programmable Logic Controllers (PLCs).
We propose a user-guided iterative pipeline leveraging user feedback and external verification tools, including grammar checkers, compilers, and SMV verifiers.
We run a complete test suite on GPT-3.5, GPT-4, Code Llama-7B, a fine-tuned Code Llama-7B model, Code Llama-34B, and a fine-tuned Code Llama-34B model.
- Score: 9.946058168276744
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Although Large Language Models (LLMs) have established predominance in
automated code generation, they are not devoid of shortcomings. The pertinent
issues primarily relate to the absence of execution guarantees for generated
code, a lack of explainability, and suboptimal support for essential but niche
programming languages. State-of-the-art LLMs such as GPT-4 and LLaMa2 fail to
produce valid programs for Industrial Control Systems (ICS) operated by
Programmable Logic Controllers (PLCs). We propose LLM4PLC, a user-guided
iterative pipeline leveraging user feedback and external verification tools
including grammar checkers, compilers, and SMV verifiers to guide the LLM's
generation. We further enhance the LLM's generation potential by employing
Prompt Engineering and model fine-tuning through the creation and usage of
LoRAs. We validate this system using a FischerTechnik Manufacturing TestBed
(MFTB), illustrating how LLMs can evolve from generating structurally flawed
code to producing verifiably correct programs for industrial applications. We
run a complete test suite on GPT-3.5, GPT-4, Code Llama-7B, a fine-tuned Code
Llama-7B model, Code Llama-34B, and a fine-tuned Code Llama-34B model. The
proposed pipeline improved the generation success rate from 47% to 72%, and the
Survey-of-Experts code quality from 2.25/10 to 7.75/10. To promote open
research, we share the complete experimental setup, the LLM Fine-Tuning
Weights, and the video demonstrations of the different programs on our
dedicated webpage.
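For a concrete picture of the loop the abstract describes, here is a minimal sketch in Python: the LLM drafts Structured Text, external tools (a grammar checker, a compiler, an SMV model checker) grade it, and their diagnostics are folded back into the next prompt. Every helper name below (generate_st, check_grammar, compile_code, model_check) is a hypothetical placeholder, not the authors' implementation.

```python
# Minimal sketch of a user-guided, tool-in-the-loop generation pipeline.
# All helpers are hypothetical stubs standing in for real external tools.
from dataclasses import dataclass

@dataclass
class ToolReport:
    ok: bool
    feedback: str = ""

def generate_st(prompt: str) -> str:
    """Placeholder for an LLM call that returns IEC 61131-3 Structured Text."""
    return "PROGRAM Main\n  (* ... generated code ... *)\nEND_PROGRAM"

def check_grammar(code: str) -> ToolReport:   # stand-in for an ST grammar checker
    return ToolReport(ok="PROGRAM" in code)

def compile_code(code: str) -> ToolReport:    # stand-in for an ST compiler
    return ToolReport(ok=True)

def model_check(code: str) -> ToolReport:     # stand-in for an SMV model checker
    return ToolReport(ok=True)

def pipeline(spec: str, max_rounds: int = 5) -> str | None:
    prompt = spec
    for _ in range(max_rounds):
        code = generate_st(prompt)
        for tool in (check_grammar, compile_code, model_check):
            report = tool(code)
            if not report.ok:
                # Fold the tool's diagnostics back into the prompt and retry.
                prompt = f"{spec}\n\nPrevious attempt failed:\n{report.feedback}"
                break
        else:
            return code  # all checks passed: a verified candidate program
    return None

if __name__ == "__main__":
    print(pipeline("Control a conveyor belt with two sensors."))
```

In the paper, the same loop also incorporates direct user feedback and fine-tuned (LoRA) model variants; the sketch above keeps only the tool-in-the-loop skeleton.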
Related papers
- Exploring Code Language Models for Automated HLS-based Hardware Generation: Benchmark, Infrastructure and Analysis [49.998130983414924]
Large language models (LLMs) can generate code in mainstream programming languages such as Python and C++.
This paper explores leveraging LLMs to generate High-Level Synthesis (HLS)-based hardware design.
arXiv Detail & Related papers (2025-02-19T17:53:59Z) - LLM2: Let Large Language Models Harness System 2 Reasoning [65.89293674479907]
Large language models (LLMs) have exhibited impressive capabilities across a myriad of tasks, yet they occasionally yield undesirable outputs.
We introduce LLM2, a novel framework that combines an LLM with a process-based verifier.
Within LLM2, the LLM is responsible for generating plausible candidates, while the verifier provides timely process-based feedback to distinguish desirable from undesirable outputs.
arXiv Detail & Related papers (2024-12-29T06:32:36Z) - Planning-Driven Programming: A Large Language Model Programming Workflow [8.827173113748701]
Large language models (LLMs) are strong performers in code generation.
Recent research suggests continuously refining programs against visible tests to improve the code generation accuracy of LLMs.
We propose an LLM programming workflow (LPW) designed to improve both initial code generation and subsequent refinements.
arXiv Detail & Related papers (2024-11-21T08:31:06Z) - PerfCodeGen: Improving Performance of LLM Generated Code with Execution Feedback [78.89596149768458]
Large Language Models (LLMs) are widely adopted for assisting in software development tasks.
We propose PerfCodeGen, a training-free framework that enhances the performance of LLM-generated code.
arXiv Detail & Related papers (2024-11-18T06:22:38Z) - Precision or Peril: Evaluating Code Quality from Quantized Large Language Models [0.5249805590164902]
Quantization has emerged as a way to mitigate the memory overhead of Large Language Models.
This study aims to evaluate the current code generation capabilities of smaller LLMs using various metrics.
arXiv Detail & Related papers (2024-11-16T01:31:29Z) - OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z) - Agents4PLC: Automating Closed-loop PLC Code Generation and Verification in Industrial Control Systems using LLM-based Agents [27.097029139195943]
Agents4PLC is a novel framework that automates PLC code generation and code-level verification.
We first establish a benchmark for the area of verifiable PLC code generation.
We then transition from natural language requirements to human-written-verified formal specifications and reference PLC code.
arXiv Detail & Related papers (2024-10-18T06:51:13Z) - Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis [0.7580487359358722]
Large Language Models (LLMs) struggle with accuracy and are unsuitable for high-risk applications.
We introduce a solution that divides code generation into two parts: one handled by an LLM and one handled by formal-methods-based program synthesis.
arXiv Detail & Related papers (2024-09-18T15:59:06Z) - How Well Do Large Language Models Serve as End-to-End Secure Code Producers? [42.119319820752324]
We studied the capability of GPT-3.5 and GPT-4 to identify and repair vulnerabilities in code generated by four popular LLMs.
By manually or automatically reviewing 4,900 pieces of code, our study reveals that large language models lack awareness of scenario-relevant security risks.
To address the limitation of a single round of repair, we developed a lightweight tool that prompts LLMs to construct safer source code.
arXiv Detail & Related papers (2024-08-20T02:42:29Z) - InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models [56.723509505549536]
InfiBench is the first large-scale freeform question-answering (QA) benchmark for code to our knowledge.
It comprises 234 carefully selected, high-quality Stack Overflow questions spanning 15 programming languages.
We conduct a systematic evaluation of over 100 recent code LLMs on InfiBench, leading to a series of novel and insightful findings.
arXiv Detail & Related papers (2024-03-11T02:06:30Z) - StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback [58.20547418182074]
We introduce StepCoder, a novel framework for code generation, consisting of two main components.
CCCS addresses the exploration challenge by breaking the long-sequence code generation task into a Curriculum of Code Completion Subtasks.
FGO optimizes the model only on executed code by masking out unexecuted code segments, providing Fine-Grained Optimization; a minimal masking sketch follows this list.
Our method improves the ability to explore the output space and outperforms state-of-the-art approaches in corresponding benchmarks.
arXiv Detail & Related papers (2024-02-02T13:14:31Z)
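To make the FGO idea above concrete, here is a minimal sketch of loss masking in Python, assuming per-token losses and an execution-coverage mask have already been computed elsewhere; it illustrates the general technique, not StepCoder's actual implementation.

```python
# Minimal sketch of fine-grained loss masking in the spirit of StepCoder's FGO:
# tokens belonging to code that never executed contribute nothing to the loss.
import torch

def masked_policy_loss(token_losses: torch.Tensor, executed_mask: torch.Tensor) -> torch.Tensor:
    """Average the loss over executed tokens only."""
    masked = token_losses * executed_mask                     # zero out unexecuted tokens
    return masked.sum() / executed_mask.sum().clamp(min=1)    # mean over executed tokens

# Example: 6 generated tokens, the last two belong to an unexecuted branch.
losses = torch.tensor([0.9, 0.4, 0.7, 0.3, 1.2, 0.8])
mask = torch.tensor([1.0, 1.0, 1.0, 1.0, 0.0, 0.0])
print(masked_policy_loss(losses, mask))  # averages only the first four losses
```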