On Codex Prompt Engineering for OCL Generation: An Empirical Study
- URL: http://arxiv.org/abs/2303.16244v1
- Date: Tue, 28 Mar 2023 18:50:51 GMT
- Title: On Codex Prompt Engineering for OCL Generation: An Empirical Study
- Authors: Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Abstract summary: The Object Constraint Language (OCL) is a declarative language that adds constraints and object query expressions to MOF models.
Recent advancements in LLMs, such as GPT-3, have shown their capability in many NLP tasks.
We investigate the reliability of OCL constraints generated by Codex from natural language specifications.
- Score: 10.184056098238765
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The Object Constraint Language (OCL) is a declarative language that adds
constraints and object query expressions to MOF models. Despite its potential
to provide precision and conciseness to UML models, the unfamiliar syntax of
OCL has hindered its adoption. Recent advancements in LLMs, such as GPT-3, have
shown their capability in many NLP tasks, including semantic parsing and text
generation. Codex, a GPT-3 descendant, has been fine-tuned on publicly
available code from GitHub and can generate code in many programming languages.
We investigate the reliability of OCL constraints generated by Codex from
natural language specifications. To achieve this, we compiled a dataset of 15
UML models and 168 specifications and crafted a prompt template with slots to
populate with UML information and the target task, using both zero- and
few-shot learning methods. By measuring the syntactic validity and execution
accuracy metrics of the generated OCL constraints, we found that enriching the
prompts with UML information and enabling few-shot learning increases the
reliability of the generated OCL constraints. Furthermore, the results reveal a
close sentence-embedding similarity between the generated OCL constraints and
the human-written ones in the ground truth, implying a level of clarity and
understandability in the OCL constraints generated by Codex.
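The abstract describes a prompt template with slots for UML information and the target task, used in both zero- and few-shot settings. The paper's exact template is not reproduced here, so the following is a minimal sketch; the slot names, example pair, and formatting are hypothetical.

```python
# Sketch of a slot-based prompt template for OCL generation.
# Slot names (uml_context, specification) and the example pair are
# hypothetical; the paper's exact template is not given in this abstract.

FEW_SHOT_EXAMPLES = [
    {
        "specification": "A company must employ at least one person.",
        "ocl": "context Company inv: self.employees->size() >= 1",
    },
]

def build_prompt(uml_context: str, specification: str, few_shot: bool = True) -> str:
    """Populate the template slots with UML information and the target task."""
    parts = ["Translate the specification into an OCL constraint.", ""]
    parts += ["UML model:", uml_context, ""]
    if few_shot:  # few-shot: prepend solved example pairs; zero-shot: skip them
        for ex in FEW_SHOT_EXAMPLES:
            parts += [f"Specification: {ex['specification']}",
                      f"OCL: {ex['ocl']}", ""]
    parts += [f"Specification: {specification}", "OCL:"]
    return "\n".join(parts)

print(build_prompt(
    uml_context="class Company { employees : Person[*] }",
    specification="Every company has a unique registration number.",
))
```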
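The reported similarity between generated and ground-truth constraints is based on sentence embeddings. A minimal sketch of such a comparison follows, using the sentence-transformers library; the embedding model named below is an assumption, as the abstract does not specify which model was used. Checking syntactic validity would additionally require an OCL parser and is not shown here.

```python
# Sketch of a sentence-embedding similarity check between a generated OCL
# constraint and its human-written ground truth. The model choice below is
# an assumption; the abstract does not name the embedding model used.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice

generated = "context Company inv: self.employees->size() >= 1"
ground_truth = "context Company inv: self.employees->notEmpty()"

# Encode both constraints and compare with cosine similarity.
embeddings = model.encode([generated, ground_truth], convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"cosine similarity: {similarity:.3f}")
```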
Related papers
- Linguistics Theory Meets LLM: Code-Switched Text Generation via Equivalence Constrained Large Language Models [16.82812708514889]
Code-switching, the phenomenon of alternating between two or more languages in a single conversation, presents unique challenges for Natural Language Processing (NLP).
Most existing research focuses on either syntactic constraints or neural generation, with few efforts to integrate linguistic theory with large language models (LLMs) for generating natural code-switched text.
We introduce EZSwitch, a novel framework that combines Equivalence Constraint Theory (ECT) with LLMs to produce linguistically valid and fluent code-switched text.
arXiv Detail & Related papers (2024-10-30T03:03:32Z)
- ULLME: A Unified Framework for Large Language Model Embeddings with Generation-Augmented Learning [72.90823351726374]
We introduce the Unified framework for Large Language Model Embedding (ULLME), a flexible, plug-and-play implementation that enables bidirectional attention across various LLMs.
We also propose Generation-augmented Representation Learning (GRL), a novel fine-tuning method to boost LLMs for text embedding tasks.
To showcase our framework's flexibility and effectiveness, we release three pre-trained models from ULLME with different backbone architectures.
arXiv Detail & Related papers (2024-08-06T18:53:54Z)
- Combining Constraint Programming Reasoning with Large Language Model Predictions [44.99833362998488]
Constraint Programming (CP) and Machine Learning (ML) face challenges in text generation.
This paper proposes a solution by combining both approaches and embedding a Large Language Model (LLM) in CP.
arXiv Detail & Related papers (2024-07-18T13:15:55Z)
- PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4 [10.564949684320727]
We introduce PathOCL, a novel path-based prompt augmentation technique designed to facilitate Object Constraint Language generation.
Our findings demonstrate that PathOCL, compared to augmenting the complete class model (UML-Augmentation), generates a higher number of valid and correct OCL constraints.
arXiv Detail & Related papers (2024-05-21T02:00:54Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models [79.62191017182518]
FollowBench is a multi-level, fine-grained constraints-following benchmark for large language models.
We introduce a multi-level mechanism that incrementally adds a single constraint to the initial instruction at each successive level.
By evaluating 13 popular LLMs on FollowBench, we highlight the weaknesses of LLMs in instruction following and point towards potential avenues for future work.
arXiv Detail & Related papers (2023-10-31T12:32:38Z)
- Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models [49.74036826946397]
This study investigates constrained text generation for large language models (LLMs).
Our research mainly focuses on mainstream open-source LLMs, categorizing constraints into lexical, structural, and relation-based types.
Results illuminate LLMs' capacity and deficiency to incorporate constraints and provide insights for future developments in constrained text generation.
arXiv Detail & Related papers (2023-10-25T03:58:49Z)
- Can Large Language Models Understand Real-World Complex Instructions? [54.86632921036983]
Large language models (LLMs) can understand human instructions, but struggle with complex instructions.
Existing benchmarks are insufficient to assess LLMs' ability to understand complex instructions.
We propose CELLO, a benchmark for evaluating LLMs' ability to follow complex instructions systematically.
arXiv Detail & Related papers (2023-09-17T04:18:39Z)
- Inductive-bias Learning: Generating Code Models with Large Language Model [0.0]
Large Language Models (LLMs) have been attracting attention due to an ability called in-context learning (ICL).
We propose a novel learning method called Inductive-Bias Learning (IBL), which combines the techniques of ICL and code generation.
arXiv Detail & Related papers (2023-08-19T03:01:45Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)