Converting Epics/Stories into Pseudocode using Transformers
- URL: http://arxiv.org/abs/2312.05047v1
- Date: Fri, 8 Dec 2023 14:01:09 GMT
- Title: Converting Epics/Stories into Pseudocode using Transformers
- Authors: Gaurav Kolhatkar, Akshit Madan, Nidhi Kowtal, Satyajit Roy, Sheetal Sonawane
- Abstract summary: Pseudocode is a programming-language-agnostic representation of the steps involved in a computer program.
We present a methodology to convert a problem described in the English language into pseudocode.
We find that the CodeT5 model gives the best results in terms of BLEU score when trained separately on the two subtasks mentioned above.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The conversion of user epics or stories into their appropriate representation
in pseudocode or code is a time-consuming task, which can take up a large
portion of the time in an industrial project. In this paper, we present a
methodology to generate pseudocode from an agile user story describing a small
functionality, with the aim of reducing the overall time spent on the project.
Pseudocode is a programming-language-agnostic
representation of the steps involved in a computer program, which can be easily
converted into any programming language. Leveraging the potential of Natural
Language Processing, we want to simplify the development process in
organizations that use the Agile Model of Software Development. We present a
methodology to convert a problem described in the English language into
pseudocode. This methodology divides the Text to Pseudocode conversion task
into two stages or subtasks, each of which is treated as an individual
machine translation task: Stage 1 is Text to Code Conversion and Stage 2 is
Code to Pseudocode Conversion. We find that the CodeT5 model gives the best
results in terms of BLEU score when trained separately on the two subtasks
mentioned above. BLEU is a metric that measures the similarity between a
machine-translated text and a set of reference translations.
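To make the two-stage setup concrete, the sketch below chains two separately fine-tuned CodeT5 checkpoints and scores the output with a smoothed sentence-level BLEU. The checkpoint directories, the example story, and the reference pseudocode are placeholders for illustration, not artifacts released with the paper.

```python
# Hedged sketch of the two-stage Text -> Code -> Pseudocode pipeline.
# "story-to-code" and "code-to-pseudocode" stand in for locally fine-tuned
# CodeT5 checkpoints; they are not released models from the paper.
from transformers import RobertaTokenizer, T5ForConditionalGeneration
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def load(checkpoint):
    # CodeT5 pairs a RoBERTa-style tokenizer with a T5 encoder-decoder.
    return (RobertaTokenizer.from_pretrained(checkpoint),
            T5ForConditionalGeneration.from_pretrained(checkpoint))

def translate(tokenizer, model, text, max_len=256):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_length=max_len, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

s1_tok, s1_model = load("story-to-code")        # Stage 1: user story -> code
s2_tok, s2_model = load("code-to-pseudocode")   # Stage 2: code -> pseudocode

story = "As a user, I want to search a list of products by name."
code = translate(s1_tok, s1_model, story)
pseudocode = translate(s2_tok, s2_model, code)

# BLEU measures n-gram overlap between the generated text and the references.
reference = "read query; for each product in list, if name contains query then output product"
score = sentence_bleu([reference.split()], pseudocode.split(),
                      smoothing_function=SmoothingFunction().method1)
print(pseudocode, score)
```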
Related papers
- Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines [0.0]
We introduce a dynamic pipeline that transforms natural language task descriptions into code through high-level data-shaping instructions.
This paper details the fine-tuning process and sheds light on how natural language descriptions can be translated into functional code.
We propose an algorithm capable of transforming a natural description of an ML task into code with minimal human interaction.
arXiv Detail & Related papers (2024-03-18T08:58:47Z)
- SparseCoder: Identifier-Aware Sparse Transformer for File-Level Code Summarization [51.67317895094664]
This paper studies file-level code summarization, which can assist programmers in understanding and maintaining large source code projects.
We propose SparseCoder, an identifier-aware sparse transformer for effectively handling long code sequences.
arXiv Detail & Related papers (2024-01-26T09:23:27Z)
- TransformCode: A Contrastive Learning Framework for Code Embedding via Subtree Transformation [9.477734501499274]
We present TransformCode, a novel framework that learns code embeddings in a contrastive learning manner.
Our framework is encoder-agnostic and language-agnostic, which means that it can leverage any encoder model and handle any programming language.
arXiv Detail & Related papers (2023-11-10T09:05:23Z)
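As a rough illustration of the contrastive recipe described in the TransformCode entry above, the snippet below applies a generic InfoNCE-style loss to embeddings of code snippets and their semantics-preserving transformations. The loss and batch shapes are illustrative assumptions, not the paper's exact formulation.

```python
# Generic InfoNCE-style contrastive loss over code embeddings (illustrative only).
import torch
import torch.nn.functional as F

def info_nce(z_orig, z_aug, temperature=0.1):
    """Row i of z_orig (original snippet) should match row i of z_aug
    (a semantics-preserving transformation, e.g. renamed identifiers);
    all other rows in the batch act as negatives."""
    z_orig = F.normalize(z_orig, dim=1)
    z_aug = F.normalize(z_aug, dim=1)
    logits = z_orig @ z_aug.t() / temperature      # pairwise cosine similarities
    targets = torch.arange(z_orig.size(0))         # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

# Any encoder that pools token embeddings into one vector per snippet can be
# plugged in, which is what "encoder-agnostic" amounts to in this sketch.
loss = info_nce(torch.randn(8, 256), torch.randn(8, 256))
```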
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual rewriting and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning environment.
Our framework is language- and platform-agnostic and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
arXiv Detail & Related papers (2023-06-26T17:59:50Z)
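The reinforcement-learning framing above can be pictured with a toy environment like the one below. Real InterCode executes actions inside Docker containers and supports several languages; this stand-in merely evaluates a Python expression and rewards an exact-match observation, and is not InterCode's actual API.

```python
# Toy interactive-coding environment (illustrative stand-in, not InterCode's API).
class ToyCodingEnv:
    def __init__(self, task_prompt, expected_output):
        self.task_prompt = task_prompt
        self.expected_output = expected_output

    def reset(self):
        return self.task_prompt                   # observation: the task description

    def step(self, action_code):
        try:
            observation = str(eval(action_code))  # unsafe toy execution; InterCode uses Docker
        except Exception as err:
            observation = f"error: {err}"
        reward = float(observation == self.expected_output)
        return observation, reward, reward == 1.0  # observation, reward, done

env = ToyCodingEnv("Compute 2 + 2 * 3", "8")
print(env.reset())
print(env.step("2 + 2 * 3"))                      # ('8', 1.0, True)
```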
- TransCoder: Towards Unified Transferable Code Representation Learning Inspired by Human Skills [31.75121546422898]
We present TransCoder, a unified Transferable fine-tuning strategy for Code representation learning.
We employ a tunable prefix encoder as the meta-learner to capture cross-task and cross-language transferable knowledge.
Our method can lead to superior performance on various code-related tasks and encourage mutual reinforcement.
arXiv Detail & Related papers (2023-05-23T06:59:22Z)
- Knowledge Transfer for Pseudo-code Generation from Low Resource Programming Language [13.716669765394293]
We focus on transferring the knowledge acquired by the code-to-pseudocode neural model trained on a high resource PL (C++) using parallel code-pseudocode data.
We observe an improvement of 23.27% in the success rate of the generated C codes through back translation.
arXiv Detail & Related papers (2023-03-16T03:38:08Z)
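The back-translation step mentioned in the entry above can be sketched as follows: a reverse (pseudocode-to-code) model synthesizes source-side code for unlabeled pseudocode, and the resulting noisy pairs augment the training data of the forward (code-to-pseudocode) model. The model interface here is hypothetical, not the paper's implementation.

```python
# Back-translation data augmentation for code <-> pseudocode (hypothetical interface).
def back_translate(reverse_model, monolingual_pseudocode):
    """Use a pseudocode->code model to manufacture synthetic (code, pseudocode) pairs."""
    pairs = []
    for pseudo in monolingual_pseudocode:
        synthetic_code = reverse_model.generate(pseudo)  # noisy source side
        pairs.append((synthetic_code, pseudo))           # clean target side
    return pairs

# The forward code->pseudocode model is then trained on the real parallel data
# plus these synthetic pairs, e.g.:
# train(forward_model, real_pairs + back_translate(reverse_model, extra_pseudocode))
```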
- Planning with Large Language Models for Code Generation [100.07232672883897]
Planning-Guided Transformer Decoding (PG-TD) uses a planning algorithm to do lookahead search and guide the Transformer to generate better programs.
We empirically evaluate our framework with several large language models as backbones on public coding challenge benchmarks.
arXiv Detail & Related papers (2023-03-09T18:59:47Z)
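A heavily simplified version of that lookahead idea is sketched below: at each decoding step, a few candidate tokens are rolled out to complete programs, the rollouts are scored on test cases, and the best-scoring token is committed. The model and test-runner interfaces are hypothetical, and the real PG-TD algorithm uses a proper tree search rather than this greedy loop.

```python
# Simplified lookahead decoding guided by test execution (illustrative only).
def lookahead_decode(model, prompt, test_cases, max_steps=128, k=3):
    prefix = prompt
    for _ in range(max_steps):
        best_token, best_score = None, -1.0
        for token in model.top_k_next_tokens(prefix, k):    # k most likely continuations
            rollout = model.greedy_complete(prefix + token)  # finish the program greedily
            score = run_tests(rollout, test_cases)           # fraction of tests passed
            if score > best_score:
                best_token, best_score = token, score
        prefix += best_token
        if model.is_finished(prefix):
            break
    return prefix
```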
- Using Document Similarity Methods to create Parallel Datasets for Code Translation [60.36392618065203]
Translating source code from one programming language to another is a critical, time-consuming task.
We propose to use document similarity methods to create noisy parallel datasets of code.
We show that these models perform comparably to models trained on ground truth for reasonable levels of noise.
arXiv Detail & Related papers (2021-10-11T17:07:58Z)
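One simple way to mine such noisy pairs is to embed snippets from both languages with TF-IDF and align each snippet with its nearest neighbour by cosine similarity, as sketched below. The snippets are toy examples, and TF-IDF is just one possible similarity method, not necessarily the one used in the paper.

```python
# Mining noisy parallel code pairs with document similarity (TF-IDF + cosine).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

java_snippets = ["int add(int a, int b) { return a + b; }"]
python_snippets = ["def add(a, b):\n    return a + b",
                   "def read_file(path):\n    return open(path).read()"]

vectorizer = TfidfVectorizer(token_pattern=r"\w+")
matrix = vectorizer.fit_transform(java_snippets + python_snippets)
java_vecs, python_vecs = matrix[:len(java_snippets)], matrix[len(java_snippets):]

# Pair each Java snippet with its most similar Python snippet (a noisy alignment).
similarity = cosine_similarity(java_vecs, python_vecs)
pairs = [(java_snippets[i], python_snippets[similarity[i].argmax()])
         for i in range(len(java_snippets))]
print(pairs)
```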
- Thinking Like Transformers [64.96770952820691]
We propose a computational model for the transformer-encoder in the form of a programming language.
We show how RASP can be used to program solutions to tasks that could conceivably be learned by a Transformer.
We provide RASP programs for histograms, sorting, and Dyck-languages.
arXiv Detail & Related papers (2021-06-13T13:04:46Z)
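To give a flavour of what a RASP-style program looks like, here is a NumPy emulation of the histogram example: an equality selector plays the role of an attention pattern, and a selector-width operation counts the selected positions. This mimics the idea only and is not actual RASP syntax.

```python
# NumPy emulation of the RASP-style histogram (not actual RASP syntax).
import numpy as np

def rasp_style_histogram(tokens):
    tokens = np.array(tokens)
    # select(tokens, tokens, ==): each position attends to positions with the same token
    selector = tokens[:, None] == tokens[None, :]
    # selector_width: count how many positions each query attends to
    return selector.sum(axis=1)

print(rasp_style_histogram(list("hello")))  # [1 1 2 2 1]
```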
This list is automatically generated from the titles and abstracts of the papers on this site.