The potential of LLMs for coding with low-resource and domain-specific
programming languages
- URL: http://arxiv.org/abs/2307.13018v1
- Date: Mon, 24 Jul 2023 17:17:13 GMT
- Title: The potential of LLMs for coding with low-resource and domain-specific
programming languages
- Authors: Artur Tarassow
- Abstract summary: This study focuses on the econometric scripting language named hansl of the open-source software gretl.
Our findings suggest that LLMs can be a useful tool for writing, understanding, improving, and documenting gretl code.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a study on the feasibility of using large language models
(LLMs) for coding with low-resource and domain-specific programming languages
that typically lack the amount of data required for effective LLM processing
techniques. This study focuses on the econometric scripting language named
hansl of the open-source software gretl and employs a proprietary LLM based on
GPT-3.5. Our findings suggest that LLMs can be a useful tool for writing,
understanding, improving, and documenting gretl code, which includes generating
descriptive docstrings for functions and providing precise explanations for
abstract and poorly documented econometric code. While the LLM showcased
promising docstring-to-code translation capability, we also identify some
limitations, such as its inability to improve certain sections of code and to
write accurate unit tests. This study is a step towards leveraging the power of
LLMs to facilitate software development in low-resource programming languages
and ultimately to lower barriers to entry for their adoption.
Related papers
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models [70.72097493954067]
Large language models (LLMs) for code have become indispensable in various domains, including code generation, reasoning tasks and agent systems.
While open-access code LLMs are increasingly approaching the performance levels of proprietary models, high-quality code LLMs remain limited.
We introduce OpenCoder, a top-tier code LLM that not only achieves performance comparable to leading models but also serves as an "open cookbook" for the research community.
arXiv Detail & Related papers (2024-11-07T17:47:25Z)
- Crystal: Illuminating LLM Abilities on Language and Code [58.5467653736537]
We propose a pretraining strategy to enhance the integration of natural language and coding capabilities.
The resulting model, Crystal, demonstrates remarkable capabilities in both domains.
arXiv Detail & Related papers (2024-11-06T10:28:46Z)
- Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights [9.414198519543564]
We present codellm-devkit (hereafter, CLDK), an open-source library that significantly simplifies the process of performing program analysis.
CLDK offers developers an intuitive and user-friendly interface, making it incredibly easy to provide rich program analysis context to code LLMs.
arXiv Detail & Related papers (2024-10-16T20:05:59Z)
- Studying and Benchmarking Large Language Models For Log Level Suggestion [49.176736212364496]
Large Language Models (LLMs) have become a focal point of research across various domains.
This paper investigates the impact of characteristics and learning paradigms on the performance of 12 open-source LLMs in log level suggestion.
arXiv Detail & Related papers (2024-10-11T03:52:17Z)
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language.
Experiments and ablations on four datasets covering both C++ and Python validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- Perplexed: Understanding When Large Language Models are Confused [3.4208414448496027]
This paper introduces perplexed, a library for exploring where a language model is perplexed.
We conducted a case study focused on Large Language Models (LLMs) for code generation using an additional tool we built to help with the analysis of code models called codetokenizer.
We found that our studied code LLMs had their worst performance on coding structures where the code was not syntactically correct.
arXiv Detail & Related papers (2024-04-09T22:03:39Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Large Language Models for Code Analysis: Do LLMs Really Do Their Job? [13.48555476110316]
Large language models (LLMs) have demonstrated significant potential in the realm of natural language understanding and programming code processing tasks.
This paper offers a comprehensive evaluation of LLMs' capabilities in performing code analysis tasks.
arXiv Detail & Related papers (2023-10-18T22:02:43Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z)