Can Programming Languages Boost Each Other via Instruction Tuning?
- URL: http://arxiv.org/abs/2308.16824v2
- Date: Sun, 3 Sep 2023 08:30:29 GMT
- Title: Can Programming Languages Boost Each Other via Instruction Tuning?
- Authors: Daoguang Zan, Ailun Yu, Bo Shen, Jiaxin Zhang, Taihong Chen, Bing
Geng, Bei Chen, Jichuan Ji, Yafen Yao, Yongji Wang, Qianxiang Wang
- Abstract summary: We conduct experiments on 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) on StarCoder.
Results demonstrate that programming languages can significantly improve each other.
- Score: 31.22288649229532
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Once human programmers have mastered a programming language, it
becomes easier for them to learn a new one. In this report, we focus on
exploring whether programming languages can boost each other during the
instruction fine-tuning phase of code large language models. We conduct
extensive experiments on 8 popular programming languages (Python, JavaScript,
TypeScript, C, C++, Java, Go, HTML) on StarCoder. Results demonstrate that
programming languages can significantly improve each other. For example,
CodeM-Python 15B trained on Python is able to improve Java by an absolute
17.95% pass@1 on HumanEval-X. More surprisingly, we found that CodeM-HTML 7B
trained on the HTML corpus can improve Java by an absolute 15.24% pass@1. Our
training data is released at https://github.com/NL2Code/CodeM.
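
The pass@1 numbers above follow the metric popularized by HumanEval. Below is a minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021); it is illustrative and not code from this paper.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n generations is correct, given that c of the n
    generations pass the unit tests."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples per problem, 23 of which pass, reported as pass@1.
print(pass_at_k(n=200, c=23, k=1))  # 0.115, i.e. c / n when k = 1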
Related papers
- CodeGRAG: Extracting Composed Syntax Graphs for Retrieval Augmented Cross-Lingual Code Generation [60.799992690487336]
We propose Syntax Graph Retrieval Augmented Code Generation (CodeGRAG) to enhance the performance of LLMs in single-round code generation tasks.
CodeGRAG significantly improves the code generation ability of LLMs and can even offer performance gain for cross-lingual code generation.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation [14.831115535710692]
We implement the first AI-oriented grammar for Python, named Simple Python (SimPy).
Compared with original Python, SimPy reduces token usage by 13.5% for CodeLlama and 10.4% for GPT-4, while achieving performance equivalent to, or even better than, models trained on the original Python code.
arXiv Detail & Related papers (2024-04-25T04:46:02Z)
- SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation [1.7183449183902841]
We introduce SteloCoder, a decoder-only StarCoder-based system for language-to-Python code translation.
SteloCoder achieves C++, C#, JavaScript, Java, or PHP-to-Python code translation without specifying the input programming language.
With experiments on XLCoST, SteloCoder achieves an average CodeBLEU score of 73.76 on multi-programming-language-to-Python translation.
arXiv Detail & Related papers (2023-10-24T06:04:28Z)
- Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning [84.12154024070024]
We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks.
Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge.
A Python interpreter then executes the generated code and prints the output (an illustrative sketch of such a generated program appears after this list).
arXiv Detail & Related papers (2023-09-19T17:54:21Z)
- CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X [50.008474888951525]
We introduce CodeGeeX, a multilingual model with 13 billion parameters for code generation.
CodeGeeX is pre-trained on 850 billion tokens of 23 programming languages.
arXiv Detail & Related papers (2023-03-30T17:34:01Z)
- A Scalable and Extensible Approach to Benchmarking NL2Code for 18 Programming Languages [1.6312827172331896]
We propose MultiPL-E, the first multi-language parallel benchmark for natural-language-to-code generation.
We evaluate two state-of-the-art code generation models on MultiPL-E: Codex and InCoder.
The range of programming languages represented in MultiPL-E allows us to explore the impact of language frequency and language features on model performance.
arXiv Detail & Related papers (2022-08-17T11:16:52Z)
- Natural Language to Code Translation with Execution [82.52142893010563]
We propose execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks (a minimal sketch of this selection procedure appears after this list).
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
- Lyra: A Benchmark for Turducken-Style Code Generation [15.810088578588028]
In software development, one programming language is often embedded in another.
This paper defines a new code generation task: given a natural language comment, generate a program in a base language with an embedded language.
To our knowledge, this is the first turducken-style code generation task (an illustrative example appears after this list).
arXiv Detail & Related papers (2021-08-27T07:22:55Z)
- AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z)
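
For concreteness, here is a hypothetical example of the kind of Python program a natural language embedded program (NLEP) might generate: structured knowledge lives in plain data structures with natural-language fields, a function computes over them, and the interpreter prints the answer. The data and names are invented for illustration, not taken from the paper.

# Structured knowledge with natural-language fields.
presidents = [
    {"name": "George Washington", "took_office": 1789},
    {"name": "John Adams", "took_office": 1797},
    {"name": "Thomas Jefferson", "took_office": 1801},
]

def first_president_after(year: int) -> str:
    """Return the first president who took office strictly after `year`."""
    later = [p for p in presidents if p["took_office"] > year]
    return min(later, key=lambda p: p["took_office"])["name"]

print(first_president_after(1790))  # prints: John Adams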
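For "Natural Language to Code Translation with Execution", here is a minimal sketch of execution result-based minimum Bayes risk selection: sampled programs are run on shared test inputs, and the program whose outputs agree with the most other samples is chosen. The run_program helper is a hypothetical sandboxed executor, and the paper's exact risk function may differ.

from collections import Counter

def run_program(program: str, test_input: str) -> str:
    """Hypothetical helper: execute `program` on `test_input` in a
    sandbox and return its captured output."""
    raise NotImplementedError

def mbr_exec_select(programs: list, test_inputs: list) -> str:
    """Pick the sampled program whose execution results agree with the
    most other samples (minimum Bayes risk, execution-match loss)."""
    # A program's signature is its tuple of outputs on all test inputs.
    signatures = [tuple(run_program(p, x) for x in test_inputs)
                  for p in programs]
    counts = Counter(signatures)
    # The most frequent signature minimizes expected disagreement.
    best = max(range(len(programs)), key=lambda i: counts[signatures[i]])
    return programs[best]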
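For Lyra, an illustrative turducken-style target: Python as the base language with an embedded SQL query, the pairing the benchmark focuses on. The table schema and function name are invented for illustration.

import sqlite3

def top_customers(db_path: str, limit: int = 5) -> list:
    """NL comment: return the `limit` customers with the highest total spend."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        # The embedded language: SQL inside the Python host program.
        "SELECT c.name, SUM(o.amount) AS total "
        "FROM orders o JOIN customers c ON o.customer_id = c.customer_id "
        "GROUP BY c.name ORDER BY total DESC LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return rows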