Can Programming Languages Boost Each Other via Instruction Tuning?
- URL: http://arxiv.org/abs/2308.16824v2
- Date: Sun, 3 Sep 2023 08:30:29 GMT
- Title: Can Programming Languages Boost Each Other via Instruction Tuning?
- Authors: Daoguang Zan, Ailun Yu, Bo Shen, Jiaxin Zhang, Taihong Chen, Bing
Geng, Bei Chen, Jichuan Ji, Yafen Yao, Yongji Wang, Qianxiang Wang
- Abstract summary: We conduct experiments on 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java, Go, HTML) on StarCoder.
Results demonstrate that programming languages can significantly improve each other.
- Score: 31.22288649229532
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Once human programmers have mastered a programming language, it
becomes easier for them to learn a new one. In this report, we focus on
exploring whether programming languages can boost each other during the
instruction fine-tuning phase of code large language models. We conduct
extensive experiments on 8 popular programming languages (Python, JavaScript,
TypeScript, C, C++, Java, Go, HTML) on StarCoder. Results demonstrate that
programming languages can significantly improve each other. For example,
CodeM-Python 15B trained on Python is able to improve Java by an absolute
17.95% pass@1 on HumanEval-X. More surprisingly, we found that CodeM-HTML 7B
trained on the HTML corpus can improve Java by an absolute 15.24% pass@1. Our
training data is released at https://github.com/NL2Code/CodeM.
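
The pass@1 numbers above follow the metric popularized by HumanEval. Below is a minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021); it is illustrative and not code from this paper.

import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples
    drawn from n generations is correct, given that c of the n
    generations pass the unit tests."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples per problem, 23 of which pass, reported as pass@1.
print(pass_at_k(n=200, c=23, k=1))  # 0.115, i.e. c / n when k = 1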
Related papers
- CodeGRAG: Extracting Composed Syntax Graphs for Retrieval Augmented Cross-Lingual Code Generation [60.799992690487336]
We propose Syntax Graph Retrieval Augmented Code Generation (CodeGRAG) to enhance the performance of LLMs in single-round code generation tasks.
CodeGRAG significantly improves the code generation ability of LLMs and can even offer performance gain for cross-lingual code generation.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- AI Coders Are Among Us: Rethinking Programming Language Grammar Towards Efficient Code Generation [14.831115535710692]
We implement the first AI-oriented grammar for Python, named Simple Python (SimPy).
Compared with original Python, SimPy reduces token usage by 13.5% for CodeLlama and 10.4% for GPT-4, while achieving performance equivalent to, or even better than, models trained on the original Python code.
arXiv Detail & Related papers (2024-04-25T04:46:02Z)
- SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation [1.7183449183902841]
We introduce SteloCoder, a decoder-only StarCoder-based system for language-to-Python code translation.
SteloCoder achieves C++, C#, JavaScript, Java, or PHP-to-Python code translation without specifying the input programming language.
With experiments on XLCoST, SteloCoder achieves an average CodeBLEU score of 73.76 on multi-programming-language-to-Python translation.
arXiv Detail & Related papers (2023-10-24T06:04:28Z)
- Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning [84.12154024070024]
We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks.
Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge.
A Python interpreter then executes the generated code and prints the output (an illustrative sketch of such a generated program appears after this list).
arXiv Detail & Related papers (2023-09-19T17:54:21Z)
- CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Benchmarking on HumanEval-X [50.008474888951525]
We introduce CodeGeeX, a multilingual model with 13 billion parameters for code generation.
CodeGeeX is pre-trained on 850 billion tokens of 23 programming languages.
arXiv Detail & Related papers (2023-03-30T17:34:01Z)
- A Scalable and Extensible Approach to Benchmarking NL2Code for 18 Programming Languages [1.6312827172331896]
We propose MultiPL-E, the first multi-language parallel benchmark for natural-language-to-code generation.
We evaluate two state-of-the-art code generation models on MultiPL-E: Codex and InCoder.
The range of programming languages represented in MultiPL-E allows us to explore the impact of language frequency and language features on model performance.
arXiv Detail & Related papers (2022-08-17T11:16:52Z)
- Natural Language to Code Translation with Execution [82.52142893010563]
We propose execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks (a minimal sketch of this selection procedure appears after this list).
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
- Lyra: A Benchmark for Turducken-Style Code Generation [15.810088578588028]
In software development, one programming language is often embedded in another.
This paper defines a new code generation task: given a natural language comment, generate a program in a base language with an embedded language.
To our knowledge, this is the first turducken-style code generation task (an illustrative example appears after this list).
arXiv Detail & Related papers (2021-08-27T07:22:55Z)
- AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z)
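
For concreteness, here is a hypothetical example of the kind of Python program a natural language embedded program (NLEP) might generate: structured knowledge lives in plain data structures with natural-language fields, a function computes over them, and the interpreter prints the answer. The data and names are invented for illustration, not taken from the paper.

# Structured knowledge with natural-language fields.
presidents = [
    {"name": "George Washington", "took_office": 1789},
    {"name": "John Adams", "took_office": 1797},
    {"name": "Thomas Jefferson", "took_office": 1801},
]

def first_president_after(year: int) -> str:
    """Return the first president who took office strictly after `year`."""
    later = [p for p in presidents if p["took_office"] > year]
    return min(later, key=lambda p: p["took_office"])["name"]

print(first_president_after(1790))  # prints: John Adams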
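For "Natural Language to Code Translation with Execution", here is a minimal sketch of execution result-based minimum Bayes risk selection: sampled programs are run on shared test inputs, and the program whose outputs agree with the most other samples is chosen. The run_program helper is a hypothetical sandboxed executor, and the paper's exact risk function may differ.

from collections import Counter

def run_program(program: str, test_input: str) -> str:
    """Hypothetical helper: execute `program` on `test_input` in a
    sandbox and return its captured output."""
    raise NotImplementedError

def mbr_exec_select(programs: list, test_inputs: list) -> str:
    """Pick the sampled program whose execution results agree with the
    most other samples (minimum Bayes risk, execution-match loss)."""
    # A program's signature is its tuple of outputs on all test inputs.
    signatures = [tuple(run_program(p, x) for x in test_inputs)
                  for p in programs]
    counts = Counter(signatures)
    # The most frequent signature minimizes expected disagreement.
    best = max(range(len(programs)), key=lambda i: counts[signatures[i]])
    return programs[best]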
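For Lyra, an illustrative turducken-style target: Python as the base language with an embedded SQL query, the pairing the benchmark focuses on. The table schema and function name are invented for illustration.

import sqlite3

def top_customers(db_path: str, limit: int = 5) -> list:
    """NL comment: return the `limit` customers with the highest total spend."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        # The embedded language: SQL inside the Python host program.
        "SELECT c.name, SUM(o.amount) AS total "
        "FROM orders o JOIN customers c ON o.customer_id = c.customer_id "
        "GROUP BY c.name ORDER BY total DESC LIMIT ?",
        (limit,),
    ).fetchall()
    conn.close()
    return rows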