Exploring the Curious Case of Code Prompts
- URL: http://arxiv.org/abs/2304.13250v1
- Date: Wed, 26 Apr 2023 02:37:52 GMT
- Title: Exploring the Curious Case of Code Prompts
- Authors: Li Zhang, Liam Dugan, Hainiu Xu, Chris Callison-Burch
- Abstract summary: We compare code and text prompts across three popular GPT models (davinci, code-davinci-002, and text-davinci-002) on a broader selection of tasks and find that, with few exceptions, code prompts do not consistently outperform text prompts.
We show that the style of code prompt has a large effect on performance for some but not all tasks and that fine-tuning on text instructions leads to better relative performance of code prompts.
- Score: 22.333434626182257
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent work has shown that prompting language models with code-like
representations of natural language leads to performance improvements on
structured reasoning tasks. However, such tasks comprise only a small subset of
all natural language tasks. In our work, we seek to answer whether or not
code-prompting is the preferred way of interacting with language models in
general. We compare code and text prompts across three popular GPT models
(davinci, code-davinci-002, and text-davinci-002) on a broader selection of
tasks (e.g., QA, sentiment, summarization) and find that with few exceptions,
code prompts do not consistently outperform text prompts. Furthermore, we show
that the style of code prompt has a large effect on performance for some but
not all tasks and that fine-tuning on text instructions leads to better
relative performance of code prompts.
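To make the comparison concrete, the following is a minimal sketch of how a single task instance might be rendered in the two prompt styles the paper contrasts. The task (sentiment classification of a movie review), the wording, and the variable names are illustrative assumptions, not the paper's actual prompts.

```python
# Illustrative sketch: the same sentiment-classification instance rendered as a
# plain-text prompt and as a code-style prompt. The task, wording, and variable
# names are assumptions for illustration, not the prompts used in the paper.

review = "The movie was slow at first, but the ending made it worth watching."

# Text prompt: a natural-language instruction followed by the input.
text_prompt = (
    "Classify the sentiment of the following movie review as positive or negative.\n"
    f"Review: {review}\n"
    "Sentiment:"
)

# Code prompt: the same task expressed as code-like scaffolding that the model
# is asked to complete (here, by filling in the value assigned to `sentiment`).
code_prompt = (
    "def classify_sentiment(review: str) -> str:\n"
    '    """Return "positive" or "negative" for the given movie review."""\n'
    f"    review = {review!r}\n"
    "    sentiment ="
)

if __name__ == "__main__":
    print("--- text prompt ---")
    print(text_prompt)
    print()
    print("--- code prompt ---")
    print(code_prompt)
```

Either string would then be sent as a completion prompt to one of the studied models (davinci, code-davinci-002, or text-davinci-002) and the continuation parsed for the label; per the abstract, the code-style framing helps only on a minority of tasks.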
Related papers
- NoviCode: Generating Programs from Natural Language Utterances by Novices [59.71218039095155]
We present NoviCode, a novel NL Programming task which takes as input an API and a natural language description by a novice non-programmer.
We show that NoviCode is indeed a challenging task in the code synthesis domain, and that generating complex code from non-technical instructions goes beyond the current Text-to-Code paradigm.
arXiv Detail & Related papers (2024-07-15T11:26:03Z)
- Code-Switched Language Identification is Harder Than You Think [69.63439391717691]
Code switching is a common phenomenon in written and spoken communication.
We look at the application of language identification to building CS corpora.
We make the task more realistic by scaling it to more languages.
We reformulate the task as a sentence-level multi-label tagging problem to make it more tractable.
arXiv Detail & Related papers (2024-02-02T15:38:47Z)
- Code Representation Pre-training with Complements from Program Executions [29.148208436656216]
We propose FuzzPretrain to explore the dynamic information of programs revealed by their test cases and embed it into the feature representations of code as complements.
FuzzPretrain yields more than 6%/9% mAP improvements on code search over its counterparts trained with only source code or ASTs.
arXiv Detail & Related papers (2023-09-04T01:57:22Z)
- Code Prompting: a Neural Symbolic Method for Complex Reasoning in Large Language Models [74.95486528482327]
We explore code prompting, a neural symbolic prompting method with both zero-shot and few-shot versions, which triggers code as intermediate reasoning steps.
We conduct experiments on 7 widely-used benchmarks involving symbolic reasoning and arithmetic reasoning.
arXiv Detail & Related papers (2023-05-29T15:14:09Z)
- Prompting with Pseudo-Code Instructions [12.166296720125187]
We create a dataset of pseudo-code prompts for 132 different tasks spanning classification, QA and generative language tasks.
Using these prompts along with their counterparts in natural language, we study their performance on two LLM families - BLOOM and CodeGen.
Our experiments show that using pseudo-code instructions leads to better results, with an average increase (absolute) of 7-16 points in F1 scores for classification tasks and an improvement (relative) of 12-38% in aggregate ROUGE-L scores across all tasks.
arXiv Detail & Related papers (2023-05-19T16:25:01Z)
- Python Code Generation by Asking Clarification Questions [57.63906360576212]
In this work, we introduce a novel and more realistic setup for the code generation task.
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
We collect and introduce a new dataset named CodeClarQA containing pairs of natural language descriptions and code, along with synthetically created clarification questions and answers.
arXiv Detail & Related papers (2022-12-19T22:08:36Z)
- Demystifying Prompts in Language Models via Perplexity Estimation [109.59105230163041]
The performance of a prompt is coupled with the extent to which the model is familiar with the language it contains.
We show that the lower the perplexity of a prompt, the better it performs the task.
arXiv Detail & Related papers (2022-12-08T02:21:47Z)
- Enhancing Semantic Code Search with Multimodal Contrastive Learning and Soft Data Augmentation [50.14232079160476]
We propose a new approach with multimodal contrastive learning and soft data augmentation for code search.
We conduct extensive experiments to evaluate the effectiveness of our approach on a large-scale dataset with six programming languages.
arXiv Detail & Related papers (2022-04-07T08:49:27Z)
- Can Machines Read Coding Manuals Yet? -- A Benchmark for Building Better Language Models for Code Understanding [3.98345038769576]
We derive a set of benchmarks that assess code understanding based on tasks such as predicting the best answer to a question in a forum post.
We evaluate the performance of current state-of-the-art language models on these tasks and show that fine-tuning yields a significant improvement on each task.
arXiv Detail & Related papers (2021-09-15T17:42:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.