Related papers: CoLadder: Supporting Programmers with Hierarchical Code Generation in Multi-Level Abstraction

URL: http://arxiv.org/abs/2310.08699v2
Date: Tue, 26 Dec 2023 14:13:29 GMT
Title: CoLadder: Supporting Programmers with Hierarchical Code Generation in Multi-Level Abstraction
Authors: Ryan Yen, Jiawen Zhu, Sangho Suh, Haijun Xia, Jian Zhao
Abstract summary: CoLadder is a system that supports programmers by facilitating hierarchical task decomposition, direct code segment manipulation, and result evaluation. A user study with 12 experienced programmers showed that CoLadder is effective in helping programmers externalize their problem-solving intentions flexibly.
Score: 16.325032481071997
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Programmers increasingly rely on Large Language Models (LLMs) for code generation. However, misalignment between programmers' goals and generated code complicates the code evaluation process and demands frequent switching between prompt authoring and code evaluation. Yet, current LLM-driven code assistants lack sufficient scaffolding to help programmers format intentions from their overarching goals, a crucial step before translating these intentions into natural language prompts. To address this gap, we adopted an iterative design process to gain insights into programmers' strategies when using LLMs for programming. Building on our findings, we created CoLadder, a system that supports programmers by facilitating hierarchical task decomposition, direct code segment manipulation, and result evaluation during prompt authoring. A user study with 12 experienced programmers showed that CoLadder is effective in helping programmers externalize their problem-solving intentions flexibly, improving their ability to evaluate and modify code across various abstraction levels, from goal to final code implementation.

Related papers

IFEvalCode: Controlled Code Generation [69.28317223249358]
The paper introduces forward and backward constraints generation to improve the instruction-following capabilities of Code LLMs. The authors present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages.
arXiv Detail & Related papers (2025-07-30T08:08:48Z)
Evaluating Programming Language Confusion [6.462594894731934]
Large Language Models for code (Code LLMs) have gained significant traction in software engineering. These models have demonstrated remarkable capabilities in understanding programming concepts, implementing algorithms, and even bridging different programming languages. Despite these advances, Code LLMs often struggle with programming language confusion--producing code in unintended languages.
arXiv Detail & Related papers (2025-03-17T18:14:15Z)
Code to Think, Think to Code: A Survey on Code-Enhanced Reasoning and Reasoning-Driven Code Intelligence in LLMs [53.00384299879513]
In large language models (LLMs), code and reasoning reinforce each other. Code provides verifiable execution paths, enforces logical decomposition, and enables runtime validation. We identify key challenges and propose future research directions to strengthen this synergy.
arXiv Detail & Related papers (2025-02-26T18:55:42Z)
Pragmatic Reasoning improves LLM Code Generation [35.78260347663757]
We propose CodeRSA, a novel code candidate reranking mechanism built upon the Rational Speech Act (RSA) framework. We evaluate CodeRSA using one of the latest Large Language Models on a popular code generation dataset.
arXiv Detail & Related papers (2025-02-20T12:44:26Z)
ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models [49.04652315815501]
Tool learning has emerged as a crucial capability for large language models (LLMs) to solve complex real-world tasks through interaction with external tools. We propose ToolCoder, a novel framework that reformulates tool learning as a code generation task.
arXiv Detail & Related papers (2025-02-17T03:42:28Z)
Oracular Programming: A Modular Foundation for Building LLM-Enabled Software [5.294604210205507]
Large Language Models have proved surprisingly effective at solving a wide range of tasks from just a handful of examples. Their lack of reliability and modularity limits their capacity to tackle large problems that require many steps of reasoning. We propose oracular programming, a foundational paradigm for building LLM-enabled applications that lets domain experts express high-level problem-solving strategies.
arXiv Detail & Related papers (2025-02-07T20:24:43Z)
A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement [24.25119206488625]
PairCoder is a novel framework for large language models (LLMs) to generate code. It incorporates two collaborative agents, namely a Navigator agent for high-level planning and a Driver agent for specific implementation. The Driver follows the guidance of Navigator to undertake initial code generation, code testing, and refinement.
arXiv Detail & Related papers (2024-09-08T07:22:19Z)
No Man is an Island: Towards Fully Automatic Programming by Code Search, Code Generation and Program Repair [9.562123938545522]
The proposed framework can integrate various code search, generation, and repair tools, combining these three research areas for the first time. We conduct preliminary experiments to demonstrate the potential of our framework, e.g., helping CodeLlama solve 267 programming problems with an improvement of 62.53%.
arXiv Detail & Related papers (2024-09-05T06:24:29Z)
Sifting through the Chaff: On Utilizing Execution Feedback for Ranking the Generated Code Candidates [46.74037090843497]
Large Language Models (LLMs) are transforming the way developers approach programming by automatically generating code based on natural language descriptions. This paper puts forward RankEF, an innovative approach for code ranking that leverages execution feedback. Experiments on three code generation benchmarks demonstrate that RankEF significantly outperforms the state-of-the-art CodeRanker.
arXiv Detail & Related papers (2024-08-26T01:48:57Z)
NoviCode: Generating Programs from Natural Language Utterances by Novices [59.71218039095155]
We present NoviCode, a novel NL Programming task which takes as input an API and a natural language description by a novice non-programmer. We show that NoviCode is indeed a challenging task in the code synthesis domain, and that generating complex code from non-technical instructions goes beyond the current Text-to-Code paradigm.
arXiv Detail & Related papers (2024-07-15T11:26:03Z)
MTLLM: LLMs are Meaning-Typed Code Constructs [7.749453456370407]
This paper presents a simplified approach to integrating large language models (LLMs) into programming. Our approach utilizes the semantic richness in existing programs to automatically translate between the traditional programming languages and the natural language. We present a fully functional and production-grade implementation for our approach and compare it to SOTA LLM software development tools.
arXiv Detail & Related papers (2024-05-14T21:12:01Z)
CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs. CodeGRAG builds a graphical view of code blocks based on their control flow and data flow to fill the gap between programming languages and natural language. Various experiments and ablations are done on four datasets covering both C++ and Python to validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective [85.48043537327258]
We propose MANGO (comMents As Natural loGic pivOts), including a comment contrastive training strategy and a corresponding logical comment decoding strategy. Results indicate that MANGO significantly improves the code pass rate over strong baselines. The robustness of the logical comment decoding strategy is notably higher than that of Chain-of-Thought prompting.
arXiv Detail & Related papers (2024-04-11T08:30:46Z)
If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code). Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
Function-constrained Program Synthesis [12.55507214959886]
Large language models (LLMs) can generate code in real time by drawing on all code available in a development environment. Current systems lack effective recovery methods, forcing users to iteratively re-prompt the model with modified prompts until a sufficient solution is reached. Our method constrains code generation to an explicit function set and enables recovery from failed attempts through automatically generated sub-functions.
arXiv Detail & Related papers (2023-11-27T02:55:34Z)
When Do Program-of-Thoughts Work for Reasoning? [51.2699797837818]
We propose complexity-impacted reasoning score (CIRS) to measure correlation between code and reasoning abilities. Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity. Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
arXiv Detail & Related papers (2023-08-29T17:22:39Z)
ReACC: A Retrieval-Augmented Code Completion Framework [53.49707123661763]
We propose a retrieval-augmented code completion framework, leveraging both lexical copying and referring to code with similar semantics by retrieval. We evaluate our approach on the code completion task in the Python and Java programming languages, achieving state-of-the-art performance on the CodeXGLUE benchmark.
arXiv Detail & Related papers (2022-03-15T08:25:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.