Related papers: Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants

Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants

URL: http://arxiv.org/abs/2602.00180v1
Date: Fri, 30 Jan 2026 04:45:42 GMT
Title: Spec-Driven Development: From Code to Contract in the Age of AI Coding Assistants
Authors: Deepak Babu Piskala
Abstract summary: Spec-driven development (SDD) treats specifications as the source of truth and code as a generated or verified secondary artifact. We present three levels of specification rigor (spec-first, spec-anchored, and spec-as-source) with clear guidance on when each applies.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rise of AI coding assistants has reignited interest in an old idea: what if specifications, not code, were the primary artifact of software development? Spec-driven development (SDD) inverts the traditional workflow by treating specifications as the source of truth and code as a generated or verified secondary artifact. This paper provides practitioners with a comprehensive guide to SDD, covering its principles, workflow patterns, and supporting tools. We present three levels of specification rigor (spec-first, spec-anchored, and spec-as-source) with clear guidance on when each applies. Through analysis of tools ranging from Behavior-Driven Development frameworks to modern AI-assisted toolkits like GitHub Spec Kit, we demonstrate how the spec-first philosophy maps to real implementations. We present case studies from API development, enterprise systems, and embedded software, illustrating how different domains apply SDD. We conclude with a decision framework helping practitioners determine when SDD provides value and when simpler approaches suffice.

Related papers

Guidelines to Prompt Large Language Models for Code Generation: An Empirical Characterization [82.29178197694819]
We derive and evaluate development-specific prompt optimization guidelines. We use an iterative, test-driven approach to automatically refine code generation prompts. We conduct an assessment with 50 practitioners, who report their usage of the elicited prompt improvement patterns.
arXiv Detail & Related papers (2026-01-19T15:01:42Z)
ABC-Bench: Benchmarking Agentic Backend Coding in Real-World Development [72.4729759618632]
We introduce ABC-Bench, a benchmark to evaluate agentic backend coding within a realistic, executable workflow. We curated 224 practical tasks spanning 8 languages and 19 frameworks from open-source repositories. Our evaluation reveals that even state-of-the-art models struggle to deliver reliable performance on these holistic tasks.
arXiv Detail & Related papers (2026-01-16T08:23:52Z)
An Empirical Study of Developer-Provided Context for AI Coding Assistants in Open-Source Projects [2.392035679895744]
This paper presents a large-scale empirical study to characterize the emerging form of developer-provided context. We developed a comprehensive taxonomy of project context that developers consider essential, organized into five high-level themes. Our study also explores how this context varies across different project types and programming languages.
arXiv Detail & Related papers (2025-12-21T23:51:02Z)
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence [150.3696990310269]
Large language models (LLMs) have transformed automated software development by enabling direct translation of natural language descriptions into functional code. We provide a comprehensive synthesis and practical guide (a series of analytic and probing experiments) about code LLMs. We analyze the code capability of the general LLMs (GPT-4, Claude, LLaMA) and code-specialized LLMs (StarCoder, Code LLaMA, DeepSeek-Coder, and QwenCoder).
arXiv Detail & Related papers (2025-11-23T17:09:34Z)
Context Engineering for AI Agents in Open-Source Software [13.236926479239754]
GenAI-based coding assistants have disrupted software development. Their next generation is agent-based, operating with more autonomy and potentially without human oversight. One challenge is to provide AI agents with sufficient context about the software projects they operate in.
arXiv Detail & Related papers (2025-10-24T12:55:48Z)
A Survey on Code Generation with LLM-based Agents [61.474191493322415]
Code generation agents powered by large language models (LLMs) are revolutionizing the software development paradigm. LLMs are characterized by three core features. This paper presents a systematic survey of the field of LLM-based code generation agents.
arXiv Detail & Related papers (2025-07-31T18:17:36Z)
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework. Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z)
CodeRAG-Bench: Can Retrieval Augment Code Generation? [78.37076502395699]
We conduct a systematic, large-scale analysis of code generation using retrieval-augmented generation. We first curate a comprehensive evaluation benchmark, CodeRAG-Bench, encompassing three categories of code generation tasks. We examine top-performing models on CodeRAG-Bench by providing contexts retrieved from one or multiple sources.
arXiv Detail & Related papers (2024-06-20T16:59:52Z)

This list is automatically generated from the titles and abstracts of the papers in this site.