Fugu-MT Paper Translation (Abstract): Leveraging Test Driven Development with Large Language Models for Reliable and Verifiable Spreadsheet Code Generation: A Research Framework

Paper overview: Leveraging Test Driven Development with Large Language Models for Reliable and Verifiable Spreadsheet Code Generation: A Research Framework

arxiv url: http://arxiv.org/abs/2510.15585v1
Date: Fri, 17 Oct 2025 12:28:16 GMT
Status: Translation complete
Last updated in system: 2025-10-20 20:17:34.61577
Title: Leveraging Test Driven Development with Large Language Models for Reliable and Verifiable Spreadsheet Code Generation: A Research Framework
Title (reference translation): Leveraging Test-Driven Development with Large Language Models for Reliable and Verifiable Spreadsheet Code Generation: A Research Framework
Authors: Dr Simon Thorne, Dr Advait Sarkar
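Abstract summary: This paper proposes a structured research framework that integrates the proven software engineering practice of Test-Driven Development (TDD) with Large Language Model (LLM) driven generation. By emphasising test-driven thinking, it aims to improve computational thinking, prompt engineering skills, and user engagement.
Reference score (independently computed attention score): 0.0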
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs), such as ChatGPT, are increasingly leveraged for generating both traditional software code and spreadsheet logic. Despite their impressive generative capabilities, these models frequently exhibit critical issues such as hallucinations, subtle logical inconsistencies, and syntactic errors. These risks are particularly acute in high-stakes domains like financial modelling and scientific computation, where accuracy and reliability are paramount. This position paper proposes a structured research framework that integrates the proven software engineering practice of Test-Driven Development (TDD) with LLM-driven generation to enhance the correctness of, reliability of, and user confidence in generated outputs. We hypothesise that a "test first" methodology provides both technical constraints and cognitive scaffolding, guiding LLM outputs towards more accurate, verifiable, and comprehensible solutions. Our framework, applicable across diverse programming contexts, from spreadsheet formula generation to scripting languages such as Python and strongly typed languages like Rust, includes an explicitly outlined experimental design with clearly defined participant groups, evaluation metrics, and illustrative TDD-based prompting examples. By emphasising test-driven thinking, we aim to improve computational thinking, prompt engineering skills, and user engagement, particularly benefiting spreadsheet users who often lack formal programming training yet face serious consequences from logical errors. We invite collaboration to refine and empirically evaluate this approach, ultimately aiming to establish responsible and reliable LLM integration in both educational and professional development practices.
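Abstract (reference translation): Large language models (LLMs) such as ChatGPT are increasingly used to generate both traditional software code and spreadsheet logic. Despite their impressive generative capabilities, these models frequently exhibit critical problems such as hallucinations, subtle logical inconsistencies, and syntactic errors; these risks are especially acute in high-stakes domains such as financial modelling and scientific computation, where accuracy and reliability are paramount. This paper proposes a structured research framework that integrates the proven software engineering practice of Test-Driven Development (TDD) with LLM-driven generation to improve the correctness and reliability of generated outputs and users' confidence in them. We hypothesise that a "test first" methodology provides both technical constraints and cognitive scaffolding, guiding LLM outputs towards more accurate, verifiable, and comprehensible solutions. The framework applies across diverse programming contexts, from spreadsheet formula generation to scripting languages such as Python and strongly typed languages such as Rust. By emphasising test-driven thinking, we aim to improve computational thinking, prompt engineering skills, and user engagement, particularly for spreadsheet users who lack formal programming training yet face serious consequences from logical errors. We invite collaboration to refine and empirically evaluate this approach, with the ultimate aim of establishing responsible and reliable LLM integration in both educational and professional development practice.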
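
The "test first" idea described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the compound-interest task, the test values, and every name below (`TEST_CASES`, `run_tests`, and the two candidate functions) are hypothetical and chosen only for illustration. The two candidates are hand-written stand-ins for generated code, so no LLM is called.

```python
from typing import Callable, List, Tuple

# Hypothetical test cases, written BEFORE any code is generated.
# Each case maps (principal, annual_rate, years) to the expected future value.
TestCase = Tuple[Tuple[float, float, int], float]

TEST_CASES: List[TestCase] = [
    ((1000.0, 0.05, 0), 1000.0),   # zero years: value is unchanged
    ((1000.0, 0.0, 10), 1000.0),   # zero rate: value is unchanged
    ((1000.0, 0.05, 2), 1102.5),   # 1000 * 1.05 ** 2
]


def run_tests(
    candidate: Callable[[float, float, int], float],
    cases: List[TestCase],
    tol: float = 1e-6,
) -> List[str]:
    """Return one message per failing case; an empty list means all pass."""
    failures = []
    for args, expected in cases:
        actual = candidate(*args)
        if abs(actual - expected) > tol:
            failures.append(f"{args}: expected {expected}, got {actual}")
    return failures


# Stand-in for a plausible but wrong generated answer (simple interest).
def candidate_simple_interest(principal: float, rate: float, years: int) -> float:
    return principal * (1 + rate * years)


# Stand-in for a corrected answer (compound interest).
def candidate_compound(principal: float, rate: float, years: int) -> float:
    return principal * (1 + rate) ** years


if __name__ == "__main__":
    for fn in (candidate_simple_interest, candidate_compound):
        failures = run_tests(fn, TEST_CASES)
        print(fn.__name__, "PASS" if not failures else failures)
```

In a workflow of this kind, the failure messages for the first candidate could be fed back into the next prompt, so the tests serve both as a constraint on the output and as a check the user can read.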

Related papers