Fugu-MT Paper Translation (Abstract): Evaluating LLM-Based Goal Extraction in Requirements Engineering: Prompting Strategies and Their Limitations

Paper overview: Evaluating LLM-Based Goal Extraction in Requirements Engineering: Prompting Strategies and Their Limitations

arxiv url: http://arxiv.org/abs/2604.22207v1
Date: Fri, 24 Apr 2026 04:22:17 GMT
Status: Translation complete
Last updated in system: 2026-04-27 15:36:26.337928
Title: Evaluating LLM-Based Goal Extraction in Requirements Engineering: Prompting Strategies and Their Limitations
Title (reference translation): Evaluating LLM-Based Goal Extraction in Requirements Engineering: Prompting Strategies and Their Limitations
Authors: Anna Arnaudo, Riccardo Coppola, Maurizio Morisio, Flavio Giobergia, Andrea Bioddo, Angelo Bongiorno, Luca Dadone
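Abstract summary: This paper discusses an approach to automating the Goal-Oriented Requirements Engineering (GORE) process by extracting functional goals from software documentation. To implement these functionalities, a chain of large language models fed with engineered prompts is proposed. Although the pipeline achieved 61% accuracy in low-level goal identification, the final stage, these results indicate that the approach is best suited as a tool to accelerate manual extraction.
Reference score (independently computed attention score): 4.451267761568192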
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Due to the textual and repetitive nature of many Requirements Engineering (RE) artefacts, Large Language Models (LLMs) have proven useful to automate their generation and processing. In this paper, we discuss a possible approach for automating the Goal-Oriented Requirements Engineering (GORE) process by extracting functional goals from software documentation through three phases: actor identification, high-level goal extraction, and low-level goal extraction. To implement these functionalities, we propose a chain of LLMs fed with engineered prompts. We experimented with different variants of in-context learning and measured the similarities between input data and in-context examples to better investigate their impact. Another key element is the generation-critic mechanism, implemented as a feedback loop involving two LLMs. Although the pipeline achieved 61% accuracy in low-level goal identification, the final stage, these results indicate the approach is best suited as a tool to accelerate manual extraction rather than as a full replacement. The feedback-loop mechanism with Zero-shot outperformed stand-alone Few-shot, with an ablation study suggesting that performance slightly degrades without the feedback cycle. However, we report that the combination of the feedback mechanism with Few-shot does not deliver any advantage, possibly suggesting that the primary performance ceiling is the prompting strategy applied to the 'critic' LLM. Together with the refinement of both the quantity and quality of the Shot examples, future research will integrate Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) prompting to improve accuracy.
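Abstract (reference translation): Because many Requirements Engineering (RE) artefacts are textual and repetitive, Large Language Models (LLMs) have proven useful for automating their generation and processing. This paper discusses an approach to automating the Goal-Oriented Requirements Engineering (GORE) process by extracting functional goals from software documentation in three phases: actor identification, high-level goal extraction, and low-level goal extraction. To implement these functionalities, we propose a chain of LLMs fed with engineered prompts. We experimented with different variants of in-context learning and measured the similarity between input data and in-context examples to better investigate their impact. Another key element is the generation-critic mechanism, implemented as a feedback loop involving two LLMs. Although the pipeline achieved 61% accuracy in low-level goal identification, the final stage, these results indicate that the approach is best suited as a tool to speed up manual extraction rather than as a full replacement. The feedback-loop mechanism with Zero-shot outperformed stand-alone Few-shot, and an ablation study suggests that performance degrades slightly without the feedback cycle. However, combining the feedback mechanism with Few-shot yielded no advantage, which may suggest that the main performance ceiling is the prompting strategy applied to the 'critic' LLM. Along with refining both the quantity and quality of the shot examples, future research will integrate Retrieval-Augmented Generation (RAG) and Chain-of-Thought (CoT) prompting to improve accuracy.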
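
The generation-critic feedback loop mentioned in the abstract can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not describe prompts, stopping rules, or model interfaces, so the function names, the `max_rounds` limit, and the stub "LLMs" below are all hypothetical and stand in for real model calls.

```python
from typing import Callable, List, Optional

# Hypothetical interfaces: each "LLM" is modelled as a plain Python function.
# Generator: (document, feedback or None) -> list of extracted goals
# Critic:    (document, goals) -> feedback text, or None when it accepts the goals
Generator = Callable[[str, Optional[str]], List[str]]
Critic = Callable[[str, List[str]], Optional[str]]


def extract_with_feedback(
    document: str,
    generate: Generator,
    critique: Critic,
    max_rounds: int = 3,  # assumed cap; the abstract gives no stopping rule
) -> List[str]:
    """Alternate generation and critique until the critic accepts or rounds run out."""
    feedback: Optional[str] = None
    goals: List[str] = []
    for _ in range(max_rounds):
        goals = generate(document, feedback)
        feedback = critique(document, goals)
        if feedback is None:  # critic has no further objections
            break
    return goals


# Stub stand-ins for the two models, for illustration only.
def stub_generate(document: str, feedback: Optional[str]) -> List[str]:
    goals = ["User logs in"]
    if feedback is not None:  # revise the answer once feedback arrives
        goals.append("User resets password")
    return goals


def stub_critique(document: str, goals: List[str]) -> Optional[str]:
    if len(goals) < 2:
        return "A goal mentioned in the document seems to be missing."
    return None


if __name__ == "__main__":
    print(extract_with_feedback("sample documentation", stub_generate, stub_critique))
```

In a real pipeline the two stubs would wrap calls to separate models, and the generator's prompt could be Zero-shot or Few-shot, which is the distinction the abstract compares.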


Related papers