Think Like an Engineer: A Neuro-Symbolic Collaboration Agent for Generative Software Requirements Elicitation and Self-Review
- URL: http://arxiv.org/abs/2507.14969v1
- Date: Sun, 20 Jul 2025 13:59:00 GMT
- Title: Think Like an Engineer: A Neuro-Symbolic Collaboration Agent for Generative Software Requirements Elicitation and Self-Review
- Authors: Sai Zhang, Zhenchang Xing, Jieshan Chen, Dehai Zhao, Zizhong Zhu, Xiaowang Zhang, Zhiyong Feng, Xiaohong Li
- Abstract summary: This paper introduces RequireCEG, a requirement elicitation and self-review agent that embeds causal-effect graphs (CEGs) in a neuro-symbolic collaboration architecture. To evaluate our method, we created the RGPair benchmark dataset and conducted extensive experiments.
- Score: 23.26988707110507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The vision of End-User Software Engineering (EUSE) is to empower non-professional users with full control over the software development lifecycle. It aims to enable users to drive generative software development using only natural language requirements. However, since end-users often lack knowledge of software engineering, their requirement descriptions are frequently ambiguous, posing significant challenges to generative software development. Although existing approaches utilize structured languages like Gherkin to clarify user narratives, they still struggle to express the causal logic between preconditions and behavior actions. This paper introduces RequireCEG, a requirement elicitation and self-review agent that embeds causal-effect graphs (CEGs) in a neuro-symbolic collaboration architecture. RequireCEG first uses a feature tree to analyze user narratives hierarchically, clearly defining the scope of software components and their system behavior requirements. Next, it constructs self-healing CEGs based on the elicited requirements, capturing the causal relationships between atomic preconditions and behavioral actions. Finally, the constructed CEGs are used to review and optimize Gherkin scenarios, ensuring consistency between the generated Gherkin requirements and the system behavior requirements elicited from user narratives. To evaluate our method, we created the RGPair benchmark dataset and conducted extensive experiments. RequireCEG achieves an 87% coverage rate and raises diversity by 51.88%.
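The review step described in the abstract, checking Gherkin scenarios against a graph of preconditions and actions, can be illustrated with a minimal sketch. This is not the paper's implementation: the `CEG` class, the `review_scenario` helper, the tuple-based rule encoding, and the shopping-cart names are all hypothetical, chosen only to show how a cause-effect graph could flag a scenario whose `Then` step disagrees with the elicited behavior.

```python
from dataclasses import dataclass, field


@dataclass
class CEG:
    """Toy cause-effect graph (hypothetical encoding, not from the paper).

    Each effect maps to a boolean expression over atomic preconditions,
    written as nested tuples such as ("and", "a", ("not", "b")).
    """
    rules: dict = field(default_factory=dict)

    def _eval(self, expr, facts):
        if isinstance(expr, str):  # leaf: an atomic precondition
            return facts.get(expr, False)
        op, *args = expr
        if op == "and":
            return all(self._eval(a, facts) for a in args)
        if op == "or":
            return any(self._eval(a, facts) for a in args)
        if op == "not":
            return not self._eval(args[0], facts)
        raise ValueError(f"unknown operator: {op}")

    def effects(self, facts):
        """Return the set of effects triggered by the given precondition values."""
        return {e for e, expr in self.rules.items() if self._eval(expr, facts)}


def review_scenario(ceg, given, then):
    """Compare a scenario's Then-actions with the effects the graph predicts."""
    predicted = ceg.effects(given)
    return {
        "missing": predicted - set(then),      # predicted but absent from the scenario
        "unsupported": set(then) - predicted,  # asserted but not implied by the graph
    }


# Hypothetical requirement: checkout is shown only to logged-in users with a non-empty cart.
ceg = CEG(rules={
    "show_checkout": ("and", "logged_in", ("not", "cart_empty")),
    "show_login_prompt": ("not", "logged_in"),
})

# A scenario whose Then step contradicts its Given steps.
scenario = {
    "given": {"logged_in": True, "cart_empty": False},
    "then": ["show_login_prompt"],
}
report = review_scenario(ceg, scenario["given"], scenario["then"])
print(report)
```

In this sketch the report lists `show_checkout` as missing and `show_login_prompt` as unsupported, which is the kind of inconsistency a CEG-based review is meant to surface before the scenario is used for code generation.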
Related papers
- Assemble Your Crew: Automatic Multi-agent Communication Topology Design via Autoregressive Graph Generation [72.44384066166147]
Multi-agent systems (MAS) based on large language models (LLMs) have emerged as a powerful solution for dealing with complex problems across diverse domains. Existing approaches are fundamentally constrained by their reliance on a template graph modification paradigm with a predefined set of agents and hard-coded interaction structures. We propose ARG-Designer, a novel autoregressive model that operationalizes this paradigm by constructing the collaboration graph from scratch.
arXiv Detail & Related papers (2025-07-24T09:17:41Z) - State and Memory is All You Need for Robust and Reliable AI Agents [29.259008600842517]
Large language models (LLMs) have enabled powerful advances in natural language understanding and generation. Yet their application to complex, real-world scientific remain limited by challenges in memory, planning, and tool integration. Here, we introduce SciBORG, a modular agentic framework that allows LLM-based agents to autonomously plan, reason, and achieve robust and reliable domain-specific task execution.
arXiv Detail & Related papers (2025-06-30T02:02:35Z) - Creating General User Models from Computer Use [62.91116265732001]
This paper presents an architecture for a general user model (GUM) that learns about you by observing any interaction you have with your computer. The GUM takes as input any unstructured observation of a user (e.g., device screenshots) and constructs confidence-weighted propositions that capture user knowledge and preferences.
arXiv Detail & Related papers (2025-05-16T04:00:31Z) - Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs [63.10710876536337]
We propose an offline simulation framework to curate a software-specific skillset, a collection of verified scripts. Our framework comprises two components: (1) task creation, using top-down functionality and bottom-up API synergy exploration to generate helpful tasks. Experiments with Adobe Illustrator demonstrate that our framework significantly improves automation success rates, reduces response time, and saves runtime token costs.
arXiv Detail & Related papers (2025-04-29T04:03:37Z) - Bridging LLM-Generated Code and Requirements: Reverse Generation technique and SBC Metric for Developer Insights [0.0]
This paper introduces a novel scoring mechanism called the SBC score. It is based on a reverse generation technique that leverages the natural language generation capabilities of Large Language Models. Unlike direct code analysis, our approach reconstructs system requirements from AI-generated code and compares them with the original specifications.
arXiv Detail & Related papers (2025-02-11T01:12:11Z) - Gensors: Authoring Personalized Visual Sensors with Multimodal Foundation Models and Reasoning [61.17099595835263]
Gensors is a system that empowers users to define customized sensors supported by the reasoning capabilities of MLLMs. In a user study, participants reported significantly greater sense of control, understanding, and ease of communication when defining sensors using Gensors.
arXiv Detail & Related papers (2025-01-27T01:47:57Z) - Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? [60.84912551069379]
We present the Code-Development Benchmark (Codev-Bench), a fine-grained, real-world, repository-level, and developer-centric evaluation framework.
Codev-Agent is an agent-based system that automates repository crawling, constructs execution environments, extracts dynamic calling chains from existing unit tests, and generates new test samples to avoid data leakage.
arXiv Detail & Related papers (2024-10-02T09:11:10Z) - Empowering Agile-Based Generative Software Development through Human-AI Teamwork [24.743864861980803]
We propose AgileGen, an agile-based generative software development approach through human-AI teamwork.
A memory pool mechanism is used to collect user decision-making scenarios and recommend them to new users.
arXiv Detail & Related papers (2024-07-22T11:54:44Z) - CoGS: Causality Constrained Counterfactual Explanations using goal-directed ASP [1.5749416770494706]
We present the CoGS (Counterfactual Generation with s(CASP)) framework to generate counterfactuals from rule-based machine learning models.
CoGS computes realistic and causally consistent changes to attribute values taking causal dependencies between them into account.
It finds a path from an undesired outcome to a desired one using counterfactuals.
arXiv Detail & Related papers (2024-07-11T04:50:51Z) - Large Language Models for Power Scheduling: A User-Centric Approach [6.335540414370735]
We introduce a novel architecture for resource scheduling problems by converting an arbitrary user's voice request (VRQ) into a resource allocation vector.
Specifically, we design an LLM intent recognition agent to translate the request into an optimization problem (OP), an LLM OP parameter identification agent, and an OP solving agent.
arXiv Detail & Related papers (2024-06-29T15:47:28Z) - Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents [110.25679611755962]
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.
We introduce Intention-in-Interaction (IN3), a novel benchmark designed to inspect users' implicit intentions through explicit queries.
We empirically train Mistral-Interact, a powerful model that proactively assesses task vagueness, inquires user intentions, and refines them into actionable goals.
arXiv Detail & Related papers (2024-02-14T14:36:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.