LLM-based agents for automating the enhancement of user story quality: An early report
- URL: http://arxiv.org/abs/2403.09442v1
- Date: Thu, 14 Mar 2024 14:35:53 GMT
- Title: LLM-based agents for automating the enhancement of user story quality: An early report
- Authors: Zheying Zhang, Maruf Rayhan, Tomas Herda, Manuel Goisauf, Pekka Abrahamsson
- Abstract summary: This study explores the use of large language models to improve user story quality in Austrian Post Group IT agile teams.
We developed a reference model for an Autonomous LLM-based Agent System and implemented it at the company.
The quality of the user stories and the effectiveness of these agents in improving it were assessed by 11 participants across six agile teams.
- Score: 2.856781525749652
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In agile software development, maintaining high-quality user stories is crucial but also challenging. This study explores the use of large language models to automatically improve user story quality in Austrian Post Group IT agile teams. We developed a reference model for an Autonomous LLM-based Agent System and implemented it at the company. The quality of the user stories and the effectiveness of these agents in improving it were assessed by 11 participants across six agile teams. Our findings demonstrate the potential of LLMs to improve user story quality, contributing to research on the role of AI in agile development and providing a practical example of AI's transformative impact in an industry setting.
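The paper does not include an implementation, but the critique-and-rewrite behavior its abstract describes can be illustrated with a minimal sketch. Everything below is an assumption for illustration, not the authors' reference model: the INVEST-style criteria prompt, the loop structure, and the `complete` callable standing in for an LLM API are all hypothetical.

```python
# Minimal sketch of an agent that critiques and rewrites a user story.
# Hypothetical: the prompts, criteria, and `complete` callable are
# illustrative assumptions, not the paper's reference model.
from typing import Callable

INVEST = "independent, negotiable, valuable, estimable, small, testable"

def improve_user_story(story: str, complete: Callable[[str], str],
                       max_rounds: int = 3) -> str:
    """Critique a user story and rewrite it until the critic is satisfied."""
    for _ in range(max_rounds):
        critique = complete(
            f"Review this user story against the INVEST criteria ({INVEST}). "
            f"Reply OK if it satisfies all of them; otherwise list the issues.\n\n{story}"
        )
        if critique.strip().upper().startswith("OK"):
            break  # critic found no remaining quality issues
        story = complete(
            "Rewrite this user story to fix the listed issues.\n\n"
            f"Story: {story}\n\nIssues: {critique}"
        )
    return story

if __name__ == "__main__":
    # Stub model so the sketch runs offline; replace with a real LLM client.
    def stub_llm(prompt: str) -> str:
        return "OK" if prompt.startswith("Review") else prompt
    print(improve_user_story("As a user, I want to log in.", stub_llm))
```

In the study the agents were embedded in the teams' workflow; this loop only illustrates the critique-then-rewrite pattern, with `complete` to be wired to an actual model API.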
Related papers
- Leveraging LLMs for User Stories in AI Systems: UStAI Dataset [0.38233569758620056]
Text generated by Large Language Models (LLMs) is emerging as a promising alternative to human-written text.
This paper investigates the potential use of LLMs to generate user stories for AI systems based on abstracts from scholarly papers.
Our analysis demonstrates that the investigated LLMs can generate user stories inspired by the needs of various stakeholders.
arXiv Detail & Related papers (2025-04-01T08:03:40Z) - The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement [49.687224320842105]
Large language models (LLMs) have recently transformed from text-based assistants to autonomous agents capable of planning, reasoning, and iteratively improving their actions.
In this work, we introduce Critique-Guided Improvement (CGI), a novel two-player framework comprising an actor model that explores an environment and a critic model that generates detailed natural language feedback; a minimal sketch of this loop appears after the related-papers list.
arXiv Detail & Related papers (2025-03-20T10:42:33Z) - USeR: A Web-based User Story eReviewer for Assisted Quality Optimizations [2.746265158172294]
Multiple user story quality guidelines exist, but authors like Product Owners in industry projects frequently fail to write high-quality user stories.
This situation is exacerbated by the lack of tools for assessing user story quality.
We propose User Story eReviewer (USeR), a web-based tool that allows authors to determine and optimize user story quality.
arXiv Detail & Related papers (2025-03-03T21:02:10Z) - OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas [65.83634577897564]
This study explores a large-scale data synthesis approach to equip large language models with character generalization capabilities.
We begin by synthesizing large-scale character profiles using personas from Persona Hub.
We then explore two strategies: response rewriting and response generation, to create character-aligned instructional responses.
arXiv Detail & Related papers (2025-01-26T07:07:01Z) - Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning [71.2981957820888]
We propose a novel Star-Agents framework, which automates the enhancement of data quality across datasets.
The framework initially generates diverse instruction data with multiple LLM agents through a bespoke sampling method.
The generated data undergo a rigorous evaluation using a dual-model method that assesses both difficulty and quality.
arXiv Detail & Related papers (2024-11-21T02:30:53Z) - Assessing the Performance of Human-Capable LLMs -- Are LLMs Coming for Your Job? [0.0]
SelfScore is a benchmark designed to assess the performance of automated Large Language Model (LLM) agents on help desk and professional consultation tasks.
The benchmark evaluates agents on problem complexity and response helpfulness, ensuring transparency and simplicity in its scoring system.
The study raises concerns about the potential displacement of human workers, especially in areas where AI technologies excel.
arXiv Detail & Related papers (2024-10-05T14:37:35Z) - AI based Multiagent Approach for Requirements Elicitation and Analysis [3.9422957660677476]
This study empirically investigates the effectiveness of utilizing Large Language Models (LLMs) to automate requirements analysis tasks.
We deployed four models, namely GPT-3.5, GPT-4 Omni, LLaMA3-70, and Mixtral-8B, and conducted experiments to analyze requirements on four real-world projects.
Preliminary results indicate notable variations in task completion among the models.
arXiv Detail & Related papers (2024-08-18T07:23:12Z) - Large Language Models for Base Station Siting: Intelligent Deployment based on Prompt or Agent [62.16747639440893]
Large language models (LLMs) and their associated technologies continue to advance, particularly in the realms of prompt engineering and agent engineering.
This approach entails the strategic use of well-crafted prompts to infuse human experience and knowledge into these sophisticated LLMs.
This integration represents a future paradigm of artificial intelligence (AI) as a service and of AI that is easier to use.
arXiv Detail & Related papers (2024-08-07T08:43:32Z) - Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help surpass common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, so that they become better aligned with the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - CMAT: A Multi-Agent Collaboration Tuning Framework for Enhancing Small Language Models [8.123272461141815]
We introduce the TinyAgent model, trained on a meticulously curated high-quality dataset.
We also present the Collaborative Multi-Agent Tuning (CMAT) framework, an innovative system designed to augment language agent capabilities.
In this research, we propose a new communication agent framework that integrates multi-agent systems with environmental feedback mechanisms.
arXiv Detail & Related papers (2024-04-02T06:07:35Z) - Characteristic AI Agents via Large Language Models [40.10858767752735]
This research focuses on investigating the performance of Large Language Models in constructing characteristic AI agents.
A dataset called "Character100" is built for this benchmark, comprising the most-visited people on Wikipedia for language models to role-play.
The experimental results underscore the potential directions for further improvement in the capabilities of LLMs in constructing characteristic AI agents.
arXiv Detail & Related papers (2024-03-19T02:25:29Z) - Experiential Co-Learning of Software-Developing Agents [83.34027623428096]
Large language models (LLMs) have brought significant changes to various domains, especially in software development.
We introduce Experiential Co-Learning, a novel LLM-agent learning framework.
Experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively.
arXiv Detail & Related papers (2023-12-28T13:50:42Z) - AgentBench: Evaluating LLMs as Agents [88.45506148281379]
Large Language Models (LLMs) are becoming increasingly smart and autonomous, targeting real-world pragmatic missions beyond traditional NLP tasks.
We present AgentBench, a benchmark that currently consists of 8 distinct environments to assess LLM-as-Agent's reasoning and decision-making abilities.
arXiv Detail & Related papers (2023-08-07T16:08:11Z) - The Next Chapter: A Study of Large Language Models in Storytelling [51.338324023617034]
The application of prompt-based learning with large language models (LLMs) has exhibited remarkable performance in diverse natural language processing (NLP) tasks.
This paper conducts a comprehensive investigation, utilizing both automatic and human evaluation, to compare the story generation capacity of LLMs with recent models.
The results demonstrate that LLMs generate stories of significantly higher quality compared to other story generation models.
arXiv Detail & Related papers (2023-01-24T02:44:02Z)
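As referenced in the Critique-Guided Improvement (CGI) entry above, the two-player actor/critic pattern that abstract describes can be sketched briefly. The prompts, the ACCEPT stopping convention, and the `complete` callable are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of a critique-guided improvement loop: an actor proposes a
# solution, a critic returns natural language feedback, and the actor revises.
# Hypothetical: prompts, the ACCEPT convention, and `complete` are
# illustrative assumptions, not the CGI paper's method.
from typing import Callable

def cgi_solve(task: str, complete: Callable[[str], str], rounds: int = 3) -> str:
    """Actor/critic loop: revise a proposed solution using critic feedback."""
    answer = complete(f"Task: {task}\nPropose a solution.")
    for _ in range(rounds):
        feedback = complete(
            f"Task: {task}\nProposed solution: {answer}\n"
            "Critique this solution in detail; reply ACCEPT if no changes are needed."
        )
        if "ACCEPT" in feedback:
            break  # critic is satisfied; stop revising
        answer = complete(
            f"Task: {task}\nPrevious solution: {answer}\n"
            f"Critic feedback: {feedback}\nProduce a revised solution."
        )
    return answer
```

Separate actor and critic models could be passed as two callables; a single `complete` is used here only to keep the sketch small.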
This list is automatically generated from the titles and abstracts of the papers on this site.