USeR: A Web-based User Story eReviewer for Assisted Quality Optimizations
- URL: http://arxiv.org/abs/2503.02049v1
- Date: Mon, 03 Mar 2025 21:02:10 GMT
- Title: USeR: A Web-based User Story eReviewer for Assisted Quality Optimizations
- Authors: Daniel Hallmann, Kerstin Jacob, Gerald Lüttgen, Ute Schmid, Rüdiger von der Weth,
- Abstract summary: Multiple user story quality guidelines exist, but authors like Product Owners in industry projects frequently fail to write high-quality user stories. This situation is exacerbated by the lack of tools for assessing user story quality. We propose User Story eReviewer (USeR), a web-based tool that allows authors to determine and optimize user story quality.
- Score: 2.746265158172294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: User stories are widely applied for conveying requirements within agile software development teams. Multiple user story quality guidelines exist, but authors like Product Owners in industry projects frequently fail to write high-quality user stories. This situation is exacerbated by the lack of tools for assessing user story quality. In this paper, we propose User Story eReviewer (USeR), a web-based tool that allows authors to determine and optimize user story quality. For developing USeR, we collected 77 potential quality metrics through literature review, practitioner sessions, and research group meetings and refined these to 34 applicable metrics through expert sessions. Finally, we derived algorithms for eight prioritized metrics using a literature review and research group meetings and implemented them with plain code and machine learning techniques. USeR offers a RESTful API and user interface for instant, consistent, and explainable user feedback supporting fast and easy quality optimizations. It has been empirically evaluated with an expert study using 100 user stories and four experts from two real-world agile software projects in the automotive and health sectors.
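The abstract states that USeR exposes a RESTful API for instant, explainable feedback but does not publish its schema. The following is a minimal client sketch assuming a hypothetical endpoint `/api/stories/analyze`, a JSON `story` field, and a per-metric response shape; all of these names are assumptions for illustration, not USeR's documented interface.

```python
# Hypothetical client for a USeR-style quality API. The endpoint path,
# request fields, and response shape are assumptions; the abstract only
# states that USeR offers a RESTful API with per-metric feedback.
import requests

USER_API = "http://localhost:8080/api/stories/analyze"  # assumed host and path

story = ("As a Product Owner, I want to export sprint reports "
         "so that I can share progress with stakeholders.")

response = requests.post(USER_API, json={"story": story}, timeout=10)
response.raise_for_status()

# Assume one explainable score per quality metric, e.g.
# {"metrics": [{"name": "atomic", "score": 0.9, "hint": "..."}]}
for metric in response.json().get("metrics", []):
    print(f"{metric['name']}: {metric['score']:.2f} - {metric.get('hint', '')}")
```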
Related papers
- Exploring LLMs Impact on Student-Created User Stories and Acceptance Testing in Software Development [0.0]
This study investigates how LLMs (large language models) affect undergraduate software engineering students' ability to transform user feedback into user stories. Students, working individually, were asked to analyze user feedback comments, appropriately group related items, and create user stories. We found that LLMs help students develop valuable stories with well-defined acceptance criteria.
arXiv Detail & Related papers (2025-02-04T19:35:44Z)
- Towards Realistic Evaluation of Commit Message Generation by Matching Online and Offline Settings [77.20838441870151]
We use an online metric - the number of edits users introduce before committing the generated messages to the VCS - to select metrics for offline experiments. We collect a dataset of 57 pairs, each consisting of a commit message generated by GPT-4 and its counterpart edited by a human expert. Our results indicate that edit distance exhibits the highest correlation with the online metric, whereas commonly used similarity metrics such as BLEU and METEOR demonstrate low correlation (see the edit-distance sketch below).
arXiv Detail & Related papers (2024-10-15T20:32:07Z)
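As a small, self-contained illustration of the offline metric that correlated best above, the sketch below computes a character-level Levenshtein (edit) distance between a generated commit message and its human-edited counterpart. Normalizing by the longer string is an illustrative choice, not necessarily the paper's exact protocol.

```python
# Character-level Levenshtein distance via dynamic programming, using
# a rolling row to keep memory at O(len(b)).
def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

generated = "fix bug in parser"
edited = "fix off-by-one bug in the config parser"
dist = edit_distance(generated, edited)
print(dist, dist / max(len(generated), len(edited)))  # raw and normalized
```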
- What Makes a Good Story and How Can We Measure It? A Comprehensive Survey of Story Evaluation [57.550045763103334]
Evaluating a story can be more challenging than other generation evaluation tasks.
We first summarize existing storytelling tasks, including text-to-text, visual-to-text, and text-to-visual.
We propose a taxonomy to organize evaluation metrics that have been developed or can be adopted for story evaluation.
arXiv Detail & Related papers (2024-08-26T20:35:42Z)
- Improving Ontology Requirements Engineering with OntoChat and Participatory Prompting [3.3241053483599563]
Ontology requirements engineering (ORE) has primarily relied on manual methods, such as interviews and collaborative forums, to gather user requirements from domain experts.
The current OntoChat offers a framework for ORE that utilises large language models (LLMs) to streamline the process.
This study produces pre-defined prompt templates based on user queries, focusing on creating and refining personas, goals, scenarios, sample data, and data resources for user stories (a hypothetical template sketch follows this entry).
arXiv Detail & Related papers (2024-08-09T19:21:14Z)
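The OntoChat entry above mentions pre-defined prompt templates for personas, goals, and scenarios. The sketch below shows what such a template might look like; the wording, placeholders, and function name are assumptions in the spirit of participatory prompting, not the paper's actual templates.

```python
# Hypothetical persona-elicitation template; wording and fields are
# assumptions, not OntoChat's published prompts.
PERSONA_TEMPLATE = (
    "You are helping a domain expert write an ontology user story.\n"
    "Based on the query below, draft a persona (name, role, goal), "
    "one concrete scenario, and the sample data the persona needs.\n\n"
    "Query: {query}"
)

def build_persona_prompt(query: str) -> str:
    """Fill the pre-defined template with the expert's free-text query."""
    return PERSONA_TEMPLATE.format(query=query)

print(build_persona_prompt("I need an ontology for hospital bed occupancy."))
```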
- User Story Tutor (UST) to Support Agile Software Developers [0.4077787659104315]
We designed, implemented, applied, and evaluated a web application called User Story Tutor (UST).
UST checks the description of a given User Story for readability and, if needed, recommends appropriate practices for improvement (a toy readability check follows this entry).
UST may support the continuing education of agile development teams when writing and reviewing User Stories.
arXiv Detail & Related papers (2024-06-24T01:55:01Z)
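The UST entry does not state which readability measure the tool applies. As a stand-in, the sketch below computes the classic Flesch Reading Ease score with a naive vowel-group syllable counter; treating this as UST's measure is an assumption.

```python
# Flesch Reading Ease with a crude syllable heuristic; a stand-in for
# whatever readability measure UST actually implements.
import re

def count_syllables(word: str) -> int:
    # Count vowel groups as syllables; crude but fine for a sketch.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (n_syllables / n_words)

story = ("As a tester, I want automated nightly builds "
         "so that regressions are caught early.")
print(f"Flesch Reading Ease: {flesch_reading_ease(story):.1f}")
```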
- LLM-based agents for automating the enhancement of user story quality: An early report [2.856781525749652]
This study explores the use of large language models to improve user story quality in Austrian Post Group IT agile teams.
We developed a reference model for an Autonomous LLM-based Agent System and implemented it at the company.
The quality of user stories in the study and the effectiveness of these agents for user story quality improvement were assessed by 11 participants across six agile teams.
arXiv Detail & Related papers (2024-03-14T14:35:53Z)
- ChatGPT as a tool for User Story Quality Evaluation: Trustworthy Out of the Box? [3.6526713965824515]
This study explores using ChatGPT for user story quality evaluation and compares its performance with an existing benchmark.
Our study shows that ChatGPT's evaluation aligns well with human evaluation, and we propose a "best of three" strategy to improve its output stability (a sketch of one possible reading follows this entry).
arXiv Detail & Related papers (2023-06-21T09:26:27Z)
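The entry above describes the "best of three" strategy only at a high level. One plausible reading, sketched below, queries the evaluator three times and keeps the median score; the `ask_model` stub and the median aggregation are assumptions, not the paper's exact procedure.

```python
# Query the evaluator three times and keep the median score to damp
# output instability; the aggregation rule is an assumed reading of
# the paper's "best of three" strategy.
import random
from statistics import median

def ask_model(user_story: str) -> float:
    # Stand-in for a ChatGPT call returning a 1-5 quality score; the
    # randomness simulates the output instability the paper observed.
    return random.choice([3.0, 4.0, 4.0, 5.0])

def best_of_three(user_story: str) -> float:
    return median(ask_model(user_story) for _ in range(3))

print(best_of_three("As a user, I want to reset my password."))
```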
- Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study [63.27346930921658]
ChatGPT is capable of evaluating text quality effectively from various perspectives without reference.
The Explicit Score, which utilizes ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable method among the three explored approaches (a hypothetical prompt follows this entry).
arXiv Detail & Related papers (2023-04-03T05:29:58Z)
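The Explicit Score above asks the model to output a number directly. A hypothetical prompt in that spirit, with assumed wording and scale:

```python
# Hypothetical "Explicit Score" prompt: instruct the model to reply
# with a single number. Wording and the 1-100 scale are assumptions.
EXPLICIT_SCORE_PROMPT = (
    "Rate the overall quality of the following text on a scale from "
    "1 (poor) to 100 (excellent). Respond with the number only.\n\n"
    "Text: {text}"
)

def parse_explicit_score(model_reply: str) -> int:
    # The prompt instructs the model to reply with a bare number.
    return int(model_reply.strip())

print(parse_explicit_score(" 87 "))
```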
- FEBR: Expert-Based Recommendation Framework for beneficial and personalized content [77.86290991564829]
We propose FEBR (Expert-Based Recommendation Framework), an apprenticeship learning framework to assess the quality of the recommended content.
The framework exploits the demonstrated trajectories of an expert (assumed to be reliable) in a recommendation evaluation environment, to recover an unknown utility function.
We evaluate the performance of our solution through a user interest simulation environment (using RecSim).
arXiv Detail & Related papers (2021-07-17T18:21:31Z)
- Online Learning Demands in Max-min Fairness [91.37280766977923]
We describe mechanisms for the allocation of a scarce resource among multiple users in a way that is efficient, fair, and strategy-proof.
The mechanism is repeated for multiple rounds and a user's requirements can change on each round.
At the end of each round, users provide feedback about the allocation they received, enabling the mechanism to learn user preferences over time (a water-filling sketch follows this entry).
arXiv Detail & Related papers (2020-12-15T22:15:20Z)
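As a single-round illustration of max-min fairness (the paper's online, strategy-proof, multi-round mechanism is not modeled), the sketch below runs the classic progressive-filling algorithm: everyone's share rises equally, users are frozen once their demand is met, and the leftover is redistributed.

```python
# One round of progressive filling (water-filling) for max-min fairness,
# assuming demands are known; the paper's online learning of changing
# demands is not modeled here.
def max_min_allocation(demands: list[float], capacity: float) -> list[float]:
    alloc = [0.0] * len(demands)
    active = set(range(len(demands)))
    remaining = capacity
    while active and remaining > 1e-12:
        share = remaining / len(active)
        for i in list(active):
            give = min(share, demands[i] - alloc[i])
            alloc[i] += give
            remaining -= give
            if alloc[i] >= demands[i] - 1e-12:
                active.discard(i)
    return alloc

print(max_min_allocation([2.0, 4.0, 10.0], capacity=9.0))
# -> [2.0, 3.5, 3.5]: the small demand is met; the rest split evenly.
```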
- Mining Implicit Relevance Feedback from User Behavior for Web Question Answering [92.45607094299181]
We present the first study exploring the correlation between user behavior and passage relevance.
Our approach significantly improves the accuracy of passage ranking without extra human labeled data.
In practice, this work has proved effective in substantially reducing the human labeling cost for the QA service in a global commercial search engine.
arXiv Detail & Related papers (2020-06-13T07:02:08Z)