Generative Goal Modeling
- URL: http://arxiv.org/abs/2509.01048v1
- Date: Mon, 01 Sep 2025 01:14:26 GMT
- Title: Generative Goal Modeling
- Authors: Ateeq Sharfuddin, Travis Breaux
- Abstract summary: In software engineering, requirements may be acquired from stakeholders through elicitation methods. Business analysts must review transcripts to identify and document requirements. Goal modeling is a popular technique for representing early stakeholder requirements.
- Score: 0.40105987447353786
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In software engineering, requirements may be acquired from stakeholders through elicitation methods, such as interviews, observational studies, and focus groups. When supporting acquisition from interviews, business analysts must review transcripts to identify and document requirements. Goal modeling is a popular technique for representing early stakeholder requirements as it lends itself to various analyses, including refinement to map high-level goals into software operations, and conflict and obstacle analysis. In this paper, we describe an approach to use textual entailment to reliably extract goals from interview transcripts and to construct goal models. The approach has been evaluated on 15 interview transcripts across 29 application domains. The findings show that GPT-4o can reliably extract goals from interview transcripts, matching 62.0% of goals acquired by humans from the same transcripts, and that GPT-4o can trace goals to originating text in the transcript with 98.7% accuracy. In addition, when evaluated by human annotators, GPT-4o generates goal model refinement relationships among extracted goals with 72.2% accuracy.
Related papers
- Generative Large Language Models (gLLMs) in Content Analysis: A Practical Guide for Communication Research [2.390467032220061]
Generative Large Language Models (gLLMs) are increasingly being used in communication research for content analysis. Despite their potential, the integration of gLLMs into the methodological toolkit of communication research remains underdeveloped. This paper synthesizes emerging research on gLLM-assisted quantitative content analysis and proposes a comprehensive best-practice guide to navigate these challenges.
arXiv Detail & Related papers (2025-10-28T12:01:43Z) - Just-In-Time Objectives: A General Approach for Specialized AI Interactions [55.20968270133027]
Large language models promise a broad set of functions, but when not given a specific objective, they default to milquetoast results such as drafting emails littered with cliches. We demonstrate that inferring the user's in-the-moment objective, then rapidly optimizing for that singular objective, enables LLMs to produce tools, interfaces, and responses that are more responsive and desired.
arXiv Detail & Related papers (2025-10-16T11:53:17Z) - CLaC at SemEval-2025 Task 6: A Multi-Architecture Approach for Corporate Environmental Promise Verification [0.20482269513546458]
This paper presents our approach to the SemEval-2025 Task 6 (PromiseEval), which focuses on verifying promises in corporate ESG (Environmental, Social, and Governance) reports. We explore three model architectures to address the four subtasks of promise identification, supporting evidence assessment, clarity evaluation, and verification timing. Our work highlights the effectiveness of linguistic feature extraction, attention pooling, and multi-objective learning in promise verification tasks, despite challenges posed by class imbalance and limited training data.
arXiv Detail & Related papers (2025-05-29T15:19:00Z) - Vision-Language Model Based Handwriting Verification [23.983430206133793]
This paper explores using Vision Language Models (VLMs), such as OpenAI's GPT-4o and Google's PaliGemma, to address these challenges.
Our goal is to provide clear, human-understandable explanations for model decisions.
arXiv Detail & Related papers (2024-07-31T17:57:32Z) - MetaKP: On-Demand Keyphrase Generation [52.48698290354449]
We introduce on-demand keyphrase generation, a novel paradigm that requires keyphrases that conform to specific high-level goals or intents.
We present MetaKP, a large-scale benchmark comprising four datasets, 7500 documents, and 3760 goals across news and biomedical domains with human-annotated keyphrases.
We demonstrate the potential of our method to serve as a general NLP infrastructure, exemplified by its application in epidemic event detection from social media.
arXiv Detail & Related papers (2024-06-28T19:02:59Z) - Use of a Structured Knowledge Base Enhances Metadata Curation by Large Language Models [2.186740861187042]
Metadata play a crucial role in ensuring the findability, accessibility, interoperability, and reusability of datasets. This paper investigates the potential of large language models (LLMs) to improve adherence to metadata standards. We conducted experiments on 200 random data records describing human samples relating to lung cancer from the NCBI BioSample repository.
arXiv Detail & Related papers (2024-04-08T22:29:53Z) - GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing [74.68232970965595]
Multimodal large language models (MLLMs) are designed to process and integrate information from multiple sources, such as text, speech, images, and videos.
This paper assesses the application of MLLMs with 5 crucial abilities for affective computing, spanning visual affective tasks and reasoning tasks.
arXiv Detail & Related papers (2024-03-09T13:56:25Z) - Integrating Self-supervised Speech Model with Pseudo Word-level Targets from Visually-grounded Speech Model [57.78191634042409]
We propose Pseudo-Word HuBERT (PW-HuBERT), a framework that integrates pseudo word-level targets into the training process.
Our experimental results on four spoken language understanding (SLU) benchmarks suggest the superiority of our model in capturing semantic information.
arXiv Detail & Related papers (2024-02-08T16:55:21Z) - From Voices to Validity: Leveraging Large Language Models (LLMs) for Textual Analysis of Policy Stakeholder Interviews [14.135107583299277]
This study explores the integration of Large Language Models (LLMs) with human expertise to enhance text analysis of stakeholder interviews regarding K-12 education policy within one U.S. state.
Using a mixed-methods approach, human experts developed a codebook and coding processes as informed by domain knowledge and unsupervised topic modeling results.
Results reveal that while GPT-4 thematic coding aligned with human coding by 77.89% at specific themes, expanding to broader themes increased congruence to 96.02%, surpassing traditional Natural Language Processing (NLP) methods by over 25%.
arXiv Detail & Related papers (2023-12-02T18:55:14Z) - Automated title and abstract screening for scoping reviews using the GPT-4 Large Language Model [0.0]
GPTscreenR is a package for the R statistical programming language that uses the GPT-4 Large Language Model (LLM) to automatically screen sources.
In validation against consensus human reviewer decisions, GPTscreenR performed similarly to an alternative zero-shot technique, with a sensitivity of 71%, specificity of 89%, and overall accuracy of 84%.
arXiv Detail & Related papers (2023-11-14T05:30:43Z) - SOUL: Towards Sentiment and Opinion Understanding of Language [96.74878032417054]
We propose a new task called Sentiment and Opinion Understanding of Language (SOUL).
SOUL aims to evaluate sentiment understanding through two subtasks: Review Comprehension (RC) and Justification Generation (JG).
arXiv Detail & Related papers (2023-10-27T06:48:48Z) - Prometheus: Inducing Fine-grained Evaluation Capability in Language Models [66.12432440863816]
We propose Prometheus, a fully open-source Large Language Model (LLM) that is on par with GPT-4's evaluation capabilities.
Prometheus scores a Pearson correlation of 0.897 with human evaluators when evaluating with 45 customized score rubrics.
Prometheus achieves the highest accuracy on two human preference benchmarks.
arXiv Detail & Related papers (2023-10-12T16:50:08Z) - Tool-Augmented Reward Modeling [58.381678612409]
We propose a tool-augmented preference modeling approach, named Themis, to address limitations by empowering RMs with access to external environments.
Our study delves into the integration of external tools into RMs, enabling them to interact with diverse external sources.
In human evaluations, RLHF trained with Themis attains an average win rate of 32% when compared to baselines.
arXiv Detail & Related papers (2023-10-02T09:47:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.