Related papers: User eXperience Perception Insights Dataset (UXPID): Synthetic User Feedback from Public Industrial Forums

URL: http://arxiv.org/abs/2509.11777v1
Date: Mon, 15 Sep 2025 10:58:41 GMT
Title: User eXperience Perception Insights Dataset (UXPID): Synthetic User Feedback from Public Industrial Forums
Authors: Mikhail Kulyabin, Jan Joosten, Choro Ulan uulu, Nuno Miguel Martins Pacheco, Fabian Ries, Filippos Petridis, Jan Bosch, Helena Holmström Olsson
Abstract summary: Customer feedback in industrial forums reflects a rich but underexplored source of insight into real-world product experience. This paper presents a collection of 7130 artificially synthesized and anonymized user feedback branches extracted from a public industrial automation forum. The dataset is designed to facilitate research in user requirements, user experience (UX) analysis, and AI-driven feedback processing.
Score: 3.117921059331037
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Customer feedback in industrial forums reflects a rich but underexplored source of insight into real-world product experience. These publicly shared discussions offer an organic view of user expectations, frustrations, and success stories shaped by the specific contexts of use. Yet, harnessing this information for systematic analysis remains challenging due to the unstructured and domain-specific nature of the content. The lack of structure and specialized vocabulary makes it difficult for traditional data analysis techniques to accurately interpret, categorize, and quantify the feedback, thereby limiting its potential to inform product development and support strategies. To address these challenges, this paper presents the User eXperience Perception Insights Dataset (UXPID), a collection of 7130 artificially synthesized and anonymized user feedback branches extracted from a public industrial automation forum. Each JavaScript Object Notation (JSON) record contains multi-post comments related to specific hardware and software products, enriched with metadata and contextual conversation data. Leveraging a large language model (LLM), each branch is systematically analyzed and annotated for UX insights, user expectations, severity and sentiment ratings, and topic classifications. The UXPID dataset is designed to facilitate research in user requirements, user experience (UX) analysis, and AI-driven feedback processing, particularly where privacy and licensing restrictions limit access to real-world data. UXPID supports the training and evaluation of transformer-based models for tasks such as issue detection, sentiment analysis, and requirements extraction in the context of technical forums.

Related papers

Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset [47.98539809308384]
We analyze the Asta Interaction dataset, a large-scale resource comprising over 200,000 user queries and interaction logs. We characterize query patterns, engagement behaviors, and how usage evolves with experience. We release the anonymized dataset and analysis with a new query taxonomy to inform future designs of real-world AI research assistants.
arXiv Detail & Related papers (2026-02-26T18:40:28Z)
CoCoNUTS: Concentrating on Content while Neglecting Uninformative Textual Styles for AI-Generated Peer Review Detection [60.52240468810558]
We introduce CoCoNUTS, a content-oriented benchmark built upon a fine-grained dataset of AI-generated peer reviews. We also develop CoCoDet, an AI review detector via a multi-task learning framework, to achieve more accurate and robust detection of AI involvement in review content.
arXiv Detail & Related papers (2025-08-28T06:03:11Z)
VIDEE: Visual and Interactive Decomposition, Execution, and Evaluation of Text Analytics with Intelligent Agents [39.42078665719841]
VIDEE is a system that helps entry-level data analysts conduct advanced text analytics with intelligent agents. We conduct two quantitative experiments to evaluate VIDEE's effectiveness and analyze common agent errors.
arXiv Detail & Related papers (2025-06-17T05:24:58Z)
What Users Value and Critique: Large-Scale Analysis of User Feedback on AI-Powered Mobile Apps [2.352412885878654]
We present the first comprehensive, large-scale study of user feedback on AI-powered mobile apps. We leverage a curated dataset of 292 AI-driven apps across 14 categories with 894K AI-specific reviews from Google Play. Our pipeline surfaces both satisfaction with one feature and frustration with another within the same review.
arXiv Detail & Related papers (2025-06-12T14:56:52Z)
Data Therapist: Eliciting Domain Knowledge from Subject Matter Experts Using Large Language Models [25.633548292173643]
Data Therapist is a web-based system that helps domain experts externalize tacit knowledge through a mixed-initiative process. The resulting structured knowledge base can inform both human and automated visualization design.
arXiv Detail & Related papers (2025-05-01T11:10:17Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way [65.48447317310442]
Data Formulator 2 (DF2 for short) is an AI-powered visualization system designed to overcome this limitation. DF2 blends graphical user interfaces and natural language inputs to enable users to convey their intent more effectively. To support efficient iteration, DF2 lets users navigate their iteration history and reuse previous designs, eliminating the need to start from scratch each time.
arXiv Detail & Related papers (2024-08-28T20:12:17Z)
Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models [49.266685603250416]
Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces. This article examines how interactive feedback features in ChatGPT's interface afford user participation in LLMs.
arXiv Detail & Related papers (2024-08-27T13:50:37Z)
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STaRK, a large-scale semi-structured retrieval benchmark on Textual and Relational Knowledge Bases. Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine. We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z)
AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models [34.82568259708465]
AllHands is an innovative analytic framework designed for large-scale feedback analysis through a natural language interface. It uses large language models (LLMs) to enhance accuracy, robustness, generalization, and user-friendliness. AllHands delivers comprehensive multi-modal responses, including text, code, tables, and images.
arXiv Detail & Related papers (2024-03-22T12:13:16Z)
Improving Retrieval in Theme-specific Applications using a Corpus Topical Taxonomy [52.426623750562335]
We introduce the ToTER (Topical Taxonomy Enhanced Retrieval) framework. ToTER identifies the central topics of queries and documents with the guidance of the taxonomy, and exploits their topical relatedness to supplement missing contexts. As a plug-and-play framework, ToTER can be flexibly employed to enhance various PLM-based retrievers.
arXiv Detail & Related papers (2024-03-07T02:34:54Z)
Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users. We present a benchmark named InstructIE, comprising both automatically generated training data and a human-annotated test set. Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
Design Challenges for a Multi-Perspective Search Engine [44.48345943046946]
We study a new perspective-oriented document retrieval paradigm. We discuss and assess the inherent natural language understanding challenges in order to achieve the goal. We use the prototype system to conduct a user survey in order to assess the utility of our paradigm.
arXiv Detail & Related papers (2021-12-15T18:59:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.