Towards a Data-Driven Requirements Engineering Approach: Automatic
Analysis of User Reviews
- URL: http://arxiv.org/abs/2206.14669v1
- Date: Wed, 29 Jun 2022 14:14:54 GMT
- Authors: Jialiang Wei, Anne-Lise Courbis, Thomas Lambolais, Binbin Xu, Pierre Louis Bernard, Gérard Dray
- Abstract summary: We provide an automated analysis using CamemBERT, a state-of-the-art language model for French.
We created a multi-label classification dataset of 6,000 user reviews from three applications in the Health & Fitness field.
The results are encouraging and suggest that it is possible to automatically identify reviews that request new features.
- Score: 0.440401067183266
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We are concerned with Data-Driven Requirements Engineering, and in
particular with the analysis of users' reviews. These online reviews are a rich
source of information for extracting new needs and improvement requests. In this
work, we provide an automated analysis using CamemBERT, a state-of-the-art
language model for French. We created a multi-label classification dataset of
6,000 user reviews from three applications in the Health & Fitness field. The
results are encouraging and suggest that it is possible to automatically
identify reviews that request new features.
Dataset is available at:
https://github.com/Jl-wei/APIA2022-French-user-reviews-classification-dataset.
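To make the approach concrete, here is a minimal sketch (not the authors' released code) of how a multi-label review classifier could be set up with CamemBERT via the Hugging Face Transformers API. The label set and the 0.5 decision threshold are assumptions, and the classification head would need fine-tuning on the annotated dataset before its outputs are meaningful.

```python
# Minimal sketch: multi-label classification of French app reviews with
# CamemBERT (Hugging Face Transformers). The labels below are hypothetical;
# the model must be fine-tuned on the annotated dataset before use.
import torch
from transformers import CamembertTokenizer, CamembertForSequenceClassification

LABELS = ["feature_request", "bug_report", "user_experience", "rating"]  # assumed

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
model = CamembertForSequenceClassification.from_pretrained(
    "camembert-base",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # sigmoid + BCE loss per label
)

review = "J'adore cette appli, mais il manque un mode sombre."
inputs = tokenizer(review, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)[0]

# A review can belong to several categories at once (multi-label),
# so each label is thresholded independently.
predicted = [label for label, p in zip(LABELS, probs) if p > 0.5]
print(predicted)
```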
Related papers
- AutoBencher: Creating Salient, Novel, Difficult Datasets for Language Models [84.65095045762524]
We present three desiderata for a good benchmark for language models.
The benchmark reveals new trends in model rankings not shown by previous benchmarks.
We use AutoBencher to create datasets for math, multilingual, and knowledge-intensive question answering.
arXiv Detail & Related papers (2024-07-11T10:03:47Z)
- Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation [65.16137964758612]
We explore the use of long-context capabilities in large language models to create synthetic reading comprehension data from entire books.
Our objective is to test the capabilities of LLMs to analyze, understand, and reason over problems that require a detailed comprehension of long spans of text.
arXiv Detail & Related papers (2024-05-31T20:15:10Z)
- BESTMVQA: A Benchmark Evaluation System for Medical Visual Question Answering [8.547600133510551]
This paper develops a Benchmark Evaluation SysTem for Medical Visual Question Answering, denoted by BESTMVQA.
Our system provides a useful tool for users to automatically build Med-VQA datasets, which helps overcome the data insufficiency problem.
With simple configurations, our system automatically trains and evaluates the selected models over a benchmark dataset.
arXiv Detail & Related papers (2023-12-13T03:08:48Z)
- Zero-shot Bilingual App Reviews Mining with Large Language Models [0.7340017786387767]
Mini-BAR is a tool that integrates large language models (LLMs) to perform zero-shot mining of user reviews in both English and French.
To evaluate the performance of Mini-BAR, we created a dataset containing 6,000 English and 6,000 French annotated user reviews; a minimal zero-shot sketch in this spirit appears after this list.
arXiv Detail & Related papers (2023-11-06T12:36:46Z)
- Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users.
We present a benchmark named InstructIE, which includes both automatically generated training data and a human-annotated test set.
Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z)
- Can GitHub Issues Help in App Review Classifications? [0.7366405857677226]
We propose a novel approach that assists in augmenting labeled datasets by utilizing information extracted from GitHub issues.
Our results demonstrate that using labeled issues for data augmentation can improve the F1-score by 6.3 points in bug reports and 7.2 points in feature requests.
arXiv Detail & Related papers (2023-08-27T22:01:24Z)
- GEMv2: Multilingual NLG Benchmarking in a Single Line of Code [161.1761414080574]
The Generation, Evaluation, and Metrics (GEM) Benchmark introduces a modular infrastructure for dataset, model, and metric developers.
GEMv2 supports 40 documented datasets in 51 languages.
Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
arXiv Detail & Related papers (2022-06-22T17:52:30Z)
- E-commerce Query-based Generation based on User Review [1.484852576248587]
We propose a novel seq2seq-based text generation model to generate answers to a user's question based on reviews posted by previous users.
Given a user question and/or target sentiment polarity, we extract aspects of interest and generate an answer that summarizes previous relevant user reviews.
arXiv Detail & Related papers (2020-11-11T04:58:31Z)
- Scaling Systematic Literature Reviews with Machine Learning Pipelines [57.82662094602138]
Systematic reviews entail the extraction of data from scientific documents.
We construct a pipeline that automates each of these aspects, and experiment with many human-time vs. system quality trade-offs.
We find that we can achieve surprising accuracy and generalisability of the whole pipeline system with only 2 weeks of human-expert annotation.
arXiv Detail & Related papers (2020-10-09T16:19:42Z)
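As referenced in the Mini-BAR entry above, here is a minimal zero-shot sketch (not Mini-BAR's actual code) of bilingual review classification using an off-the-shelf multilingual NLI model through the Hugging Face Transformers pipeline API; the candidate labels are assumptions.

```python
# Minimal zero-shot sketch in the spirit of Mini-BAR (not its actual code):
# classify English and French app reviews without any task-specific training.
from transformers import pipeline

# Multilingual NLI backbone commonly used for zero-shot classification.
classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")

# Hypothetical categories; Mini-BAR's exact label set is an assumption here.
labels = ["feature request", "bug report", "user experience", "praise"]

reviews = [
    "The app keeps crashing when I open my workout history.",   # English
    "Ce serait bien de pouvoir exporter mes données en CSV.",   # French
]
for review in reviews:
    # multi_label=True scores each label independently, as in review mining,
    # where one review can both praise the app and request a feature.
    result = classifier(review, candidate_labels=labels, multi_label=True)
    scores = dict(zip(result["labels"], (round(s, 2) for s in result["scores"])))
    print(review, "->", scores)
```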