T-FREX: A Transformer-based Feature Extraction Method from Mobile App
Reviews
- URL: http://arxiv.org/abs/2401.03833v1
- Date: Mon, 8 Jan 2024 11:43:03 GMT
- Title: T-FREX: A Transformer-based Feature Extraction Method from Mobile App
Reviews
- Authors: Quim Motger, Alessio Miaschi, Felice Dell'Orletta, Xavier Franch,
Jordi Marco
- Abstract summary: We present T-FREX, a Transformer-based, fully automatic approach for mobile app review feature extraction.
First, we collect a set of ground truth features from users in a real crowdsourced software recommendation platform.
Then, we use this newly created dataset to fine-tune multiple LLMs on a named entity recognition task.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile app reviews are a large-scale data source for software-related
knowledge generation activities, including software maintenance, evolution and
feedback analysis. Effective extraction of features (i.e., functionalities or
characteristics) from these reviews is key to support analysis on the
acceptance of these features, identification of relevant new feature requests
and prioritization of feature development, among others. Traditional methods
rely on syntactic, pattern-based approaches that are typically context-agnostic,
evaluated on a closed set of apps, difficult to replicate, and limited to a
narrow set of apps and domains. Meanwhile, the pervasiveness of Large Language
Models (LLMs) based on the Transformer architecture in software engineering
tasks lays the groundwork for empirical evaluation of the performance of these
models to support feature extraction. In this study, we present T-FREX, a
Transformer-based, fully automatic approach for mobile app review feature
extraction. First, we collect a set of ground truth features from users in a
real crowdsourced software recommendation platform and transfer them
automatically into a dataset of app reviews. Then, we use this newly created
dataset to fine-tune multiple LLMs on a named entity recognition task under
different data configurations. We assess the performance of T-FREX with respect
to this ground truth, and we complement our analysis by comparing T-FREX with a
baseline method from the field. Finally, we assess the quality of new features
predicted by T-FREX through an external human evaluation. Results show that
T-FREX outperforms the traditional syntactic-based method on average,
especially when discovering new features from a domain on which the model has
been fine-tuned.
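The dataset-construction step the abstract describes (transferring known ground-truth feature names onto app review text, so that an LLM can be fine-tuned on named entity recognition) can be sketched as a BIO-labeling pass. This is an illustrative reconstruction, not the authors' code: the `bio_tag` helper and its exact-match strategy are assumptions, and T-FREX's actual annotation transfer may differ.

```python
# Minimal sketch of transferring ground-truth feature names onto review
# tokens as BIO labels for NER-style token classification.
# Illustrative only: the matching strategy used by T-FREX may differ.

def bio_tag(tokens, features):
    """Label tokens with B-feature/I-feature/O by matching known feature names."""
    labels = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for feature in features:
        parts = feature.lower().split()
        n = len(parts)
        # Slide a window of the feature's length over the review tokens.
        for i in range(len(tokens) - n + 1):
            if lowered[i:i + n] == parts:
                labels[i] = "B-feature"
                for j in range(i + 1, i + n):
                    labels[j] = "I-feature"
    return labels

review = "I love the dark mode but the backup option keeps failing".split()
features = ["dark mode", "backup option"]
print(bio_tag(review, features))
# → ['O', 'O', 'O', 'B-feature', 'I-feature', 'O', 'O',
#    'B-feature', 'I-feature', 'O', 'O']
```

Reviews labeled this way form the supervised dataset on which an encoder-based Transformer can then be fine-tuned for token classification, which is the NER setup the abstract describes.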
Related papers
- Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark [62.58869921806019]
We propose a task decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset.
We design innovative training strategies to effectively distill GPT-4o's evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6.
Experimental results demonstrate that our distilled open-source MLLM significantly outperforms the current state-of-the-art GPT-4o-base baseline.
arXiv Detail & Related papers (2024-11-23T08:06:06Z)
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a time series forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context.
We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters.
Our experiments highlight the importance of incorporating contextual information, demonstrate surprising performance when using LLM-based forecasting models, and also reveal some of their critical shortcomings.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- Instruct-DeBERTa: A Hybrid Approach for Aspect-based Sentiment Analysis on Textual Reviews [2.0143010051030417]
Aspect-based Sentiment Analysis (ABSA) is a critical task in Natural Language Processing (NLP)
Traditional sentiment analysis methods, while useful for determining overall sentiment, often miss the implicit opinions about particular product or service features.
This paper presents a comprehensive review of the evolution of ABSA methodologies, from lexicon-based approaches to machine learning.
arXiv Detail & Related papers (2024-08-23T16:31:07Z)
- Leveraging Large Language Models for Mobile App Review Feature Extraction [4.879919005707447]
This study explores the hypothesis that encoder-only large language models can enhance feature extraction from mobile app reviews.
By leveraging crowdsourced annotations from an industrial context, we redefine feature extraction as a supervised token classification task.
Empirical evaluations demonstrate that this method improves the precision and recall of extracted features and enhances performance efficiency.
arXiv Detail & Related papers (2024-08-02T07:31:57Z)
- Self-Augmented Preference Optimization: Off-Policy Paradigms for Language Model Alignment [104.18002641195442]
We introduce Self-Augmented Preference Optimization (SAPO), an effective and scalable training paradigm that does not require existing paired data.
Building on the self-play concept, which autonomously generates negative responses, we further incorporate an off-policy learning pipeline to enhance data exploration and exploitation.
arXiv Detail & Related papers (2024-05-31T14:21:04Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- Open World Object Detection in the Era of Foundation Models [53.683963161370585]
We introduce a new benchmark that includes five real-world application-driven datasets.
We introduce a novel method, Foundation Object detection Model for the Open world, or FOMO, which identifies unknown objects based on their shared attributes with the base known objects.
arXiv Detail & Related papers (2023-12-10T03:56:06Z)
- Transferability Metrics for Object Detection [0.0]
Transfer learning aims to make the most of existing pre-trained models to achieve better performance on a new task in limited data scenarios.
We extend transferability metrics to object detection using ROI-Align and TLogME.
We show that TLogME provides a robust correlation with transfer performance and outperforms other transferability metrics on local and global level features.
arXiv Detail & Related papers (2023-06-27T08:49:31Z)
- Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews [2.66512000865131]
We study the accuracy and time efficiency of pre-trained neural language models (PTMs) for app review classification.
We set up different studies to evaluate PTMs in multiple settings.
In all cases, Micro and Macro Precision, Recall, and F1-scores will be used.
arXiv Detail & Related papers (2021-04-12T23:23:45Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
- Rank-Based Multi-task Learning for Fair Regression [9.95899391250129]
We develop a novel learning approach for multi-task regression models based on a biased dataset.
We use a popular non-parametric oracle-based non-world multipliers dataset.
arXiv Detail & Related papers (2020-09-23T22:32:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.