Pre-Trained Neural Language Models for Automatic Mobile App User
Feedback Answer Generation
- URL: http://arxiv.org/abs/2202.02294v1
- Date: Fri, 4 Feb 2022 18:26:55 GMT
- Title: Pre-Trained Neural Language Models for Automatic Mobile App User
Feedback Answer Generation
- Authors: Yue Cao, Fatemeh H. Fard
- Abstract summary: Studies show that developers' answers to mobile app users' feedback on app stores can increase the apps' star ratings.
To help app developers generate answers related to users' issues, recent studies have developed models that generate the answers automatically.
In this paper, we evaluate pre-trained neural language models (PTMs) for generating replies to mobile app user feedback.
- Score: 9.105367401167129
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Studies show that developers' answers to mobile app users' feedback on
app stores can increase the apps' star ratings. To help app developers generate
answers related to users' issues, recent studies have developed models that
generate the answers automatically. Aims: App response generation models use
deep neural networks and require training data. Pre-Trained neural language
Models (PTMs) used in Natural Language Processing (NLP) take advantage of the
information they learn from large corpora in an unsupervised manner and can
reduce the amount of required training data. In this paper, we evaluate PTMs
for generating replies to mobile app user feedback. Method: We train a
Transformer model from scratch and fine-tune two PTMs, and compare the
generated responses against RRGEN, a current app response model. We also
evaluate the models with different portions of the training data. Results: On
a large dataset, automatic metrics score the PTMs lower than the baselines.
However, our human evaluation confirms that PTMs generate more relevant and
meaningful responses to the posted feedback. Moreover, the performance of the
PTMs drops less than that of the other models when the amount of training data
is reduced to 1/3. Conclusion: PTMs are useful for generating responses to app
reviews and are more robust to the amount of training data provided. However,
their prediction time is 19X that of RRGEN. This study can provide new avenues
for research in adapting PTMs for analyzing mobile app user feedback.
Index Terms: mobile app user feedback analysis, neural pre-trained language
models, automatic answer generation
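
As a concrete illustration of the setup described above, the hedged sketch below fine-tunes a generic sequence-to-sequence PTM on (review, reply) pairs with Hugging Face Transformers. The abstract does not name the two PTMs evaluated, so BART serves as a stand-in here, and the toy dataset fields are assumptions.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Toy stand-in for a (user review, developer reply) corpus.
data = Dataset.from_dict({
    "review": ["The app crashes every time I open the camera tab."],
    "reply": ["Sorry for the trouble! A fix ships in the next update."],
})

def tokenize(batch):
    enc = tokenizer(batch["review"], truncation=True, max_length=128)
    enc["labels"] = tokenizer(text_target=batch["reply"],
                              truncation=True, max_length=128)["input_ids"]
    return enc

train = data.map(tokenize, batched=True, remove_columns=data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="reply-gen", num_train_epochs=1),
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

In the paper's actual setting, the real review-response corpus and the evaluated PTM checkpoints would replace the stand-ins above.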
Related papers
- Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration [90.41908331897639]
Large language models (LLMs) have significantly benefited from training on diverse, high-quality task-specific data.
We present a novel approach, ReverseGen, designed to automatically generate effective training samples; a hedged sketch of the idea follows this entry.
arXiv Detail & Related papers (2024-10-22T06:43:28Z)
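
The entry above centers on harvesting queries that make a target model fail. The toy loop below sketches that idea under stated assumptions: all three helpers are hypothetical stand-ins, and the step where the proposer itself is updated to favor failure-exposing prompts is omitted.

```python
import random

def propose_queries(n):                      # stand-in for a proposer LLM
    return [f"tricky query #{random.randint(0, 999)}" for _ in range(n)]

def target_answer(query):                    # stand-in for the target model
    return "some answer to " + query

def is_failure(query, answer):               # stand-in for a verifier/judge
    return random.random() < 0.3             # pretend ~30% of answers fail

failure_set = []
for _ in range(5):                           # a few exploration rounds
    for q in propose_queries(8):
        if is_failure(q, target_answer(q)):
            failure_set.append(q)            # failure-inducing -> training data
print(f"collected {len(failure_set)} failure-inducing samples")
```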
- Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Large Language Model (LLM) pretraining traditionally relies on autoregressive language modeling on randomly sampled data blocks from web-scale datasets.
We take inspiration from human learning techniques like spaced repetition to hypothesize that random data sampling for LLMs leads to high training cost and low-quality models that tend to forget data.
In order to effectively commit web-scale information to long-term memory, we propose the LFR (Learn, Focus, and Review) pedagogy; a sketch of such a sampler follows this entry.
arXiv Detail & Related papers (2024-09-10T00:59:18Z)
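
The entry above borrows from spaced repetition. The sketch below shows one way such a sampler could look: blocks the model currently finds hard (high loss) are revisited more often. The simulated losses, decay factor, and batch mix are assumptions, not the paper's algorithm.

```python
import heapq
import random

blocks = [f"block-{i}" for i in range(10)]
loss = {b: random.uniform(0.5, 3.0) for b in blocks}  # simulated per-block loss

def next_batch(k=3):
    # "Focus": revisit the k highest-loss blocks;
    # "Review": mix in one block at random so old data is not forgotten.
    hard = heapq.nlargest(k, blocks, key=loss.get)
    return hard + [random.choice(blocks)]

for step in range(3):
    batch = next_batch()
    for b in batch:
        loss[b] *= 0.8        # pretend a training step lowers the block's loss
    print(step, batch)
```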
- Self-Taught Evaluators [77.92610887220594]
We present an approach that aims to improve evaluators without human annotations, using synthetic training data only.
Our Self-Taught Evaluator can improve a strong LLM from 75.4 to 88.3 on RewardBench; a minimal sketch of the loop follows this entry.
arXiv Detail & Related papers (2024-08-05T17:57:02Z)
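
A minimal sketch of the self-training loop described above: the current judge labels synthetic response pairs, and those labels become the next round's training data, with no human annotation. Every helper below is a hypothetical stand-in.

```python
def generate_pair(prompt):        # stand-in: draft one better + one worse reply
    return ("a detailed, helpful answer to " + prompt, "a vague answer")

def judge(prompt, a, b):          # stand-in for the current LLM-as-judge
    return 0 if len(a) >= len(b) else 1      # toy preference heuristic

def finetune(judge_fn, data):     # stand-in for a fine-tuning step
    print(f"fine-tuning judge on {len(data)} synthetic preferences")
    return judge_fn               # a real loop would return an improved judge

prompts = ["How do I reset my password?", "Why does the app crash?"]
for _ in range(2):                # iterate: a better judge yields better labels
    synthetic = [(p, *generate_pair(p)) for p in prompts]
    labeled = [(p, a, b, judge(p, a, b)) for p, a, b in synthetic]
    judge = finetune(judge, labeled)
```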
- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models [115.501751261878]
Fine-tuning language models (LMs) on human-generated data remains a prevalent practice.
We investigate whether we can go beyond human data on tasks where we have access to scalar feedback.
We find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data; a sketch of the sample-filter-finetune cycle follows this entry.
arXiv Detail & Related papers (2023-12-11T18:17:43Z)
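
The entry above describes self-training with scalar feedback. The loop below sketches the sample-filter-finetune cycle under stated assumptions: the correctness checker and fine-tuning step are stubs, and the pass rate is invented.

```python
import random

def sample_solutions(problem, n=4):       # sample candidates from the model
    return [f"candidate {i} for {problem}" for i in range(n)]

def is_correct(problem, solution):        # scalar (here binary) feedback,
    return random.random() < 0.25         # e.g. a unit test or answer check

def finetune_on(examples):                # fit the model to the survivors
    print(f"fine-tuning on {len(examples)} verified examples")

problems = ["task-1", "task-2", "task-3"]
for _ in range(2):                        # repeat: improved model, resample
    kept = [(p, s) for p in problems
            for s in sample_solutions(p) if is_correct(p, s)]
    finetune_on(kept)
```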
- INGENIOUS: Using Informative Data Subsets for Efficient Pre-Training of Language Models [40.54353850357839]
We show how we can employ submodular optimization to select highly representative subsets of the training corpora.
We show that the resulting models achieve up to $\sim 99\%$ of the performance of the fully-trained models (see the greedy-selection sketch after this entry).
arXiv Detail & Related papers (2023-05-11T09:24:41Z)
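
The entry above mentions submodular optimization; a common choice is greedy maximization of a facility-location objective, sketched below over toy embeddings. Cosine similarity and random vectors are assumptions; the paper's actual features and submodular function may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))                 # stand-in text embeddings
X /= np.linalg.norm(X, axis=1, keepdims=True)
S = X @ X.T                                    # cosine-similarity kernel

def facility_location_greedy(S, k):
    # f(A) = sum_i max_{j in A} S[i, j]; at each step pick the point with
    # the largest marginal gain in total coverage of the corpus.
    cover = np.zeros(S.shape[0])               # best similarity to chosen set
    selected = []
    for _ in range(k):
        totals = np.maximum(S, cover[:, None]).sum(axis=0)
        totals[selected] = -np.inf             # never re-pick a point
        j = int(np.argmax(totals))
        selected.append(j)
        cover = np.maximum(cover, S[:, j])
    return selected

subset = facility_location_greedy(S, k=20)     # ~10% of the toy corpus
print(subset[:5])
```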
- Federated Learning with Noisy User Feedback [26.798303045807508]
Federated learning (FL) has emerged as a method for training ML models on edge devices using sensitive user data.
We propose a strategy for training FL models using positive and negative user feedback.
We show that our method improves substantially over a self-training baseline, achieving performance closer to models trained with full supervision; a toy sketch follows this entry.
arXiv Detail & Related papers (2022-05-06T09:14:24Z)
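
A toy sketch of the setup above: clients turn thumbs-up/down feedback on predictions into noisy binary labels, train locally, and the server averages the weights (plain FedAvg). The flip rate, linear model, and data are invented stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
w_global = np.zeros(5)                          # toy logistic-regression weights

def client_update(w, n=50, noise=0.2):
    X = rng.normal(size=(n, 5))
    y_true = (X @ np.array([1.0, -1.0, 0.5, 0.0, 2.0]) > 0).astype(float)
    flip = rng.random(n) < noise                # user feedback is sometimes wrong
    y = np.where(flip, 1 - y_true, y_true)      # noisy positive/negative labels
    w = w.copy()
    for _ in range(10):                         # a few local SGD steps
        p = 1 / (1 + np.exp(-X @ w))
        w -= 0.1 * X.T @ (p - y) / n
    return w

for _ in range(3):                              # FedAvg rounds over 4 clients
    w_global = np.mean([client_update(w_global) for _ in range(4)], axis=0)
print(w_global)
```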
- Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews [2.66512000865131]
We study the accuracy and time efficiency of pre-trained neural language models (PTMs) for app review classification.
We set up different studies to evaluate PTMs in multiple settings.
In all cases, micro- and macro-averaged precision, recall, and F1-scores are used (see the metric snippet after this entry).
arXiv Detail & Related papers (2021-04-12T23:23:45Z)
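
The metrics named above are standard; the snippet below computes micro- and macro-averaged precision, recall, and F1 with scikit-learn on toy review labels.

```python
from sklearn.metrics import precision_recall_fscore_support

y_true = ["bug", "bug", "feature", "praise", "feature", "bug"]
y_pred = ["bug", "feature", "feature", "praise", "bug", "bug"]

for avg in ("micro", "macro"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg)
    print(f"{avg}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Micro averaging pools all decisions before scoring, while macro averaging scores each class and takes the unweighted mean, so rare review categories weigh more under macro.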
- Utilizing Self-supervised Representations for MOS Prediction [51.09985767946843]
Existing evaluations usually require clean references or parallel ground truth data.
Subjective tests, on the other hand, do not need any additional clean or parallel data and correlate better with human perception.
We develop an automatic evaluation approach that correlates well with human perception while not requiring ground truth data; a minimal sketch follows this entry.
arXiv Detail & Related papers (2021-04-07T09:44:36Z)
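
A minimal sketch of the idea above: pool frozen self-supervised speech features and regress a 1-5 MOS value without any reference signal. The feature size, pooling, and sigmoid squashing are assumptions; random tensors stand in for real wav2vec-style features.

```python
import torch
import torch.nn as nn

frames = torch.randn(1, 200, 768)     # (batch, time, feat): stand-in SSL features

head = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 1))

utterance = frames.mean(dim=1)        # mean-pool frames to one utterance vector
mos = 1 + 4 * torch.sigmoid(head(utterance))   # squash into the 1-5 MOS range
print(mos)  # the head would be trained against human MOS labels
```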
- Dialogue Response Ranking Training with Large-Scale Human Feedback Data [52.12342165926226]
We leverage social media feedback data to build a large-scale training dataset for feedback prediction.
We train DialogRPT, a set of GPT-2-based models, on 133M pairs of human feedback data.
Our ranker outperforms the conventional dialog perplexity baseline by a large margin on predicting Reddit feedback (see the ranking snippet after this entry).
arXiv Detail & Related papers (2020-09-15T10:50:05Z)
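
The snippet below ranks candidate replies with the released DialogRPT "updown" checkpoint, following the usage shown on its Hugging Face model card; the example context and candidates are invented.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "microsoft/DialogRPT-updown"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

def score(context, response):
    # Context and candidate are joined with GPT-2's end-of-text token.
    ids = tokenizer.encode(context + "<|endoftext|>" + response,
                           return_tensors="pt")
    with torch.no_grad():
        logits = model(ids).logits
    return torch.sigmoid(logits).item()    # higher = more predicted upvotes

context = "The app logs me out every day."
candidates = ["Thanks for the report, we are fixing the session bug.", "ok"]
print(sorted(candidates, key=lambda r: score(context, r), reverse=True))
```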
- An Efficient Method of Training Small Models for Regression Problems with Knowledge Distillation [1.433758865948252]
We propose a new formalism of knowledge distillation for regression problems.
First, we propose a new loss function, teacher outlier rejection loss, which rejects outliers in training samples using teacher model predictions.
By considering a multi-task network, training of the feature extraction of student models becomes more effective (a hedged loss sketch follows this entry).
arXiv Detail & Related papers (2020-02-28T08:46:12Z)
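
A hedged sketch of a teacher-outlier-rejection-style loss for regression distillation: samples whose labels disagree strongly with the teacher's prediction are dropped from the ground-truth term, while the student always mimics the teacher. The threshold rule and equal weighting are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def distill_regression_loss(student_out, teacher_out, labels, thresh=2.0):
    resid = (labels - teacher_out).abs()
    keep = (resid < thresh).float()              # reject likely label outliers
    hard = keep * (student_out - labels) ** 2    # fit the kept labels only
    soft = (student_out - teacher_out) ** 2      # always mimic the teacher
    return (hard + soft).mean()

student_out = torch.tensor([1.0, 2.0, 3.0])
teacher_out = torch.tensor([1.1, 2.2, 2.9])
labels = torch.tensor([1.0, 9.0, 3.0])           # 9.0 looks like a label outlier
print(distill_regression_loss(student_out, teacher_out, labels))
```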
This list is automatically generated from the titles and abstracts of the papers on this site.