Leveraging Large Language Models for Classifying App Users' Feedback
- URL: http://arxiv.org/abs/2507.08250v1
- Date: Fri, 11 Jul 2025 01:33:54 GMT
- Title: Leveraging Large Language Models for Classifying App Users' Feedback
- Authors: Yasaman Abedini, Abbas Heydarnoori
- Abstract summary: We evaluate the capabilities of four advanced LLMs, including GPT-3.5-Turbo, GPT-4, Flan-T5, and Llama3-70b, to enhance user feedback classification. Our findings indicate that LLMs, when guided by well-crafted prompts, can effectively classify user feedback into coarse-grained categories.
- Score: 0.7366405857677226
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, significant research has been conducted into classifying application (app) user feedback, primarily relying on supervised machine learning algorithms. However, fine-tuning more generalizable classifiers based on existing labeled datasets remains an important challenge, as creating large and accurately labeled datasets often requires considerable time and resources. In this paper, we evaluate the capabilities of four advanced LLMs, including GPT-3.5-Turbo, GPT-4, Flan-T5, and Llama3-70b, to enhance user feedback classification and address the challenge of limited labeled data. To achieve this, we conduct several experiments on eight datasets that have been meticulously labeled in prior research. These datasets include user reviews from app stores, posts from the X platform, and discussions from public forums, all widely recognized as representative sources of app user feedback. We analyze the performance of various LLMs in identifying both fine-grained and coarse-grained user feedback categories. Given the substantial volume of daily user feedback and the computational limitations of LLMs, we leverage these models as an annotation tool to augment labeled datasets with general and app-specific data. This augmentation aims to enhance the performance of state-of-the-art BERT-based classification models. Our findings indicate that LLMs, when guided by well-crafted prompts, can effectively classify user feedback into coarse-grained categories. Moreover, augmenting the training dataset with datasets labeled using the consensus of LLMs can significantly enhance classifier performance.
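The abstract mentions labeling data "using the consensus of LLMs" but does not spell out the rule. The following is a minimal sketch of one plausible reading, not the authors' implementation: each feedback item gets one label per LLM, and the item is kept for augmentation only when enough annotators agree. The function names, the `min_agreement` threshold, and the category strings are all hypothetical.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def consensus_label(labels: Sequence[str], min_agreement: int) -> Optional[str]:
    """Return the label shared by at least `min_agreement` annotators, else None.

    Assumption: ties between the two most frequent labels count as no consensus.
    """
    if not labels:
        return None
    (top, count), *rest = Counter(labels).most_common(2)
    if rest and rest[0][1] == count:
        return None  # two labels are equally frequent: no clear winner
    return top if count >= min_agreement else None


def build_augmentation_set(
    feedback: Sequence[str],
    annotations: Sequence[Sequence[str]],
    min_agreement: int,
) -> List[Tuple[str, str]]:
    """Keep only (text, label) pairs whose LLM labels reach consensus."""
    kept = []
    for text, labels in zip(feedback, annotations):
        label = consensus_label(labels, min_agreement)
        if label is not None:
            kept.append((text, label))
    return kept


if __name__ == "__main__":
    # Hypothetical feedback items and hypothetical labels from three LLMs.
    feedback = ["App crashes on login", "Please add dark mode", "ok I guess"]
    annotations = [
        ["bug report", "bug report", "bug report"],
        ["feature request", "feature request", "other"],
        ["other", "bug report", "feature request"],
    ]
    for text, label in build_augmentation_set(feedback, annotations, min_agreement=3):
        print(f"{label}: {text}")
```

Requiring unanimous agreement (`min_agreement=3` with three annotators) trades dataset size for label quality; lowering the threshold to a simple majority would keep more items at the cost of noisier labels.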
Related papers
- From Selection to Generation: A Survey of LLM-based Active Learning [153.8110509961261]
Large Language Models (LLMs) have been employed for generating entirely new data instances and providing more cost-effective annotations. This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques.
arXiv Detail & Related papers (2025-02-17T12:58:17Z)
- Learning to Verify Summary Facts with Fine-Grained LLM Feedback [15.007479147796403]
Training automatic summary fact verifiers often faces the challenge of a lack of human-labeled data. We introduce FineSumFact, a large-scale dataset containing fine-grained factual feedback on summaries.
arXiv Detail & Related papers (2024-12-14T05:28:44Z)
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z)
- Learning to Predict Usage Options of Product Reviews with LLM-Generated Labels [14.006486214852444]
We propose a method of using LLMs as few-shot learners for annotating data in a complex natural language task.
Learning a custom model offers individual control over energy efficiency and privacy measures.
We find that the quality of the resulting data exceeds the level attained by third-party vendor services.
arXiv Detail & Related papers (2024-10-16T11:34:33Z)
- Lifelong Personalized Low-Rank Adaptation of Large Language Models for Recommendation [50.837277466987345]
We focus on the field of large language models (LLMs) for recommendation.
We propose RecLoRA, which incorporates a Personalized LoRA module that maintains independent LoRAs for different users.
We also design a Few2Many Learning Strategy, using a conventional recommendation model as a lens to magnify small training spaces to full spaces.
arXiv Detail & Related papers (2024-08-07T04:20:28Z)
- Large Language Models for Data Annotation and Synthesis: A Survey [49.8318827245266]
This survey focuses on the utility of Large Language Models for data annotation and synthesis. It includes an in-depth taxonomy of data types that LLMs can annotate, a review of learning strategies for models utilizing LLM-generated annotations, and a detailed discussion of the primary challenges and limitations associated with using LLMs for data annotation and synthesis.
arXiv Detail & Related papers (2024-02-21T00:44:04Z)
- Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges [3.130575840003799]
GPT-4 can be used to augment existing reading comprehension datasets.
This work serves as the first analysis of LLMs as synthetic data augmenters for QA systems.
arXiv Detail & Related papers (2023-09-21T18:48:02Z)
- Large Language Models as Data Preprocessors [9.99065004972981]
Large Language Models (LLMs) have marked a significant advancement in artificial intelligence.
This study explores their potential in data preprocessing, a critical stage in data mining and analytics applications.
We propose an LLM-based framework for data preprocessing, which integrates cutting-edge prompt engineering techniques.
arXiv Detail & Related papers (2023-08-30T23:28:43Z)
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
- DataPerf: Benchmarks for Data-Centric AI Development [81.03754002516862]
DataPerf is a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We provide an open, online platform with multiple rounds of challenges to support this iterative development.
The benchmarks, online evaluation platform, and baseline implementations are open source.
arXiv Detail & Related papers (2022-07-20T17:47:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.