From Voice to Value: Leveraging AI to Enhance Spoken Online Reviews on the Go
- URL: http://arxiv.org/abs/2412.05445v2
- Date: Tue, 10 Dec 2024 19:31:29 GMT
- Title: From Voice to Value: Leveraging AI to Enhance Spoken Online Reviews on the Go
- Authors: Kavindu Ravishan, Dániel Szabó, Niels van Berkel, Aku Visuri, Chi-Lan Yang, Koji Yatani, Simo Hosio
- Abstract summary: We developed Vocalizer, a mobile application that enables users to provide reviews through voice input.
Our findings show that users frequently utilized the AI agent to add more detailed information to their reviews.
We also show how interactive AI features can improve users' self-efficacy and willingness to share reviews online.
- Score: 21.811104609265158
- Abstract: Online reviews help people make better decisions. Review platforms usually depend on typed input, where leaving a good review requires significant effort because users must carefully organize and articulate their thoughts. This may discourage users from leaving comprehensive and high-quality reviews, especially when they are on the go. To address this challenge, we developed Vocalizer, a mobile application that enables users to provide reviews through voice input, with enhancements from a large language model (LLM). In a longitudinal study, we analysed user interactions with the app, focusing on AI-driven features that help refine and improve reviews. Our findings show that users frequently utilized the AI agent to add more detailed information to their reviews. We also show how interactive AI features can improve users' self-efficacy and willingness to share reviews online. Finally, we discuss the opportunities and challenges of integrating AI assistance into review-writing systems.
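The abstract does not include implementation details, so the sketch below only shows one plausible shape for the described workflow: a spoken review is transcribed elsewhere, then an LLM refines it. It assumes the official openai Python SDK and an OpenAI-compatible chat endpoint; the model name, prompt, and refine_review helper are illustrative assumptions, not Vocalizer's actual code.

```python
# Hypothetical sketch of a Vocalizer-style pipeline. Assumes the openai
# SDK (pip install openai) and an OPENAI_API_KEY in the environment; the
# prompt and model choice are illustrative, not from the paper.
from openai import OpenAI

client = OpenAI()

def refine_review(transcript: str) -> str:
    """Turn a raw voice transcript into a polished review draft."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-capable model would do
        messages=[
            {"role": "system",
             "content": "You polish spoken reviews. Keep the speaker's "
                        "opinions and facts; fix grammar and structure."},
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content

draft = refine_review("uh the pasta was great but we waited like forty minutes")
print(draft)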
Related papers
- Enhancing AI Assisted Writing with One-Shot Implicit Negative Feedback [6.175028561101999]
Nifty is an approach that uses classifier guidance to controllably integrate implicit user feedback into the text generation process.
We find up to 34% improvement in Rouge-L, 89% improvement in generating the correct intent, and an 86% win-rate according to human evaluators.
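For context on the 34% figure above: Rouge-L is a longest-common-subsequence overlap metric, and can be computed with Google's rouge-score package. This snippet only illustrates the metric itself, not Nifty's internals, and the example strings are invented.

```python
# Computing Rouge-L, the metric cited above, with the rouge-score package
# (pip install rouge-score). Example strings are made up.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
reference = "the battery lasts two days on a single charge"
candidate = "battery life lasts about two days per charge"
scores = scorer.score(reference, candidate)
print(scores["rougeL"].fmeasure)  # longest-common-subsequence F1
```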
arXiv Detail & Related papers (2024-10-14T18:50:28Z)
- Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversation has been little studied.
We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z)
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
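A minimal way to inspect the dataset, assuming it is published on the Hugging Face Hub as openbmb/UltraFeedback and that the datasets library is installed; printing the schema first avoids guessing at field names.

```python
# Loading UltraFeedback for inspection. The Hub id "openbmb/UltraFeedback"
# and the "train" split are assumptions; verify them on the Hub first.
from datasets import load_dataset

ds = load_dataset("openbmb/UltraFeedback", split="train")
print(ds)            # dataset size and column names
print(ds[0].keys())  # inspect one record's schema before using it
```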
arXiv Detail & Related papers (2023-10-02T17:40:01Z)
- Continually Improving Extractive QA via Human Feedback [59.49549491725224]
We study continually improving an extractive question answering (QA) system via human user feedback.
We conduct experiments involving thousands of user interactions under diverse setups to broaden the understanding of learning from feedback over time.
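The authors' training setup is not reproduced here; the sketch below only illustrates the general shape of such a feedback loop, with a toy log and bandit-style reweighting standing in for their method. All names are hypothetical.

```python
# Illustrative shape of a learn-from-feedback loop: deploy a QA model,
# log user accept/reject signals on extracted answers, and fold the log
# back into training later. These names are hypothetical placeholders,
# not the authors' implementation.
from dataclasses import dataclass

@dataclass
class FeedbackRecord:
    question: str
    predicted_span: str
    reward: int  # +1 if the user accepted the extracted answer, -1 if not

feedback_log: list[FeedbackRecord] = []

def record_feedback(question: str, span: str, accepted: bool) -> None:
    feedback_log.append(FeedbackRecord(question, span, 1 if accepted else -1))

def make_training_batch() -> list[tuple[str, str, float]]:
    # Bandit-style reweighting: accepted spans become positively weighted
    # examples for the next fine-tuning round, rejected spans negative ones.
    return [(r.question, r.predicted_span, float(r.reward)) for r in feedback_log]

record_feedback("Who wrote Hamlet?", "William Shakespeare", accepted=True)
record_feedback("Who wrote Hamlet?", "Christopher Marlowe", accepted=False)
print(make_training_batch())
```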
arXiv Detail & Related papers (2023-05-21T14:35:32Z)
- Collaboration with Conversational AI Assistants for UX Evaluation: Questions and How to Ask them (Voice vs. Text) [18.884080068561843]
We conducted a Wizard-of-Oz design probe study with 20 participants who interacted with simulated AI assistants via text or voice.
We found that participants asked for five categories of information: user actions, user mental model, help from the AI assistant, product and task information, and user demographics.
The text assistant was perceived as significantly more efficient, but both were rated equally in satisfaction and trust.
arXiv Detail & Related papers (2023-03-07T03:59:14Z)
- What Do End-Users Really Want? Investigation of Human-Centered XAI for Mobile Health Apps [69.53730499849023]
We present a user-centered persona concept to evaluate explainable AI (XAI).
Results show that users' demographics and personality, as well as the type of explanation, impact explanation preferences.
Our insights bring an interactive, human-centered XAI closer to practical application.
arXiv Detail & Related papers (2022-10-07T12:51:27Z)
- Suggestion Lists vs. Continuous Generation: Interaction Design for Writing with Generative Models on Mobile Devices Affect Text Length, Wording and Perceived Authorship [27.853155569154705]
We present two user interfaces for writing with AI on mobile devices, which manipulate levels of initiative and control.
With AI suggestions, people wrote less actively, yet felt they were the author.
In both designs, AI increased text length and was perceived to influence wording.
arXiv Detail & Related papers (2022-08-01T13:57:11Z)
- TOUR: Dynamic Topic and Sentiment Analysis of User Reviews for Assisting App Release [34.529117157417176]
TOUR is able to (i) detect and summarize emerging app issues over app versions, (ii) identify user sentiment towards app features, and (iii) prioritize important user reviews for facilitating developers' examination.
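As a rough stand-in for the same idea (topics plus sentiment over app reviews), the sketch below wires together scikit-learn's LDA and NLTK's VADER; these are generic substitutes chosen for illustration, not TOUR's dynamic online model.

```python
# Toy topic-plus-sentiment analysis of app reviews, standing in for TOUR's
# idea with off-the-shelf parts. Requires: pip install scikit-learn nltk
# and nltk.download("vader_lexicon"). Review strings are invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from nltk.sentiment import SentimentIntensityAnalyzer

reviews = [
    "App crashes every time I open the camera after the update",
    "Love the new dark mode, very easy on the eyes",
    "Camera freezes and the app crashes constantly",
    "Dark mode is great but battery drains faster now",
]

# Topic side: bag-of-words features plus a two-topic LDA (toy setting).
vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

# Sentiment side: VADER compound score per review.
sia = SentimentIntensityAnalyzer()
for review, topic_dist in zip(reviews, lda.transform(X)):
    topic = topic_dist.argmax()
    score = sia.polarity_scores(review)["compound"]
    print(f"topic={topic} sentiment={score:+.2f} {review}")
```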
arXiv Detail & Related papers (2021-03-26T08:44:55Z)
- Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing [66.57888248681303]
We propose a novel emerging issue detection approach named MERIT.
Based on the AOBST model, we infer the topics negatively reflected in user reviews for one app version.
Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT.
arXiv Detail & Related papers (2020-08-23T06:34:05Z)
- Automating App Review Response Generation [67.58267006314415]
We propose a novel approach, RRGen, that automatically generates review responses by learning knowledge relations between reviews and their responses.
Experiments on 58 apps and 309,246 review-response pairs highlight that RRGen outperforms the baselines by at least 67.4% in terms of BLEU-4.
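For reference, BLEU-4, the metric RRGen is evaluated with above, is a standard 4-gram overlap score and can be computed with NLTK; the review-response strings below are invented, not from the paper's dataset.

```python
# Computing BLEU-4 with NLTK (pip install nltk). Example strings made up.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "thanks for your feedback we will fix the crash in the next release".split()
generated = "thank you for the feedback the crash will be fixed in the next release".split()

smooth = SmoothingFunction().method1  # avoids zero scores on short texts
bleu4 = sentence_bleu([reference], generated,
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=smooth)
print(f"BLEU-4: {bleu4:.3f}")
```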
arXiv Detail & Related papers (2020-02-10T05:23:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.