Understanding Mobile App Reviews to Guide Misuse Audits
- URL: http://arxiv.org/abs/2303.10795v3
- Date: Fri, 08 Nov 2024 10:51:46 GMT
- Title: Understanding Mobile App Reviews to Guide Misuse Audits
- Authors: Vaibhav Garg, Hui Guo, Nirav Ajmeri, Saikath Bhattacharya, Munindar P. Singh
- Abstract summary: We leverage app reviews to identify exploitable apps and their functionalities that enable misuse.
Stories by abusers and victims mostly focus on past misuses, whereas stories by third parties mostly point to the potential for misuse.
In total, we confirmed 156 exploitable apps that facilitate misuse.
- Score: 17.71313286969027
- Abstract: Problem: We address the challenge in responsible computing where an exploitable mobile app is misused by one app user (an abuser) against another user or bystander (victim). We introduce the idea of a misuse audit of apps as a way of determining if they are exploitable without access to their implementation. Method: We leverage app reviews to identify exploitable apps and their functionalities that enable misuse. First, we build a computational model to identify alarming reviews (which report misuse). Second, using the model, we identify exploitable apps and their functionalities. Third, we validate them through manual inspection of reviews. Findings: Stories by abusers and victims mostly focus on past misuses, whereas stories by third parties mostly point to the potential for misuse. Surprisingly, positive reviews by abusers, which exhibit language with high dominance, also reveal misuses. In total, we confirmed 156 exploitable apps that facilitate misuse. Based on our qualitative analysis, we found that exploitable apps exhibit four types of exploitable functionalities. Implications: Our method can help identify exploitable apps and their functionalities, facilitating misuse audits of a large pool of apps.
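As a rough illustration of the method's first step, the sketch below flags candidate alarming reviews with a generic TF-IDF plus logistic-regression baseline; the labeled examples and the classifier choice are hypothetical stand-ins, not the paper's actual model or data.

```python
# Minimal sketch: flag "alarming" reviews (those reporting misuse) with a
# TF-IDF + logistic-regression baseline. The paper's actual model is not
# specified here; the example reviews and labels below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled reviews: 1 = alarming (reports misuse), 0 = benign.
reviews = [
    "My ex used this app to track my location without my consent",
    "Great app, battery friendly and easy to use",
    "Someone installed this on my phone to read my messages",
    "Love the new dark mode in the latest update",
]
labels = [1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(reviews, labels)

# Rank unseen reviews by predicted probability of being alarming; apps whose
# reviews concentrate at the top of this ranking are candidates for a misuse audit.
candidates = ["This app let him monitor my calls secretly"]
print(model.predict_proba(candidates)[:, 1])
```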
Related papers
- A StrongREJECT for Empty Jailbreaks [72.8807309802266]
StrongREJECT is a high-quality benchmark for evaluating jailbreak performance.
It scores the harmfulness of a victim model's responses to forbidden prompts.
It achieves state-of-the-art agreement with human judgments of jailbreak effectiveness.
arXiv Detail & Related papers (2024-02-15T18:58:09Z)
- Fairness Concerns in App Reviews: A Study on AI-based Mobile Apps [9.948068408730654]
This research aims to investigate fairness concerns raised in mobile app reviews.
Our research focuses on AI-based mobile app reviews as the chance of unfair behaviors and outcomes in AI-based apps may be higher than in non-AI-based apps.
arXiv Detail & Related papers (2024-01-16T03:43:33Z)
- User Strategization and Trustworthy Algorithms [81.82279667028423]
We show that user strategization can actually help platforms in the short term.
We then show that it corrupts platforms' data and ultimately hurts their ability to make counterfactual decisions.
arXiv Detail & Related papers (2023-12-29T16:09:42Z)
- User Driven Functionality Deletion for Mobile Apps [10.81190733388406]
Evolving software with an increasing number of features is harder to understand and thus harder to use.
Too much functionality can easily impact usability, maintainability, and resource consumption.
Previous work showed that the deletion of functionality is common and sometimes driven by user reviews.
arXiv Detail & Related papers (2023-05-30T19:56:54Z)
- Explainable Abuse Detection as Intent Classification and Slot Filling [66.80201541759409]
We introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone.
We show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.
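To make the decomposition concrete, here is a minimal sketch of how an abuse-detection decision can be expressed as an intent (the policy rule matched) plus slots (the spans supporting that decision). The toy regular expressions below are illustrative stand-ins for the paper's trained intent-classification and slot-filling models.

```python
# Illustrative sketch of policy-aware abuse detection as intent classification
# plus slot filling: the intent names the matched policy rule and the slots
# point to the spans that triggered it. The patterns are toy stand-ins, not
# the paper's trained models.
import re
from dataclasses import dataclass, field

@dataclass
class Decision:
    intent: str                                 # which policy rule (if any) matched
    slots: dict = field(default_factory=dict)   # rationale: spans supporting the intent

POLICY_PATTERNS = {
    "threat": re.compile(r"\b(i will|gonna) (hurt|find) (you|them)\b", re.I),
    "slur":   re.compile(r"\b(slur1|slur2)\b", re.I),  # placeholder terms
}

def classify(text: str) -> Decision:
    for intent, pattern in POLICY_PATTERNS.items():
        m = pattern.search(text)
        if m:
            return Decision(intent=intent, slots={"trigger_span": m.group(0)})
    return Decision(intent="non_abusive")

print(classify("I will find you after the game"))  # -> intent='threat', with rationale span
```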
arXiv Detail & Related papers (2022-10-06T03:33:30Z)
- Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which explains the good published results.
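One pitfall frequently cited in this line of work is temporal bias, where detectors are trained on samples newer than those they are tested on. Below is a minimal sketch of a temporally consistent split, assuming each sample carries a first-seen timestamp; the field names and data are hypothetical.

```python
# Minimal sketch of a temporally consistent train/test split: train only on
# samples first seen before the cutoff, test on the rest. Mixing time periods
# tends to inflate detection results. Field names are hypothetical.
from datetime import date

def temporal_split(samples, cutoff):
    """Split samples so training data strictly predates test data."""
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test

samples = [
    {"apk": "a.apk", "first_seen": date(2020, 3, 1), "label": "malware"},
    {"apk": "b.apk", "first_seen": date(2021, 6, 1), "label": "goodware"},
]
train, test = temporal_split(samples, cutoff=date(2021, 1, 1))
```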
arXiv Detail & Related papers (2022-05-25T08:28:08Z)
- Erasing Labor with Labor: Dark Patterns and Lockstep Behaviors on Google Play [13.658284581863839]
Google Play's policy forbids the use of incentivized installs, ratings, and reviews to manipulate the placement of apps.
We examine install-incentivizing apps through a socio-technical lens and perform a mixed-methods analysis of their reviews and permissions.
Our dataset contains 319K reviews collected daily over five months from 60 such apps that cumulatively account for over 160.5M installs.
We find evidence of fraudulent reviews on install-incentivizing apps, following which we model them as an edge stream in a dynamic bipartite graph of apps and reviewers.
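A minimal sketch of that representation: each review becomes a timestamped edge between a reviewer and an app, and reviewer sets that move in lockstep across apps within a time step hint at coordination. The data and the lockstep heuristic here are illustrative, not the paper's actual detection method.

```python
# Reviews as a timestamped edge stream in a dynamic bipartite graph of
# apps and reviewers. Edges and the grouping heuristic are illustrative.
from collections import defaultdict

edges = [  # (timestamp, reviewer_id, app_id) -- one edge per posted review
    (1, "u1", "appA"), (1, "u2", "appA"), (1, "u3", "appA"),
    (2, "u1", "appB"), (2, "u2", "appB"), (2, "u3", "appB"),
]

# Group the stream by time step; the same reviewer set hitting different
# apps in consecutive steps is a common signal of coordinated reviewing.
by_step = defaultdict(lambda: defaultdict(set))
for t, reviewer, app in edges:
    by_step[t][app].add(reviewer)

for t, apps in sorted(by_step.items()):
    for app, reviewers in apps.items():
        print(t, app, sorted(reviewers))
```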
arXiv Detail & Related papers (2022-02-09T16:54:27Z)
- An Empirical Study on User Reviews Targeting Mobile Apps' Security & Privacy [1.8033500402815792]
This study examines users' privacy and security concerns as expressed in reviews on the Google Play Store.
We analyzed 2.2M reviews from the top 539 apps of this Android market.
The results show that the number of permissions an app requests plays a dominant role in these concerns.
arXiv Detail & Related papers (2020-10-11T02:00:36Z)
- Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing [66.57888248681303]
We propose a novel emerging issue detection approach named MERIT.
Based on the AOBST model, we infer the topics negatively reflected in user reviews for one app version.
Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT.
arXiv Detail & Related papers (2020-08-23T06:34:05Z)
- Automating App Review Response Generation [67.58267006314415]
We propose RRGen, a novel approach that automatically generates review responses by learning knowledge relations between reviews and their responses.
Experiments on 58 apps and 309,246 review-response pairs highlight that RRGen outperforms the baselines by at least 67.4% in terms of BLEU-4.
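For reference, BLEU-4 measures n-gram overlap (up to 4-grams) between a generated response and a reference response. Below is a minimal scoring sketch with NLTK; the review-response pair is made up for illustration.

```python
# BLEU-4: n-gram overlap (up to 4-grams) between a generated response and a
# developer-written reference. The sentence pair here is hypothetical.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "thanks for the feedback we will fix the crash in the next release".split()
candidate = "thank you for the feedback the crash will be fixed soon".split()

# Smoothing avoids zero scores on short sentences lacking 4-gram overlap.
score = sentence_bleu([reference], candidate,
                      weights=(0.25, 0.25, 0.25, 0.25),
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU-4: {score:.3f}")
```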
arXiv Detail & Related papers (2020-02-10T05:23:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.