Revisiting Android App Categorization
- URL: http://arxiv.org/abs/2310.07290v1
- Date: Wed, 11 Oct 2023 08:25:34 GMT
- Title: Revisiting Android App Categorization
- Authors: Marco Alecci, Jordan Samhi, Tegawendé F. Bissyandé, Jacques Klein
- Abstract summary: We present a comprehensive evaluation of existing Android app categorization approaches using our new ground-truth dataset.
We propose two innovative approaches that outperform existing methods in both description-based and APK-based methodologies.
- Score: 5.805764439228492
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous tools rely on automatic categorization of Android apps as part of
their methodology. However, incorrect categorization can lead to inaccurate
outcomes, such as a malware detector wrongly flagging a benign app as
malicious. One such example is the SlideIT Free Keyboard app, which has over
500,000 downloads on Google Play. Despite being a "Keyboard" app, it is often
wrongly categorized alongside "Language" apps because its description focuses
heavily on language support, leading to incorrect analysis outcomes, including
mislabeling it as potential malware when it is actually benign. Hence, there
is a need to improve the categorization of Android apps to
benefit all the tools relying on it. In this paper, we present a comprehensive
evaluation of existing Android app categorization approaches using our new
ground-truth dataset. Our evaluation demonstrates the notable superiority of
approaches that utilize app descriptions over those solely relying on data
extracted from the APK file, while also leaving space for potential improvement
in the former category. Thus, we propose two innovative approaches that
outperform existing methods in both description-based and APK-based
methodologies. Finally, by employing our novel
description-based approach, we demonstrate that adopting a higher-performing
categorization method can significantly improve the overall performance of
tools reliant on app categorization.
This highlights the significance of developing advanced and efficient app
categorization methodologies for improved results in software engineering
tasks.
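To make the description-based idea concrete, below is a minimal illustrative sketch, not the paper's actual method or dataset: it categorizes an app by comparing its description, as a bag-of-words vector, against hypothetical per-category exemplar texts via cosine similarity. All category names and texts here are invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and keep alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    common = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def categorize(description, category_examples):
    """Assign a description to the category whose exemplar text is most similar."""
    vec = Counter(tokenize(description))
    scores = {cat: cosine(vec, Counter(tokenize(text)))
              for cat, text in category_examples.items()}
    return max(scores, key=scores.get)

# Hypothetical exemplar texts per category (not from the paper's ground truth).
categories = {
    "Keyboard": "keyboard typing swipe input layout keys autocorrect typing",
    "Language": "language learning vocabulary grammar translation lessons words",
}

desc = "A fast swipe keyboard with smart autocorrect and many keyboard layouts."
print(categorize(desc, categories))  # -> Keyboard
```

A description that dwells on language support would instead pull the score toward "Language", which is exactly the SlideIT-style miscategorization the abstract describes; real approaches mitigate this with richer representations than raw term counts.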
Related papers
- AppPoet: Large Language Model based Android malware detection via multi-view prompt engineering [1.3197408989895103]
AppPoet is a multi-view system for Android malware detection.
Our method achieves a detection accuracy of 97.15% and an F1 score of 97.21%.
arXiv Detail & Related papers (2024-04-29T15:52:45Z)
- Learn to Categorize or Categorize to Learn? Self-Coding for Generalized Category Discovery [49.1865089933055]
We propose a novel, efficient and self-supervised method capable of discovering previously unknown categories at test time.
A salient feature of our approach is the assignment of minimum length category codes to individual data instances.
Experimental evaluations, bolstered by state-of-the-art benchmark comparisons, testify to the efficacy of our solution.
arXiv Detail & Related papers (2023-10-30T17:45:32Z)
- Continuous Learning for Android Malware Detection [15.818435778629635]
We propose a new hierarchical contrastive learning scheme, and a new sample selection technique to continuously train the Android malware classifier.
Our approach reduces the false negative rate from 14% (for the best baseline) to 9%, while also reducing the false positive rate (from 0.86% to 0.48%).
arXiv Detail & Related papers (2023-02-08T20:54:11Z)
- Evaluating the Predictive Performance of Positive-Unlabelled Classifiers: a brief critical review and practical recommendations for improvement [77.34726150561087]
Positive-Unlabelled (PU) learning is a growing area of machine learning.
This paper critically reviews the main PU learning evaluation approaches and the choice of predictive accuracy measures in 51 articles proposing PU classifiers.
arXiv Detail & Related papers (2022-06-06T08:31:49Z)
- Towards a Fair Comparison and Realistic Design and Evaluation Framework of Android Malware Detectors [63.75363908696257]
We analyze 10 influential research works on Android malware detection using a common evaluation framework.
We identify five factors that, if not taken into account when creating datasets and designing detectors, significantly affect the trained ML models.
We conclude that the studied ML-based detectors have been evaluated optimistically, which justifies the good published results.
arXiv Detail & Related papers (2022-05-25T08:28:08Z)
- Evaluating categorical encoding methods on a real credit card fraud detection database [0.0]
We describe several well-known categorical encoding methods that are based on target statistics and weight of evidence.
We train the encoded databases using state-of-the-art gradient boosting methods and evaluate their performances.
The contribution of this work is twofold: (1) we compare many state-of-the-art "lite" categorical encoding methods on a large scale database and (2) we use a real credit card fraud detection database.
arXiv Detail & Related papers (2021-12-22T16:48:46Z)
- Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews [2.66512000865131]
We study the accuracy and time efficiency of pre-trained neural language models (PTMs) for app review classification.
We set up different studies to evaluate PTMs in multiple settings.
In all cases, micro- and macro-averaged Precision, Recall, and F1-scores are used.
arXiv Detail & Related papers (2021-04-12T23:23:45Z)
- Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details [107.2722027807328]
We find that the default implementation of AP is neither category independent, nor does it directly reward properly calibrated detectors.
We show that the default implementation produces a gameable metric, where a simple, nonsensical re-ranking policy can improve AP by a large margin.
We benchmark recent advances in large-vocabulary detection and find that many reported gains do not translate to improvements under our new per-class independent evaluation.
arXiv Detail & Related papers (2021-02-01T18:56:02Z)
- Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection [60.88952532574564]
This paper conducts a thorough comparison of out-of-domain intent detection methods.
We evaluate multiple contextual encoders and methods, proven to be efficient, on three standard datasets for intent classification.
Our main findings show that fine-tuning Transformer-based encoders on in-domain data leads to superior results.
arXiv Detail & Related papers (2021-01-11T09:10:58Z)
- Emerging App Issue Identification via Online Joint Sentiment-Topic Tracing [66.57888248681303]
We propose a novel emerging issue detection approach named MERIT.
Based on the AOBST model, we infer the topics negatively reflected in user reviews for one app version.
Experiments on popular apps from Google Play and Apple's App Store demonstrate the effectiveness of MERIT.
arXiv Detail & Related papers (2020-08-23T06:34:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.