Interpretable by Design: MH-AutoML for Transparent and Efficient Android Malware Detection without Compromising Performance
- URL: http://arxiv.org/abs/2506.23314v1
- Date: Sun, 29 Jun 2025 16:12:41 GMT
- Title: Interpretable by Design: MH-AutoML for Transparent and Efficient Android Malware Detection without Compromising Performance
- Authors: Joner Assolin, Gabriel Canto, Diego Kreutz, Eduardo Feitosa, Hendrio Bragança, Angelo Nogueira, Vanderson Rocha,
- Abstract summary: Malware detection in Android systems requires both cybersecurity expertise and machine learning (ML) techniques.<n>We present MH-AutoML, a domain-specific framework for Android malware detection.<n>Results show MH-AutoML achieves better recall rates while providing more transparency and control.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Malware detection in Android systems requires both cybersecurity expertise and machine learning (ML) techniques. Automated Machine Learning (AutoML) has emerged as an approach to simplify ML development by reducing the need for specialized knowledge. However, current AutoML solutions typically operate as black-box systems with limited transparency, interpretability, and experiment traceability. To address these limitations, we present MH-AutoML, a domain-specific framework for Android malware detection. MH-AutoML automates the entire ML pipeline, including data preprocessing, feature engineering, algorithm selection, and hyperparameter tuning. The framework incorporates capabilities for interpretability, debugging, and experiment tracking that are often missing in general-purpose solutions. In this study, we compare MH-AutoML against seven established AutoML frameworks: Auto-Sklearn, AutoGluon, TPOT, HyperGBM, Auto-PyTorch, LightAutoML, and MLJAR. Results show that MH-AutoML achieves better recall rates while providing more transparency and control. The framework maintains computational efficiency comparable to other solutions, making it suitable for cybersecurity applications where both performance and explainability matter.
Related papers
- MLZero: A Multi-Agent System for End-to-end Machine Learning Automation [48.716299953336346]
We introduce MLZero, a novel multi-agent framework powered by Large Language Models (LLMs)<n>A cognitive perception module is first employed, transforming raw multimodal inputs into perceptual context.<n> MLZero demonstrates superior performance on MLE-Bench Lite, outperforming all competitors in both success rate and solution quality.
arXiv Detail & Related papers (2025-05-20T05:20:53Z) - AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z) - UniAutoML: A Human-Centered Framework for Unified Discriminative and Generative AutoML with Large Language Models [5.725785427377439]
We introduce UniAutoML, a human-centered AutoML framework that unifies AutoML for both discriminative and generative tasks.
The human-centered design of UniAutoML innovatively features a conversational user interface (CUI) that facilitates natural language interactions.
This design enhances transparency and user control throughout the AutoML training process, allowing users to seamlessly break down or modify the model being trained.
arXiv Detail & Related papers (2024-10-09T17:33:15Z) - AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.<n>Recent works have started exploiting large language models (LLM) to lessen such burden.<n>This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML)<n>VML constrains the parameter space to be human-interpretable natural language.<n>We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z) - Position: A Call to Action for a Human-Centered AutoML Paradigm [83.78883610871867]
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML)
We argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems.
arXiv Detail & Related papers (2024-06-05T15:05:24Z) - A Multivocal Literature Review on the Benefits and Limitations of
Automated Machine Learning Tools [9.69672653683112]
We conducted a multivocal literature review, which allowed us to identify 54 sources from the academic literature and 108 sources from the grey literature reporting on AutoML benefits and limitations.
Concerning the benefits, we highlight that AutoML tools can help streamline the core steps of ML.
We highlight several limitations that may represent obstacles to the widespread adoption of AutoML.
arXiv Detail & Related papers (2024-01-21T01:39:39Z) - Automatic Componentwise Boosting: An Interpretable AutoML System [1.1709030738577393]
We propose an AutoML system that constructs an interpretable additive model that can be fitted using a highly scalable componentwise boosting algorithm.
Our system provides tools for easy model interpretation such as visualizing partial effects and pairwise interactions.
Despite its restriction to an interpretable model space, our system is competitive in terms of predictive performance on most data sets.
arXiv Detail & Related papers (2021-09-12T18:34:33Z) - LightAutoML: AutoML Solution for a Large Financial Services Ecosystem [108.09104876115428]
We present an AutoML system called LightAutoML developed for a large European financial services company.
Our framework was piloted and deployed in numerous applications and performed at the level of the experienced data scientists.
arXiv Detail & Related papers (2021-09-03T13:52:32Z) - Man versus Machine: AutoML and Human Experts' Role in Phishing Detection [4.124446337711138]
This paper compares the performances of six well-known, state-of-the-art AutoML frameworks on ten different phishing datasets.
Our results indicate that AutoML-based models are able to outperform manually developed machine learning models in complex classification tasks.
arXiv Detail & Related papers (2021-08-27T09:26:20Z) - Naive Automated Machine Learning -- A Late Baseline for AutoML [0.0]
Automated Machine Learning (AutoML) is the problem of automatically finding the pipeline with the best generalization performance on some given dataset.
We present Naive AutoML, a very simple solution to AutoML that exploits important meta-knowledge about machine learning problems.
arXiv Detail & Related papers (2021-03-18T19:52:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.