It's not Easy: Applying Supervised Machine Learning to Detect Malicious Extensions in the Chrome Web Store
- URL: http://arxiv.org/abs/2509.21590v2
- Date: Thu, 02 Oct 2025 13:02:47 GMT
- Title: It's not Easy: Applying Supervised Machine Learning to Detect Malicious Extensions in the Chrome Web Store
- Authors: Ben Rosenzweig, Valentino Dalla Valle, Giovanni Apruzzese, Aurore Fass
- Abstract summary: The most well-known marketplace of such extensions is the Chrome Web Store (CWS). Such extensions are made available to users only after a vetting process carried out by Google itself. Here, we scrutinize the extent to which automated mechanisms reliant on supervised machine learning (ML) can be used to detect malicious extensions on the CWS.
- Score: 4.229843361218578
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Google Chrome is the most popular Web browser. Users can customize it with extensions that enhance their browsing experience. The most well-known marketplace of such extensions is the Chrome Web Store (CWS). Developers can upload their extensions on the CWS, but such extensions are made available to users only after a vetting process carried out by Google itself. Unfortunately, some malicious extensions bypass such checks, putting the security and privacy of downstream browser extension users at risk. Here, we scrutinize the extent to which automated mechanisms reliant on supervised machine learning (ML) can be used to detect malicious extensions on the CWS. To this end, we first collect 7,140 malicious extensions published in 2017--2023. We combine this dataset with 63,598 benign extensions published or updated on the CWS before 2023, and we develop three supervised-ML-based classifiers. We show that, in a "lab setting", our classifiers work well (e.g., 98% accuracy). Then, we collect a more recent set of 35,462 extensions from the CWS, published or last updated in 2023, with unknown ground truth. We were eventually able to identify 68 malicious extensions that bypassed the vetting process of the CWS. However, our classifiers also reported >1k likely malicious extensions. Based on this finding (further supported with empirical evidence), we elucidate, for the first time, a strong concept drift effect on browser extensions. We also show that commercial detectors (e.g., VirusTotal) work poorly to detect known malicious extensions. Altogether, our results highlight that detecting malicious browser extensions is a fundamentally hard problem. This requires additional work both by the research community and by Google itself -- potentially by revising their approaches. In the meantime, we informed Google of our discoveries, and we release our artifacts.
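To put the reported 98% "lab setting" accuracy in context, the dataset sizes quoted in the abstract can be turned into a trivial baseline. The sketch below is only illustrative: it assumes the 7,140 malicious and 63,598 benign extensions are evaluated together without rebalancing, which this page does not confirm, and the function names are hypothetical rather than taken from the authors' artifacts.

```python
# Counts quoted in the abstract above.
MALICIOUS = 7_140
BENIGN = 63_598


def malicious_share(n_malicious: int, n_benign: int) -> float:
    """Fraction of the combined dataset that is labeled malicious."""
    return n_malicious / (n_malicious + n_benign)


def majority_baseline_accuracy(n_malicious: int, n_benign: int) -> float:
    """Accuracy of a trivial classifier that always predicts the larger class.

    Assumption (not stated on this page): the evaluation set keeps the same
    class ratio as the combined dataset.
    """
    total = n_malicious + n_benign
    return max(n_malicious, n_benign) / total


if __name__ == "__main__":
    print(f"malicious share:   {malicious_share(MALICIOUS, BENIGN):.3f}")
    print(f"majority baseline: {majority_baseline_accuracy(MALICIOUS, BENIGN):.3f}")
```

Under that assumption, always predicting "benign" would already be right about nine times out of ten, so accuracy alone says little about how many malicious extensions a classifier actually catches.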
Related papers
- MalTool: Malicious Tool Attacks on LLM Agents [52.01975462609959]
MalTool is a coding-LLM-based framework that synthesizes tools exhibiting specified malicious behaviors. We show that MalTool is highly effective even when coding LLMs are safety-aligned.
arXiv Detail & Related papers (2026-02-12T17:27:43Z)
- Malicious GenAI Chrome Extensions: Unpacking Data Exfiltration and Malicious Behaviours [2.624097337766623]
Cybercriminals are exploiting the rapid proliferation of AI and GenAI tools in the Chrome Web Store. They are deploying malicious Chrome extensions posing as AI tools or impersonating popular GenAI models to target users. We curated a dataset of 5,551 AI-themed extensions released over a nine-month period to the Chrome Web Store.
arXiv Detail & Related papers (2025-12-10T19:33:58Z)
- Developers Insight On Manifest v3 Privacy and Security Webextensions [0.0]
Chrome is currently transitioning to a modified set of APIs called Manifest v3. This paper studies the challenges and opportunities of Manifest v3 through in-depth, structured qualitative research.
arXiv Detail & Related papers (2025-07-18T14:00:16Z)
- OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents [60.78202583483591]
We introduce OS-Harm, a new benchmark for measuring safety of computer use agents. OS-Harm is built on top of the OSWorld environment and aims to test models across three categories of harm: deliberate user misuse, prompt injection attacks, and model misbehavior. We evaluate computer use agents based on a range of frontier models and provide insights into their safety.
arXiv Detail & Related papers (2025-06-17T17:59:31Z)
- A Study on Malicious Browser Extensions in 2025 [0.3749861135832073]
This paper examines the evolving threat landscape of malicious browser extensions in 2025, focusing on Mozilla Firefox and Chrome. Our research successfully bypassed security mechanisms of Firefox and Chrome, demonstrating that malicious extensions can still be developed, published, and executed within the Mozilla Add-ons Store and Chrome Web Store.
arXiv Detail & Related papers (2025-03-06T10:24:27Z)
- Living off the Analyst: Harvesting Features from Yara Rules for Malware Detection [50.55317257140427]
A strategy used by malicious actors is to "live off the land," where benign systems are used and repurposed for the malicious actor's intent. We show that this is plausible via YARA rules, which use human-written signatures to detect specific malware families. By extracting sub-signatures from publicly available YARA rules, we assembled a set of features that can more effectively discriminate malicious samples.
arXiv Detail & Related papers (2024-11-27T17:03:00Z)
- What is in the Chrome Web Store? Investigating Security-Noteworthy Browser Extensions [1.2499537119440243]
This paper is the first attempt at providing a holistic view of the Chrome Web Store (CWS).
We leverage historical data provided by ChromeStats to study global trends in the CWS and security implications.
arXiv Detail & Related papers (2024-06-18T15:25:06Z)
- Did I Vet You Before? Assessing the Chrome Web Store Vetting Process through Browser Extension Similarity [3.7980955101286322]
We characterize the prevalence of malware and other infringing extensions in the Chrome Web Store (CWS), the largest distribution platform for this type of software.
Our study reveals significant gaps in the CWS vetting process, as 86% of infringing extensions are extremely similar to previously vetted items.
Our study also reveals that only 1% of malware extensions flagged by the CWS are detected as malicious by anti-malware engines.
arXiv Detail & Related papers (2024-06-01T09:17:01Z)
- FV8: A Forced Execution JavaScript Engine for Detecting Evasive Techniques [53.288368877654705]
FV8 is a modified V8 JavaScript engine designed to identify evasion techniques in JavaScript code.
It selectively enforces code execution on APIs that conditionally inject dynamic code.
It identifies 1,443 npm packages and 164 (82%) extensions containing at least one type of evasion.
arXiv Detail & Related papers (2024-05-21T19:54:19Z)
- Manifest V3 Unveiled: Navigating the New Era of Browser Extensions [53.288368877654705]
In 2020, Google announced a shift in extension development with Manifest Version 3 (V3), aiming to replace the previous Version 2 (V2) by January 2023.
This paper presents a comprehensive analysis of the Manifest V3 ecosystem.
arXiv Detail & Related papers (2024-04-12T08:09:26Z)
- DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z)
- Adversarial EXEmples: A Survey and Experimental Evaluation of Practical Attacks on Machine Learning for Windows Malware Detection [67.53296659361598]
Adversarial EXEmples can bypass machine learning-based detection by perturbing relatively few input bytes.
We develop a unifying framework that not only encompasses and generalizes previous attacks against machine-learning models, but also includes three novel attacks.
These attacks, named Full DOS, Extend and Shift, inject the adversarial payload by respectively manipulating the DOS header, extending it, and shifting the content of the first section.
arXiv Detail & Related papers (2020-08-17T07:16:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.