Unbundle-Rewrite-Rebundle: Runtime Detection and Rewriting of Privacy-Harming Code in JavaScript Bundles
- URL: http://arxiv.org/abs/2405.00596v2
- Date: Tue, 7 May 2024 15:38:20 GMT
- Title: Unbundle-Rewrite-Rebundle: Runtime Detection and Rewriting of Privacy-Harming Code in JavaScript Bundles
- Authors: Mir Masood Ali, Peter Snyder, Chris Kanich, Hamed Haddadi
- Abstract summary: Unbundle-Rewrite-Rebundle (URR) is a system for detecting privacy-harming portions of bundled JavaScript code.
URR rewrites that code at runtime to remove the privacy-harming behavior without breaking the surrounding code or the overall application.
- Score: 11.832746335723437
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work presents Unbundle-Rewrite-Rebundle (URR), a system for detecting privacy-harming portions of bundled JavaScript code and rewriting that code at runtime to remove the privacy-harming behavior without breaking the surrounding code or the overall application. URR is a novel solution to the problem of JavaScript bundles, where websites pre-compile multiple code units into a single file, making it impossible for content filters and ad-blockers to differentiate between desired and unwanted resources. Where traditional content filtering tools rely on URLs, URR analyzes the code at the AST level and replaces harmful AST sub-trees with alternatives that preserve both privacy and functionality. We present an open-sourced implementation of URR as a Firefox extension, and evaluate it against JavaScript bundles generated by the most popular bundling system (Webpack) deployed on the Tranco 10k. We measure URR's performance by precision (1.00), recall (0.95), and speed (0.43s per script) when detecting and rewriting three representative privacy-harming libraries often included in JavaScript bundles, and find URR to be an effective approach to a large and growing blind spot unaddressed by current privacy tools.
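The AST-level idea in the abstract (match a known harmful sub-tree, then swap in a harmless replacement) can be illustrated with a small sketch. This is not URR's implementation: URR operates on JavaScript bundles inside a browser extension, while the sketch below uses Python's standard `ast` module on Python source purely as an analogy. The names `fingerprint`, `ReplaceKnown`, `rewrite`, `TRACKER`, `SHIM`, and `BUNDLE` are hypothetical, and the shape-only fingerprint is an assumed stand-in for whatever matching URR actually performs.

```python
import ast
import hashlib


def fingerprint(node: ast.AST) -> str:
    """Hash the sequence of node *types* in a subtree.

    Identifiers and literal values are ignored, so a renamed (minified)
    copy of a function yields the same fingerprint as the original.
    """
    shape = ",".join(type(n).__name__ for n in ast.walk(node))
    return hashlib.sha256(shape.encode()).hexdigest()


class ReplaceKnown(ast.NodeTransformer):
    """Swap any function whose fingerprint is known for a harmless shim."""

    def __init__(self, known: dict[str, str]):
        self.known = known      # fingerprint -> replacement source text
        self.replaced = 0

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        shim_src = self.known.get(fingerprint(node))
        if shim_src is None:
            return self.generic_visit(node)   # unknown: keep, but recurse
        shim = ast.parse(shim_src).body[0]
        shim.name = node.name   # keep the bundled name so callers still resolve
        self.replaced += 1
        return ast.copy_location(shim, node)


def rewrite(bundle_src: str, known: dict[str, str]) -> tuple[str, int]:
    """Unbundle (parse), rewrite (transform), rebundle (unparse)."""
    transformer = ReplaceKnown(known)
    tree = ast.fix_missing_locations(transformer.visit(ast.parse(bundle_src)))
    return ast.unparse(tree), transformer.replaced


# Hypothetical reference copy of a tracking function and its shim.
TRACKER = '''
def track(user):
    payload = {"id": user, "ua": get_agent()}
    send(payload)
    return True
'''
SHIM = '''
def shim(user):
    return True
'''
# Hypothetical "bundle": the tracker with renamed identifiers, plus benign code.
BUNDLE = '''
def a(b):
    c = {"id": b, "ua": d()}
    e(c)
    return True

def f(x):
    return x + 1
'''

known = {fingerprint(ast.parse(TRACKER).body[0]): SHIM}
rewritten, replaced = rewrite(BUNDLE, known)
print(rewritten)
```

In this toy setting the renamed tracker `a` is replaced by the shim body while `f` is left untouched. A shape-only hash is deliberately simplistic; it would collide on unrelated functions that happen to share a structure, so a real system would need a more discriminating match.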
Related papers
- JPS: Jailbreak Multimodal Large Language Models with Collaborative Visual Perturbation and Textual Steering [73.962469626788]
Jailbreak attacks against multimodal large language models (MLLMs) are a significant research focus. We propose JPS, Jailbreak MLLMs with collaborative visual Perturbation and textual Steering.
arXiv Detail & Related papers (2025-08-07T07:14:01Z) - JavaSith: A Client-Side Framework for Analyzing Potentially Malicious Extensions in Browsers, VS Code, and NPM Packages [0.0]
JavaSith is a novel framework for analyzing potentially malicious extensions in web browsers, Visual Studio Code (VSCode), and Node's NPM packages. We present the design and architecture of JavaSith, including techniques for intercepting extension behavior over simulated time. We demonstrate how JavaSith can catch stealthy malicious behaviors that evade traditional detection.
arXiv Detail & Related papers (2025-05-27T14:40:25Z) - An Empirical Study of JavaScript Inclusion Security Issues in Chrome Extensions [0.10878040851638002]
The analysis of 36,324 Chrome extensions revealed 350,784 JavaScript inclusions. Although the majority of these inclusions originate from local files within the extensions, 22 instances of vulnerable remote JavaScript inclusions were identified. These remote inclusions present potential avenues for malicious actors to execute arbitrary code within the extension's execution context.
arXiv Detail & Related papers (2025-05-26T03:22:37Z) - DP-GTR: Differentially Private Prompt Protection via Group Text Rewriting [16.861151219321737]
We introduce DP-GTR, a novel three-stage framework that leverages local differential privacy (DP) and the composition theorem via group text rewriting.
Experiments on CommonSense QA and DocVQA demonstrate that DP-GTR outperforms existing approaches.
Our framework is compatible with existing rewriting techniques, serving as a plug-in to enhance privacy protection.
arXiv Detail & Related papers (2025-03-06T21:39:42Z) - PrivAgent: Agentic-based Red-teaming for LLM Privacy Leakage [78.33839735526769]
LLMs may be fooled into outputting private information under carefully crafted adversarial prompts.
PrivAgent is a novel black-box red-teaming framework for privacy leakage.
arXiv Detail & Related papers (2024-12-07T20:09:01Z) - SCORE: Syntactic Code Representations for Static Script Malware Detection [9.502104012686491]
Server-side script attacks can steal data, compromise credentials, and disrupt operations.
We propose novel feature extraction and deep learning (DL)-based approaches for static script malware detection.
Our approach achieves a true positive rate (TPR) up to 81% higher than leading signature-based antivirus solutions.
arXiv Detail & Related papers (2024-11-12T20:58:04Z) - Chain-of-Experts (CoE): Reverse Engineering Software Bills of Materials for JavaScript Application Bundles through Code Clone Search [5.474149892700497]
A Software Bill of Materials (SBoM) is a detailed inventory of all components, libraries, and modules in a software artifact.
JavaScript application bundles are consolidated, symbol-stripped, and optimized assemblies of code for deployment.
Generating an SBoM from a JavaScript application bundle through a reverse-engineering process ensures the integrity, security, and compliance of the supplier's software release.
arXiv Detail & Related papers (2024-08-29T01:32:49Z) - Blocking Tracking JavaScript at the Function Granularity [15.86649576818013]
Not.js is a fine-grained JavaScript blocking tool that operates at function-level granularity.
Not.js trains a supervised machine learning classifier on a webpage's graph representation to first detect tracking at the JavaScript function level.
Not.js then automatically generates surrogate scripts that preserve functionality while removing tracking.
arXiv Detail & Related papers (2024-05-28T17:26:57Z) - FV8: A Forced Execution JavaScript Engine for Detecting Evasive Techniques [53.288368877654705]
FV8 is a modified V8 JavaScript engine designed to identify evasion techniques in JavaScript code.
It selectively enforces code execution on APIs that conditionally inject dynamic code.
It identifies 1,443 npm packages and 164 (82%) extensions containing at least one type of evasion.
arXiv Detail & Related papers (2024-05-21T19:54:19Z) - The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG) [56.67603627046346]
Retrieval-augmented generation (RAG) is a powerful technique for augmenting language models with proprietary and private data.
In this work, we conduct empirical studies with novel attack methods, which demonstrate the vulnerability of RAG systems on leaking the private retrieval database.
arXiv Detail & Related papers (2024-02-23T18:35:15Z) - Code-Based Single-Server Private Information Retrieval: Circumventing the Sub-Query Attack [9.054540533394928]
The paper presents a modified version of the first code-based single-server computational PIR scheme proposed by Holzbaur, Hollanti, and Wachter-Zeh.
In the case of retrieving multiple files, the rate of the modified scheme is largely unaffected and on par with the original scheme.
arXiv Detail & Related papers (2024-02-05T10:37:26Z) - Zero-Shot Detection of Machine-Generated Codes [83.0342513054389]
This work proposes a training-free approach for the detection of LLM-generated code.
We find that existing training-based or zero-shot text detectors are ineffective in detecting code.
Our method exhibits robustness against revision attacks and generalizes well to Java code.
arXiv Detail & Related papers (2023-10-08T10:08:21Z) - Adapting the LodView RDF Browser for Navigation over the Multilingual Linguistic Linked Open Data Cloud [77.34726150561087]
The paper is dedicated to the use of LodView for navigation over the multilingual Linked Open Data cloud.
We define the class of Pubby-like tools that LodView belongs to, and clarify the relation of this class to the classes of dereferenciation tools, RDF browsers and LOD visualization tools.
arXiv Detail & Related papers (2022-08-28T21:47:59Z) - Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code [74.28810048824519]
Repro is an open-source library which aims at improving the usability of research code.
It provides a lightweight Python API for running software released by researchers within Docker containers.
arXiv Detail & Related papers (2022-04-29T01:54:54Z) - Contrastive Code Representation Learning [95.86686147053958]
We show that the popular reconstruction-based BERT model is sensitive to source code edits, even when the edits preserve semantics.
We propose ContraCode: a contrastive pre-training task that learns code functionality, not form.
arXiv Detail & Related papers (2020-07-09T17:59:06Z)