BLAFS: A Bloat Aware File System
- URL: http://arxiv.org/abs/2305.04641v1
- Date: Mon, 8 May 2023 11:41:30 GMT
- Title: BLAFS: A Bloat Aware File System
- Authors: Huaifeng Zhang, Mohannad Alhanahnah, Ahmed Ali-Eldin
- Abstract summary: We introduce BLAFS, a BLoat-Aware File System for containers.
BLAFS guarantees debloating safety for both cloud and edge systems.
- Score: 2.3476033905954687
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While there have been exponential improvements in hardware performance over
the years, software performance has lagged behind. The performance gap is
caused by software inefficiencies, many of which stem from software bloat.
Software bloat arises from the ever-increasing, mostly unused, features and
dependencies in software. Bloat exists in all layers of software, from the
operating system to the application, resulting in wasted computing resources.
The problem is exacerbated in both cloud and edge settings as the number of
running applications increases. To remove software bloat, multiple debloating
tools have been proposed in the literature. However, these tools do not provide
safety guarantees on the debloated software, sometimes removing files needed at
run-time. In this paper, we introduce BLAFS, a BLoat-Aware File System
for containers. BLAFS guarantees debloating safety for both cloud and edge
systems. BLAFS is implemented on top of the Overlay file-system, allowing for
file-system layer sharing across the containers. We compare BLAFS to two
state-of-the-art debloating tools (Cimplifier and Dockerslim), and two
state-of-the-art lazy-loading container snap-shotters for edge systems
(Starlight and eStargz). Our evaluation of real-world containers shows BLAFS
reduces container sizes by up to 97% of the original size, while maintaining
the safety of the containers when other debloating tools fail. We also evaluate
BLAFS's performance in edge settings: it reduces container provisioning
time by up to 90%, provides bandwidth reductions comparable to lazy-loading
snap-shotters, removes 97% of the vulnerabilities, and uses up to 97% less
space on the edge.
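BLAFS's safety property (files removed during debloating can still be fetched lazily on first access, because the full image layers remain reachable) can be illustrated with a toy model. The sketch below is purely hypothetical: the class and method names are invented and do not reflect BLAFS's actual OverlayFS-based implementation.

```python
class BloatAwareStore:
    """Toy model of safe debloating: files evicted from the local
    image remain fetchable from a backing store on first access."""

    def __init__(self, image_files):
        self.local = dict(image_files)    # debloated local layer
        self.backing = dict(image_files)  # full image, e.g. a registry
        self.accessed = set()

    def read(self, path):
        self.accessed.add(path)
        if path not in self.local:        # lazily restore an evicted file
            self.local[path] = self.backing[path]
        return self.local[path]

    def debloat(self):
        """Evict every file not observed in the access profile."""
        for path in list(self.local):
            if path not in self.accessed:
                del self.local[path]

store = BloatAwareStore({"/bin/app": b"ELF", "/usr/share/doc": b"bloat"})
store.read("/bin/app")
store.debloat()                      # only /bin/app survives locally
doc = store.read("/usr/share/doc")   # safe: restored on demand
```

The invariant worth noting is that debloating only affects local storage, never reachability: a file missed by profiling is restored from the backing store instead of crashing the container, which is the safety guarantee other debloating tools lack.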
Related papers
- The Future is Sparse: Embedding Compression for Scalable Retrieval in Recommender Systems [3.034710104407876]
We describe a lightweight, learnable embedding compression technique that projects dense embeddings into a high-dimensional, sparsely activated space.
Our results demonstrate that leveraging sparsity is a promising approach for improving the efficiency of large-scale recommenders.
arXiv Detail & Related papers (2025-05-16T15:51:52Z)
- The Hidden Bloat in Machine Learning Systems [0.22099217573031676]
Software bloat refers to code and features that are not used by software during runtime.
For Machine Learning (ML) systems, bloat is a major contributor to their technical debt, leading to decreased performance and resource wastage.
We present Negativa-ML, a novel tool to identify and remove bloat in ML frameworks by analyzing their shared libraries.
arXiv Detail & Related papers (2025-03-18T13:04:25Z)
- QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation [84.91431271257437]
Diffusion Transformers (DiTs) have emerged as a dominant architecture in video generation.
DiTs come with significant drawbacks, including increased computational and memory costs.
We propose QuantCache, a novel training-free inference acceleration framework.
arXiv Detail & Related papers (2025-03-09T10:31:51Z)
- TEEMATE: Fast and Efficient Confidential Container using Shared Enclave [17.032423912089854]
We introduce TeeMate, a new approach to utilize the enclaves on the host system.
We show that TeeMate achieves at least 4.5 times lower latency and 2.8 times lower memory usage compared to the applications built on the conventional confidential containers.
arXiv Detail & Related papers (2024-11-18T09:50:20Z)
- I Know What You Sync: Covert and Side Channel Attacks on File Systems via syncfs [5.556839719025154]
We show new types of side channels through the file system that break logical isolation.
The file system plays a critical role in the operating system, managing all I/O activities between the application layer and the physical storage device.
We construct three side-channel attacks targeting both Linux and Android devices.
arXiv Detail & Related papers (2024-11-16T20:40:08Z)
- BitStack: Any-Size Compression of Large Language Models in Variable Memory Environments [53.71158537264695]
Large language models (LLMs) have revolutionized numerous applications, yet their deployment remains challenged by memory constraints on local devices.
We introduce BitStack, a novel, training-free weight compression approach that enables megabyte-level trade-offs between memory usage and model performance.
arXiv Detail & Related papers (2024-10-31T13:26:11Z)
- Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss [59.835032408496545]
We propose a tile-based strategy that partitions the contrastive loss calculation into arbitrary small blocks.
We also introduce a multi-level tiling strategy to leverage the hierarchical structure of distributed systems.
Compared to SOTA memory-efficient solutions, it achieves a two-order-of-magnitude reduction in memory while maintaining comparable speed.
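The memory saving in such tile-based schemes comes from a streaming log-sum-exp: the full similarity matrix is never materialized, only a running maximum and a rescaled partial sum per row. A minimal pure-Python sketch of that accumulation (block size and names are illustrative, not taken from the paper):

```python
import math

def logsumexp_tiled(logits, block=4):
    """Streaming log-sum-exp: scan the logits in small tiles, keeping
    only a running maximum m and a running sum s = sum(exp(x - m))."""
    m, s = float("-inf"), 0.0
    for i in range(0, len(logits), block):
        tile = logits[i:i + block]
        new_m = max(m, max(tile))
        # rescale the partial sum to the new maximum before adding the tile
        s = s * math.exp(m - new_m) + sum(math.exp(x - new_m) for x in tile)
        m = new_m
    return m + math.log(s)

logits = [0.3, -1.2, 2.5, 0.0, 1.7, -0.4, 3.1, 0.9]
full = math.log(sum(math.exp(x) for x in logits))
assert abs(logsumexp_tiled(logits, block=3) - full) < 1e-9
```

A per-row contrastive loss is then logsumexp(row) minus the positive pair's logit, so the whole loss can be computed one small tile at a time.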
arXiv Detail & Related papers (2024-10-22T17:59:30Z)
- SafeBPF: Hardware-assisted Defense-in-depth for eBPF Kernel Extensions [1.0499611180329806]
We introduce SafeBPF, a general design that isolates eBPF programs from the rest of the kernel to prevent memory safety vulnerabilities from being exploited.
We show that SafeBPF incurs up to 4% overhead on macrobenchmarks while achieving desired security properties.
arXiv Detail & Related papers (2024-09-11T13:58:51Z)
- The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z)
- Automatic Jailbreaking of the Text-to-Image Generative AI Systems [76.9697122883554]
We study the safety of commercial T2I generation systems, such as ChatGPT, Copilot, and Gemini, with respect to copyright infringement under naive prompts.
We propose a stronger automated jailbreaking pipeline for T2I generation systems, which produces prompts that bypass their safety guards.
Our framework successfully jailbreaks ChatGPT (11.0% block rate), making it generate copyrighted content 76% of the time.
arXiv Detail & Related papers (2024-05-26T13:32:24Z)
- LUCID: A Framework for Reducing False Positives and Inconsistencies Among Container Scanning Tools [0.0]
This paper provides a fully functional framework named LUCID that can reduce false positives and inconsistencies provided by multiple scanning tools.
Our results show that our framework can reduce inconsistencies by 70%.
We also create a Dynamic Classification component that can successfully classify and predict the different severity levels with an accuracy of 84%.
arXiv Detail & Related papers (2024-05-11T16:58:28Z)
- BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation [54.28841287750586]
Large language models (LLMs) have demonstrated outstanding performance in various tasks, such as text summarization and text question-answering.
Existing solutions such as SparseGPT and Wanda attempt to alleviate this issue through weight pruning.
This paper introduces a novel LLM pruning technique dubbed blockwise parameter-efficient sparsity allocation (BESA) by applying a blockwise reconstruction loss.
arXiv Detail & Related papers (2024-02-18T12:44:15Z)
- Improving Program Debloating with 1-DU Chain Minimality [47.73151075716047]
We present RLDebloatDU, an innovative debloating technique that employs 1-DU chain minimality within abstract syntax trees.
Our approach maintains essential program data dependencies, striking a balance between aggressive code reduction and the preservation of program semantics.
arXiv Detail & Related papers (2024-02-01T02:00:32Z)
- Exploiting Kubernetes' Image Pull Implementation to Deny Node Availability [0.0]
Application Programming Interface (API) interactions between K8s and its runtime interfaces have not been studied thoroughly.
CRI-API is responsible for abstracting the container runtime, managing the creation and lifecycle of containers along with the downloads of the respective images.
We show that such attacks can generate up to 95% average CPU usage, prevent downloading new container images, and increase I/O and network usage for a potentially unlimited amount of time.
arXiv Detail & Related papers (2024-01-19T09:49:53Z)
- A Broad Comparative Evaluation of Software Debloating Tools [3.0913520619484287]
Software debloating tools seek to improve program security and performance by removing unnecessary code, called bloat.
We surveyed 10 years of debloating literature and several tools currently under commercial development to taxonomize knowledge about the debloating ecosystem.
Our evaluation, conducted on a diverse set of 20 benchmark programs, measures tools across 12 performance, security, and correctness metrics.
arXiv Detail & Related papers (2023-12-20T18:53:18Z)
- Charliecloud's layer-free, Git-based container build cache [0.0]
A container image is built by interpreting instructions in a machine-readable recipe, which is faster with a build cache that stores instruction results for re-use.
The standard approach is a many-layered union, encoding differences between layers as tar archives.
Our experiments show this performs similarly to layered caches on both build time and disk usage, with a considerable advantage for many-instruction recipes.
arXiv Detail & Related papers (2023-08-31T23:05:16Z)
- On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z)
- Online Continual Learning Without the Storage Constraint [67.66235695269839]
We contribute a simple algorithm, which updates a kNN classifier continually along with a fixed, pretrained feature extractor.
It can adapt to rapidly changing streams, has zero stability gap, operates within tiny computational budgets, and keeps storage requirements low by storing only features.
It can outperform existing methods by over 20% in accuracy on two large-scale online continual learning datasets.
arXiv Detail & Related papers (2023-05-16T08:03:07Z)
- SeiT: Storage-Efficient Vision Training with Tokens Using 1% of Pixel Storage [52.317406324182215]
We propose a storage-efficient training strategy for vision classifiers for large-scale datasets.
Our token storage only needs 1% of the original JPEG-compressed raw pixels.
Our experimental results on ImageNet-1k show that our method significantly outperforms other storage-efficient training methods with a large gap.
arXiv Detail & Related papers (2023-03-20T13:55:35Z)
- Rediscovering Hashed Random Projections for Efficient Quantization of Contextualized Sentence Embeddings [113.38884267189871]
Training and inference on edge devices often requires an efficient setup due to computational limitations.
Pre-computing data representations and caching them on a server can mitigate extensive edge device computation.
We propose a simple, yet effective approach that uses randomly hyperplane projections.
We show that the embeddings remain effective for training models across various English and German sentence classification tasks that retain 94%--99% of their floating-point performance.
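The random-hyperplane idea referenced here is classic SimHash: each hyperplane contributes one sign bit, and the fraction of agreeing bits between two codes estimates the angle between the original embeddings. A small pure-Python sketch (dimensions and counts are arbitrary choices for illustration, not the paper's configuration):

```python
import math
import random

def hash_projection(vec, hyperplanes):
    """Quantize a dense vector into one sign bit per random hyperplane."""
    return [1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
            for h in hyperplanes]

def angular_similarity(bits_a, bits_b):
    """Estimate cos(angle) from the fraction of agreeing sign bits."""
    agree = sum(a == b for a, b in zip(bits_a, bits_b)) / len(bits_a)
    return math.cos(math.pi * (1.0 - agree))

random.seed(0)
dim, n_planes = 32, 256
planes = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
u = [random.gauss(0, 1) for _ in range(dim)]
v = [u_i + 0.3 * random.gauss(0, 1) for u_i in u]  # near-duplicate of u
est = angular_similarity(hash_projection(u, planes), hash_projection(v, planes))
```

Here each embedding is cached as 256 bits instead of 32 32-bit floats (a 4x reduction), while the bit-agreement estimate stays close to the true cosine similarity.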
arXiv Detail & Related papers (2023-03-13T10:53:00Z)
- BAFFLE: A Baseline of Backpropagation-Free Federated Learning [71.09425114547055]
Federated learning (FL) is a general principle for decentralized clients to train a server model collectively without sharing local data.
We develop backpropagation-free federated learning, dubbed BAFFLE, in which backpropagation is replaced by multiple forward processes to estimate gradients.
BAFFLE is 1) memory-efficient and easily fits uploading bandwidth; 2) compatible with inference-only hardware optimization and model quantization or pruning; and 3) well-suited to trusted execution environments.
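Replacing backpropagation with multiple forward passes is typically done with a zeroth-order estimator. The sketch below uses SPSA-style central differences along random directions; it is a generic illustration of forward-only gradient estimation, not BAFFLE's exact estimator:

```python
import random

def forward_gradient(loss, w, n_samples=2000, eps=1e-3):
    """Estimate grad(loss)(w) from forward passes only: average
    central differences taken along random Gaussian directions."""
    grad = [0.0] * len(w)
    for _ in range(n_samples):
        u = [random.gauss(0, 1) for _ in w]
        w_plus = [wi + eps * ui for wi, ui in zip(w, u)]
        w_minus = [wi - eps * ui for wi, ui in zip(w, u)]
        slope = (loss(w_plus) - loss(w_minus)) / (2 * eps)
        for i, ui in enumerate(u):
            grad[i] += slope * ui / n_samples
    return grad

# Quadratic toy loss whose true gradient at w is 2 * (w - target).
target = [1.0, -2.0, 0.5]
def loss(w):
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))

random.seed(0)
g = forward_gradient(loss, [0.0, 0.0, 0.0])  # analytic gradient: [-2, 4, -1]
```

Because each estimate needs only loss evaluations, the scheme fits inference-only hardware and trusted execution environments, at the cost of many forward passes per update.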
arXiv Detail & Related papers (2023-01-28T13:34:36Z)
- Machine Learning Systems are Bloated and Vulnerable [2.7023370929727277]
We develop MMLB, a framework for analyzing bloat in software systems.
MMLB measures the amount of bloat at both the container and package levels.
We show that bloat accounts for up to 80% of machine learning container sizes.
arXiv Detail & Related papers (2022-12-16T10:34:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.