LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs
- URL: http://arxiv.org/abs/2505.24451v1
- Date: Fri, 30 May 2025 10:37:14 GMT
- Title: LPASS: Linear Probes as Stepping Stones for vulnerability detection using compressed LLMs
- Authors: Luis Ibanez-Lissen, Lorena Gonzalez-Manzano, Jose Maria de Fuentes, Nicolas Anciaux
- Abstract summary: We show how Linear Probes can be used to estimate the performance of a compressed large language model before fine-tuning. We also show their suitability to set the cut-off point when applying layer pruning compression. Our approach, dubbed LPASS, is applied to BERT and Gemma for the detection of 12 of MITRE's Top 25 most dangerous vulnerabilities on 480k C/C++ samples.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) are being extensively used for cybersecurity purposes, one of them being the detection of vulnerable code. For efficiency and effectiveness, compression and fine-tuning techniques are being developed, respectively. However, they involve substantial computational effort. In this vein, we analyse how Linear Probes (LPs) can be used to estimate the performance of a compressed LLM at an early stage, before fine-tuning. We also show their suitability for setting the cut-off point when applying layer-pruning compression. Our approach, dubbed LPASS, is applied to BERT and Gemma for the detection of 12 of MITRE's Top 25 most dangerous vulnerabilities on 480k C/C++ samples. LPs can be computed in 142.97 s and provide key findings: (1) 33.3% and 72.2% of layers can be removed, respectively, with no precision loss; (2) they provide an early estimate of post-fine-tuning and post-compression model effectiveness, with 3% and 8.68% as the lowest and average precision errors, respectively. LPASS-based LLMs outperform the state of the art, reaching 86.9% accuracy in multi-class vulnerability detection. Interestingly, LPASS-based compressed versions of Gemma outperform the original ones by up to 1.6% in F1-score while saving 29.4% and 23.8% of training and inference time, respectively, and 42.98% of model size.
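The probing step is cheap enough to sketch compactly. Below is a minimal illustration of the idea, not the authors' implementation: it assumes a BERT-style encoder from Hugging Face Transformers, fits one logistic-regression probe per frozen layer, and uses the per-layer probe scores both as an early effectiveness estimate and to choose the layer-pruning cut-off. The model name, tolerance, and helper names are illustrative assumptions.

```python
# Minimal LPASS-style sketch: linear probes over frozen per-layer hidden
# states, computed before any fine-tuning or compression.
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")  # placeholder backbone
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model.eval()

def layer_features(texts):
    """One [CLS] feature matrix per transformer layer for a batch of samples."""
    enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, output_hidden_states=True)
    # out.hidden_states holds the embedding output plus one tensor per layer.
    return [h[:, 0, :].numpy() for h in out.hidden_states[1:]]

def probe_scores(train_texts, y_train, val_texts, y_val):
    """Fit one logistic-regression probe per layer; return validation F1s."""
    scores = []
    for X_tr, X_va in zip(layer_features(train_texts), layer_features(val_texts)):
        probe = LogisticRegression(max_iter=1000).fit(X_tr, y_train)
        scores.append(f1_score(y_val, probe.predict(X_va), average="macro"))
    return scores

def cutoff_layer(scores, tolerance=0.01):
    """Earliest layer whose probe is within `tolerance` of the best one;
    layers above it are candidates for pruning."""
    best = max(scores)
    return next(i for i, s in enumerate(scores) if s >= best - tolerance)
```

The probe scores play the role described in the abstract: if they plateau at layer k, layers beyond k can be pruned before fine-tuning, and the plateau value itself serves as the early estimate of post-fine-tuning effectiveness.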
Related papers
- Efficient Malware Detection with Optimized Learning on High-Dimensional Features [1.3654846342364308]
Malware detection using machine learning requires feature extraction from binary files.
A common approach involves using LIEF for raw feature extraction and the EMBER vectorizer to generate 2381-dimensional feature vectors.
This study addresses the resulting dimensionality challenges by applying two dimensionality reduction techniques; a hedged sketch follows this entry.
arXiv Detail & Related papers (2025-06-18T06:56:59Z)
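Since the summary does not name the two dimensionality-reduction techniques, the sketch below uses variance filtering and PCA purely as illustrative stand-ins for shrinking EMBER-style 2381-dimensional feature vectors:

```python
# Hedged sketch: compressing 2381-dimensional EMBER-style malware features
# before classification. The concrete techniques are assumptions, not the
# paper's choices.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2381))  # stand-in for EMBER feature vectors

reducer = make_pipeline(
    VarianceThreshold(threshold=1e-4),  # drop near-constant features
    StandardScaler(),                   # put features on a comparable scale
    PCA(n_components=256),              # project onto a compact subspace
)
X_reduced = reducer.fit_transform(X)
print(X_reduced.shape)  # (1000, 256)
```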
- SDMPrune: Self-Distillation MLP Pruning for Efficient Large Language Models [3.962074007736394]
We introduce a self-distillation loss during the pruning phase (rather than post-training) to fully exploit the predictions of the original model; a minimal loss sketch follows this entry.
We demonstrate that our method significantly outperforms existing pruning methods.
Our method achieves very competitive performance among 1B-scale open source LLMs.
arXiv Detail & Related papers (2025-06-10T02:24:32Z)
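A self-distillation loss of the kind the summary describes can be sketched in a few lines; the temperature, weighting, and use of hard labels below are assumptions rather than SDMPrune's exact formulation:

```python
# Hedged sketch: during pruning, the pruned model (student) is supervised by
# the original dense model's (teacher's) predictions.
import torch.nn.functional as F

def self_distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend cross-entropy on labels with KL to the original model's outputs."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    return alpha * ce + (1.0 - alpha) * kl
```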
- Certified Robustness Under Bounded Levenshtein Distance [55.54271307451233]
We propose the first method for computing the Lipschitz constant of convolutional classifiers with respect to the Levenshtein distance.
Our method, LipsLev, is able to obtain 38.80% and 13.93% verified accuracy at distances 1 and 2, respectively.
arXiv Detail & Related papers (2025-01-23T13:58:53Z)
- Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection [6.269725911814401]
Large language models (LLMs) are becoming a popular tool as they have significantly advanced in their capability to tackle a wide range of language-based tasks.
However, LLM applications are highly vulnerable to prompt injection attacks, which pose a critical problem.
This project explores the security vulnerabilities associated with prompt injection attacks.
arXiv Detail & Related papers (2024-10-28T00:36:21Z)
- Semantic-guided Search for Efficient Program Repair with Large Language Models [0.9319432628663639]
FLAMES employs semantic-guided patch generation to enhance repair effectiveness and memory efficiency.
FLAMES substantially reduces memory consumption by up to 83% compared to conventional LLM-based APR.
Remarkably, FLAMES successfully generated 133 and 103 correct fixes for 333 and 163 bugs in the Defects4J and HumanEval-Java datasets, respectively.
arXiv Detail & Related papers (2024-10-22T02:59:47Z)
- Does the Vulnerability Threaten Our Projects? Automated Vulnerable API Detection for Third-Party Libraries [11.012017507408078]
We propose VAScanner, which can effectively identify vulnerable root methods causing vulnerabilities in TPLs.
VAScanner eliminates 5.78% false positives and 2.16% false negatives owing to the proposed sifting and augmentation mechanisms.
In a large-scale analysis of 3,147 projects using vulnerable TPLs, we find only 21.51% of projects were threatened by vulnerable APIs.
arXiv Detail & Related papers (2024-09-04T14:31:16Z)
- Exploring RAG-based Vulnerability Augmentation with LLMs [19.45598962972431]
VulScribeR is a novel solution that leverages carefully curated prompt templates to augment vulnerable datasets.
Our approach beats two SOTA methods, Vulgen and VGX, and Random Oversampling (ROS) by 27.48%, 27.93%, and 15.41% in F1-score with 5K generated vulnerable samples on average.
arXiv Detail & Related papers (2024-08-07T23:22:58Z)
- Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient [57.9629676017527]
We propose an optimization-based structural pruning method for Large Language Models; a minimal policy-gradient sketch follows this entry.
We learn the pruning masks in a probabilistic space directly by optimizing the loss of the pruned model.
Our method operates for 2.7 hours with around 35GB memory for the 13B models on a single A100 GPU.
arXiv Detail & Related papers (2024-06-15T09:31:03Z)
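The "bypass back-propagation" idea can be illustrated with a toy REINFORCE loop: pruning masks are sampled from Bernoulli distributions, the masked model is only ever run forward, and the mask probabilities are updated from the observed loss. The evaluation function below is a stand-in for a real forward pass:

```python
# Hedged sketch: learning structural pruning masks in a probabilistic space
# with policy gradients, so the LLM itself needs no backward pass.
import torch

NUM_UNITS = 64                                    # e.g. prunable heads/channels
IMPORTANCE = torch.linspace(0.0, 1.0, NUM_UNITS)  # toy per-unit utility

def eval_pruned_loss(mask):
    # Placeholder for a forward pass of the masked model on calibration data:
    # dropping important units hurts, keeping units carries a small cost.
    return ((1 - mask) * IMPORTANCE).mean() + 0.05 * mask.mean()

logits = torch.zeros(NUM_UNITS, requires_grad=True)  # mask keep-probabilities
opt = torch.optim.Adam([logits], lr=0.05)
baseline = None

for _ in range(200):
    dist = torch.distributions.Bernoulli(torch.sigmoid(logits))
    mask = dist.sample()                       # one keep/prune draw per unit
    with torch.no_grad():
        loss = eval_pruned_loss(mask)          # inference only, no model grads
    baseline = loss if baseline is None else 0.9 * baseline + 0.1 * loss
    advantage = loss - baseline
    # REINFORCE: make masks with below-baseline loss more likely.
    policy_loss = advantage * dist.log_prob(mask).sum()
    opt.zero_grad()
    policy_loss.backward()
    opt.step()

keep = torch.sigmoid(logits) > 0.5             # final structural mask
```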
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs [53.31402059062365]
BiLLM is a groundbreaking 1-bit post-training quantization scheme tailored for pretrained large language models; the core binarization step is sketched after this entry.
It achieves for the first time high-accuracy inference (e.g. 8.41 perplexity on LLaMA2-70B) with only 1.08-bit weights across various LLM families.
arXiv Detail & Related papers (2024-02-06T09:26:34Z)
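The core operation behind any 1-bit scheme is binarizing a weight matrix against a per-row scale; the sketch below shows only this building block, whereas BiLLM's full method additionally handles salient weights with residual binarization and splits the non-salient distribution:

```python
# Hedged sketch of the basic 1-bit building block: W ~= alpha * sign(W).
# For B = sign(W), alpha = mean(|W|) per row minimizes the L2 error.
import torch

def binarize_rowwise(W):
    alpha = W.abs().mean(dim=1, keepdim=True)  # one scale per output row
    B = torch.sign(W)
    B[B == 0] = 1.0                            # sign(0) -> +1 to stay binary
    return alpha, B

W = torch.randn(8, 16)
alpha, B = binarize_rowwise(W)
W_hat = alpha * B                              # dequantized approximation
print((W - W_hat).pow(2).mean())               # reconstruction error
```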
- From PEFT to DEFT: Parameter Efficient Finetuning for Reducing Activation Density in Transformers [52.199303258423306]
We propose a novel density loss that encourages higher activation sparsity in pre-trained models; a toy version of such a penalty is sketched after this entry.
Our proposed method, DEFT, can consistently reduce activation density by up to 44.94% on RoBERTa-Large and by 53.19% (encoder density) and 90.60% (decoder density) on Flan-T5-XXL.
arXiv Detail & Related papers (2024-02-02T21:25:46Z)
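What a density loss can look like is easy to show in miniature; the smooth proxy below is an assumption for illustration, not DEFT's actual objective:

```python
# Hedged sketch: penalize the fraction of active units in intermediate
# activations so fine-tuning drives the model toward sparser activations.
import torch

def density_penalty(activations, eps=1e-2):
    """Differentiable proxy for the fraction of non-(near-)zero activations."""
    return torch.tanh(activations.abs() / eps).mean()

def total_loss(task_loss, hidden_states, lam=0.1):
    """Task loss plus the summed density penalty of captured activations."""
    return task_loss + lam * sum(density_penalty(h) for h in hidden_states)
```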
- Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning [79.43940012723539]
ADCLR is a self-supervised learning framework for learning accurate and dense vision representations.
Our approach achieves new state-of-the-art performance among contrastive methods.
arXiv Detail & Related papers (2023-06-23T07:38:09Z)
- Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers [44.344548601242444]
We introduce a novel framework, Weakly-supervised RESidual Transformer (WeakREST), to achieve high anomaly detection accuracy.
We reformulate the pixel-wise anomaly localization task into a block-wise classification problem.
We develop a novel ResMixMatch algorithm, capable of handling the interplay between weak labels and residual-based representations.
arXiv Detail & Related papers (2023-06-06T08:19:30Z)
- SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression [76.73007709690306]
We introduce the Sparse-Quantized Representation (SpQR), a new compressed format and quantization technique.
SpQR achieves relative accuracy losses of less than 1% in perplexity for highly-accurate LLaMA and Falcon LLMs.
This makes it possible to run a 33B-parameter LLM on a single 24 GB consumer GPU without performance degradation, at a 15% speedup; the sparse-plus-quantized idea is sketched after this entry.
arXiv Detail & Related papers (2023-06-05T17:53:28Z)
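The sparse-plus-quantized layout can be sketched as follows; identifying outliers by plain magnitude is a simplification (SpQR selects them by quantization error), and the group size and bit-width are illustrative:

```python
# Hedged sketch: keep a small fraction of outlier weights in full precision
# (a sparse matrix) and store the rest as low-bit codes with per-group scales.
import torch

def sparse_quantize(W, bits=3, group=16, outlier_frac=0.01):
    qmax = 2 ** bits - 1
    k = int(W.numel() * (1 - outlier_frac))
    thresh = W.abs().flatten().kthvalue(k).values
    outliers = torch.where(W.abs() > thresh, W, torch.zeros_like(W))
    base = (W - outliers).reshape(-1, group)   # assumes numel divisible by group
    scale = base.abs().max(dim=1, keepdim=True).values.clamp(min=1e-8) / (qmax / 2)
    q = torch.clamp(torch.round(base / scale) + qmax // 2, 0, qmax)  # int codes
    dequant = ((q - qmax // 2) * scale).reshape(W.shape)
    return dequant + outliers                  # low-bit part + sparse fp part

W = torch.randn(64, 64)
print((W - sparse_quantize(W)).pow(2).mean())  # reconstruction error
```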
- LoRAPrune: Structured Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning [56.88751562302793]
Low-rank adaptation (LoRA) has emerged as a popular approach for fine-tuning large language models (LLMs).
LoRAPrune is a new framework that delivers an accurate structured pruned model in a highly memory-efficient manner.
LoRAPrune reduces perplexity by 4.81 on WikiText2 and 3.46 on PTB, while also decreasing memory usage by 52.6%.
arXiv Detail & Related papers (2023-05-28T15:15:48Z)
- Gradient-Free Structured Pruning with Unlabeled Data [57.999191898036706]
We propose a gradient-free structured pruning framework that uses only unlabeled data; an illustrative forward-pass-only criterion is sketched after this entry.
Up to 40% of the original FLOP count can be reduced with less than a 4% accuracy loss across all tasks considered.
arXiv Detail & Related papers (2023-03-07T19:12:31Z)
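A forward-pass-only pruning criterion over unlabeled data might look like the toy example below; the mean-absolute-activation score is an illustrative stand-in for the paper's criterion:

```python
# Hedged sketch: rank FFN neurons by a statistic computed from forward passes
# on unlabeled calibration data, then drop the lowest-ranked ones.
import torch

torch.manual_seed(0)
d_model, d_ff, keep_ratio = 32, 128, 0.6
W_in = torch.randn(d_ff, d_model)        # toy FFN weights
W_out = torch.randn(d_model, d_ff)
X = torch.randn(512, d_model)            # unlabeled calibration inputs

with torch.no_grad():                     # no labels and no gradients needed
    H = torch.relu(X @ W_in.T)            # per-neuron activations
    saliency = H.abs().mean(dim=0)        # one score per FFN neuron

keep = saliency.topk(int(d_ff * keep_ratio)).indices
W_in_pruned, W_out_pruned = W_in[keep], W_out[:, keep]
print(W_in_pruned.shape, W_out_pruned.shape)  # structured 40% neuron pruning
```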