GPTrace: Effective Crash Deduplication Using LLM Embeddings
- URL: http://arxiv.org/abs/2512.01609v1
- Date: Mon, 01 Dec 2025 12:30:30 GMT
- Title: GPTrace: Effective Crash Deduplication Using LLM Embeddings
- Authors: Patrick Herter, Vincent Ahlrichs, Ridvan Açilan, Julian Horsch,
- Abstract summary: Crash deduplication is the task of finding duplicate crashing inputs and thereby reducing the data that needs to be examined. We present GPTrace, a deduplication workflow that leverages a large language model to evaluate the similarity of various data sources associated with crashes. We evaluate our approach on over 300,000 crashing inputs belonging to 50 ground-truth labels from 14 different targets.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fuzzing is a highly effective method for uncovering software vulnerabilities, but analyzing the resulting data typically requires substantial manual effort. This is amplified by the fact that fuzzing campaigns often find a large number of crashing inputs, many of which share the same underlying bug. Crash deduplication is the task of finding such duplicate crashing inputs and thereby reducing the data that needs to be examined. Many existing deduplication approaches rely on comparing stack traces or other information that is collected when a program crashes. Although various metrics for measuring the similarity of such pieces of information have been proposed, many do not yield satisfactory deduplication results. In this work, we present GPTrace, a deduplication workflow that leverages a large language model to evaluate the similarity of various data sources associated with crashes by computing embedding vectors and supplying those as input to a clustering algorithm. We evaluate our approach on over 300,000 crashing inputs belonging to 50 ground-truth labels from 14 different targets. The deduplication results produced by GPTrace show a noticeable improvement over hand-crafted stack trace comparison methods and even more complex state-of-the-art approaches that are less flexible.
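The embed-then-cluster workflow described in the abstract can be sketched as follows. This is a minimal toy, not GPTrace itself: the trigram-hashing `embed` function stands in for the LLM embedding model, and the greedy single-link `cluster` function stands in for whatever clustering algorithm the paper actually uses.

```python
import zlib
import numpy as np

def embed(trace, dim=256):
    """Toy stand-in for the LLM embedding step: hash character trigrams
    of a stack trace into a fixed-size, L2-normalized vector."""
    v = np.zeros(dim)
    for i in range(len(trace) - 2):
        v[zlib.crc32(trace[i:i + 3].encode()) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def cluster(vectors, threshold=0.8):
    """Greedy single-link clustering on cosine similarity: each crash
    joins the first earlier crash it is similar enough to."""
    labels = [-1] * len(vectors)
    fresh = 0
    for i, v in enumerate(vectors):
        for j in range(i):
            if float(np.dot(v, vectors[j])) >= threshold:
                labels[i] = labels[j]
                break
        else:
            labels[i] = fresh
            fresh += 1
    return labels

traces = [
    "main -> parse_input -> strcpy",
    "main -> parse_input -> strcpy ",  # near-duplicate of the first crash
    "main -> render -> divide",        # a different bug
]
labels = cluster([embed(t) for t in traces])
```

The first two traces land in one cluster while the third gets its own label, which is the deduplication effect the paper targets at much larger scale.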
Related papers
- Stack Trace-Based Crash Deduplication with Transformer Adaptation [2.846561253333858]
Automated crash reporting systems generate large volumes of duplicate reports. Traditional stack-trace-based deduplication methods fail to capture contextual and structural relationships within stack traces. We propose dedupT, a transformer-based approach that models stack traces holistically rather than as isolated frames.
arXiv Detail & Related papers (2025-08-26T21:51:10Z) - On the Feasibility of Deduplicating Compiler Bugs with Bisection [1.286741686995463]
Identifying compiler bugs that share the same root cause is a practical research problem known as bug deduplication. Prior methodologies for compiler bug deduplication primarily rely on program analysis to extract bug-related features for duplicate identification. We introduce BugLens, a novel deduplication method that primarily uses bisection, enhanced by the identification of bug-triggering optimizations to minimize false negatives.
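The bisection idea above can be illustrated with the classic first-bad-revision search. This is a generic sketch, not BugLens's implementation: `first_bad` and the revision predicates are hypothetical, and the grouping rule (crashes that bisect to the same culprit revision are likely duplicates) is inferred from the abstract.

```python
def first_bad(num_revisions, is_bad):
    """Classic bisection: find the earliest revision index at which a bug
    reproduces, assuming is_bad is monotone (good ... good, bad ... bad)."""
    lo, hi = 0, num_revisions - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid       # bug already present at mid; look earlier
        else:
            lo = mid + 1   # still good at mid; bug introduced later
    return lo

# Two hypothetical crashing inputs that both bisect to revision 7 would be
# grouped as duplicates; one that bisects to revision 12 would not.
culprit_a = first_bad(20, lambda rev: rev >= 7)
culprit_b = first_bad(20, lambda rev: rev >= 7)
culprit_c = first_bad(20, lambda rev: rev >= 12)
```

Bisection needs only O(log n) compile-and-run probes per input, which is what makes it practical at this scale.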
arXiv Detail & Related papers (2025-06-29T15:12:57Z) - Stack Trace Deduplication: Faster, More Accurately, and in More Realistic Scenarios [42.75968139336785]
In large-scale software systems, there are often no fully fledged bug reports with human-written descriptions when an error occurs. In this case, developers rely on stack traces, i.e., the series of function calls that led to the error. Recent works have proposed powerful deep-learning-based approaches for this, but they are evaluated and compared in isolation from real-life scenarios.
arXiv Detail & Related papers (2024-12-19T12:48:17Z) - Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z) - Simple Ingredients for Offline Reinforcement Learning [86.1988266277766]
Offline reinforcement learning algorithms have proven effective on datasets highly connected to the target downstream task.
We show that existing methods struggle with diverse data: their performance considerably deteriorates as data collected for related but different tasks is simply added to the offline buffer.
We show that scale, more than algorithmic considerations, is the key factor influencing performance.
arXiv Detail & Related papers (2024-03-19T18:57:53Z) - Rethinking Negative Pairs in Code Search [56.23857828689406]
We propose a simple yet effective Soft-InfoNCE loss that inserts weight terms into InfoNCE.
We analyze the effects of Soft-InfoNCE on controlling the distribution of learnt code representations and on deducing a more precise mutual information estimation.
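The "weight terms inserted into InfoNCE" can be sketched as a per-negative scaling of the denominator. This follows the abstract's description, not the paper's exact formulation: the function name, the similarity inputs, and the temperature value are all illustrative.

```python
import numpy as np

def soft_infonce(pos_sim, neg_sims, weights, tau=0.07):
    """Weighted InfoNCE sketch: each negative's exp-term in the
    denominator is scaled by a weight in [0, 1], so likely false
    negatives (e.g. near-duplicate code snippets) can be down-weighted."""
    pos = np.exp(pos_sim / tau)
    neg = np.sum(np.asarray(weights) * np.exp(np.asarray(neg_sims) / tau))
    return float(-np.log(pos / (pos + neg)))

# Down-weighting a suspicious negative reduces the penalty compared with
# plain InfoNCE (all weights equal to 1).
plain = soft_infonce(0.9, [0.8, 0.1], [1.0, 1.0])
soft = soft_infonce(0.9, [0.8, 0.1], [0.2, 1.0])
```

With `weights` all set to 1 the expression reduces to standard InfoNCE, which is the design property that makes the modification "simple yet effective".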
arXiv Detail & Related papers (2023-10-12T06:32:42Z) - Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z) - S3M: Siamese Stack (Trace) Similarity Measure [55.58269472099399]
We present S3M -- the first approach to computing stack trace similarity based on deep learning.
It is based on a biLSTM encoder and a fully-connected classifier to compute similarity.
Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset.
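The siamese structure named above (one shared encoder feeding a similarity head) can be sketched in a toy form. S3M uses a biLSTM encoder and a fully connected classifier; here a mean of fixed random character embeddings stands in for the encoder and cosine similarity for the classifier, keeping only the shared-weight structure.

```python
import numpy as np

rng = np.random.default_rng(0)
char_emb = rng.normal(size=(256, 16))  # one table shared by both branches

def encode(trace, table=char_emb):
    """Toy shared-weight encoder: mean of per-character embeddings.
    S3M uses a biLSTM here; only the siamese structure is kept."""
    ids = np.frombuffer(trace.encode(), dtype=np.uint8)
    return table[ids].mean(axis=0)

def similarity(a, b):
    """Both traces pass through the SAME encoder (the siamese idea);
    cosine similarity stands in for S3M's fully connected classifier."""
    ea, eb = encode(a), encode(b)
    return float(ea @ eb / (np.linalg.norm(ea) * np.linalg.norm(eb)))
```

Sharing one set of encoder weights across both inputs is what makes the score symmetric and lets the model learn a single trace representation from pairwise labels.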
arXiv Detail & Related papers (2021-03-18T21:10:41Z) - Hard-label Manifolds: Unexpected Advantages of Query Efficiency for Finding On-manifold Adversarial Examples [67.23103682776049]
Recent zeroth order hard-label attacks on image classification models have shown comparable performance to their first-order, gradient-level alternatives.
It was recently shown in the gradient-level setting that regular adversarial examples leave the data manifold, while their on-manifold counterparts are in fact generalization errors.
We propose an information-theoretic argument based on a noisy manifold distance oracle, which leaks manifold information through the adversary's gradient estimate.
arXiv Detail & Related papers (2021-03-04T20:53:06Z) - Establishing strong imputation performance of a denoising autoencoder in a wide range of missing data problems [0.0]
We develop a consistent framework for both training and imputation.
We benchmarked the results against state-of-the-art imputation methods.
The developed autoencoder obtained the smallest error for all ranges of initial data corruption.
arXiv Detail & Related papers (2020-04-06T12:00:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.