Data Race Detection by Digest-Driven Abstract Interpretation (Extended Version)
- URL: http://arxiv.org/abs/2511.11055v1
- Date: Fri, 14 Nov 2025 08:11:31 GMT
- Title: Data Race Detection by Digest-Driven Abstract Interpretation (Extended Version)
- Authors: Michael Schwarz, Julian Erhard,
- Abstract summary: Sound static analysis can prove the absence of data races by establishing that no two conflicting memory accesses can occur at the time.<n>We use digests to capture the conditions under which conflicting accesses may not happen in parallel.<n>We report on our implementation of digest-driven data race detection in the static analyzer Goblint, and evaluate it on the SV-COMP benchmark suite.
- Score: 4.3994959886619185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sound static analysis can prove the absence of data races by establishing that no two conflicting memory accesses can occur at the same time. We repurpose the concept of digests -- summaries of computational histories originally introduced to bring tunable concurrency-sensitivity to thread-modular value analysis by abstract interpretation, extending this idea to race detection: We use digests to capture the conditions under which conflicting accesses may not happen in parallel. To formalize this, we give a definition of data races in the thread-modular local trace semantics and show how exclusion criteria for potential conflicts can be expressed as digests. We report on our implementation of digest-driven data race detection in the static analyzer Goblint, and evaluate it on the SV-COMP benchmark suite. Combining the lockset digest with digests reasoning on thread ids and thread joins increases the number of correctly solved tasks by more than a factor of five compared to lockset reasoning alone.
Related papers
- Compiling Away the Overhead of Race Detection [4.072903728718951]
Dynamic data race detectors are indispensable for flagging errors in software, but their high runtime overhead limits their adoption.<n>We introduce a suite of interprocedural static analyses to eliminate instrumentation for provably race-free accesses.<n>Our approach significantly reduces race detection overhead, achieving a geomean speedup of 1.34x, with peak speedups reaching 2.5x under high thread contention.
arXiv Detail & Related papers (2025-12-05T09:26:08Z) - Context-Aware Search and Retrieval Over Erasure Channels [12.794591022795355]
We present an information-theoretic analysis of a remote document retrieval system operating over a symbol erasure channel.<n>The proposed model encodes the feature vector of a query, derived from term-frequency weights of a language corpus.<n>We derive an explicit expression for the retrieval error probability, i.e., the probability under which the less similar document is selected.
arXiv Detail & Related papers (2025-07-16T04:21:46Z) - Empirical Evaluation of Embedding Models in the Context of Text Classification in Document Review in Construction Delay Disputes [6.076874513889027]
Text embeddings are numerical representations of text data, where words, phrases, or entire documents are converted into vectors of real numbers.<n>This paper presents our work in evaluating different embeddings through a comprehensive comparative analysis of four distinct models.<n>We employ both K-Nearest Neighbors (KNN) and Logistic Regression (LR) to perform binary classification tasks, specifically determining whether a text snippet is associated with 'delay' or 'not delay' within a labeled dataset.
arXiv Detail & Related papers (2025-01-16T22:12:11Z) - Intelligent Multi-Document Summarisation for Extracting Insights on Racial Inequalities from Maternity Incident Investigation Reports [0.8609957371651683]
In healthcare, thousands of safety incidents occur every year, but learning from these incidents is not effectively aggregated.
This paper presents I-SIRch:CS, a framework designed to facilitate the aggregation and analysis of safety incident reports.
The framework integrates concept annotation using the Safety Intelligence Research (SIRch) taxonomy with clustering, summarisation, and analysis capabilities.
arXiv Detail & Related papers (2024-07-11T09:11:20Z) - AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs)
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z) - Interpretable Automatic Fine-grained Inconsistency Detection in Text
Summarization [56.94741578760294]
We propose the task of fine-grained inconsistency detection, the goal of which is to predict the fine-grained types of factual errors in a summary.
Motivated by how humans inspect factual inconsistency in summaries, we propose an interpretable fine-grained inconsistency detection model, FineGrainFact.
arXiv Detail & Related papers (2023-05-23T22:11:47Z) - Understanding and Mitigating Spurious Correlations in Text
Classification with Neighborhood Analysis [69.07674653828565]
Machine learning models have a tendency to leverage spurious correlations that exist in the training set but may not hold true in general circumstances.
In this paper, we examine the implications of spurious correlations through a novel perspective called neighborhood analysis.
We propose a family of regularization methods, NFL (doN't Forget your Language) to mitigate spurious correlations in text classification.
arXiv Detail & Related papers (2023-05-23T03:55:50Z) - BERM: Training the Balanced and Extractable Representation for Matching
to Improve Generalization Ability of Dense Retrieval [54.66399120084227]
We propose a novel method to improve the generalization of dense retrieval via capturing matching signal called BERM.
Dense retrieval has shown promise in the first-stage retrieval process when trained on in-domain labeled datasets.
arXiv Detail & Related papers (2023-05-18T15:43:09Z) - Visualizing Classifier Adjacency Relations: A Case Study in Speaker
Verification and Voice Anti-Spoofing [72.4445825335561]
We propose a simple method to derive 2D representation from detection scores produced by an arbitrary set of binary classifiers.
Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores.
While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems.
arXiv Detail & Related papers (2021-06-11T13:03:33Z) - Relation Clustering in Narrative Knowledge Graphs [71.98234178455398]
relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations.
Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.
arXiv Detail & Related papers (2020-11-27T10:43:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.