Related papers: Compiling Away the Overhead of Race Detection

URL: http://arxiv.org/abs/2512.05555v1
Date: Fri, 05 Dec 2025 09:26:08 GMT
Title: Compiling Away the Overhead of Race Detection
Authors: Alexey Paznikov, Andrey Kogutenko, Yaroslav Osipov, Michael Schwarz, Umang Mathur
Abstract summary: Dynamic data race detectors are indispensable for flagging errors in software, but their high runtime overhead limits their adoption. We introduce a suite of interprocedural static analyses to eliminate instrumentation for provably race-free accesses. Our approach significantly reduces race detection overhead, achieving a geomean speedup of 1.34x, with peak speedups reaching 2.5x under high thread contention.
Score: 4.072903728718951
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Dynamic data race detectors are indispensable for flagging concurrency errors in software, but their high runtime overhead limits their adoption. This overhead stems primarily from pervasive instrumentation of memory accesses - a significant fraction of which is redundant. We address this inefficiency through a static, compiler-integrated approach that identifies and eliminates redundant instrumentation, drastically reducing the runtime cost of dynamic data race detectors. We introduce a suite of interprocedural static analyses reasoning about memory access patterns, synchronization, and thread creation to eliminate instrumentation for provably race-free accesses and show that the completeness properties of the data race detector are preserved. We further observe that many inserted checks flag a race if and only if a preceding check has already flagged an equivalent race for the same memory location - albeit potentially at a different access. We characterize this notion of equivalence and show that, when limiting reporting to at least one representative for each equivalence class, a further class of redundant checks can be eliminated. We identify such accesses using a novel dominance-based elimination analysis. Based on these two insights, we have implemented five static analyses within LLVM, integrated with the instrumentation pass of the race detector ThreadSanitizer. Our experimental evaluation on a diverse suite of real-world applications demonstrates that our approach significantly reduces race detection overhead, achieving a geomean speedup of 1.34x, with peak speedups reaching 2.5x under high thread contention. This performance is achieved with a negligible increase in compilation time and, being fully automatic, places no additional burden on developers. Our optimizations have been accepted by the ThreadSanitizer maintainers and are in the process of being upstreamed.
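
The idea that a check can be dropped when an earlier check already covers it can be illustrated on a toy control-flow graph. The sketch below is not the paper's analysis, which the abstract does not detail. It is a simplified forward "available checks" dataflow used here as a stand-in for the dominance-based elimination. The names `Op` and `redundant_checks`, the rule that a write check covers later reads and writes while a read check covers only later reads, and the choice to let any synchronization operation invalidate all earlier checks are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str      # "read", "write", or "sync" (hypothetical toy IR)
    loc: str = ""  # memory location name; unused for "sync"


def redundant_checks(blocks, succs, entry):
    """Return {(block, index)} of accesses already covered on every path.

    blocks: dict block name -> list of Op; succs: dict block name -> successors.
    Assumption: a sync operation conservatively discards all earlier checks.
    """
    preds = {b: [] for b in blocks}
    for b, targets in succs.items():
        for t in targets:
            preds[t].append(b)

    def transfer(block, avail, found=None):
        avail = set(avail)
        for i, op in enumerate(blocks[block]):
            if op.kind == "sync":
                avail.clear()
                continue
            # Assumed coverage rule: a write covers reads and writes,
            # a read covers only later reads of the same location.
            covered = (op.loc, "write") in avail or (
                op.kind == "read" and (op.loc, "read") in avail
            )
            if covered and found is not None:
                found.add((block, i))
            avail.add((op.loc, op.kind))
        return avail

    universe = {
        (op.loc, op.kind)
        for ops in blocks.values()
        for op in ops
        if op.kind != "sync"
    }
    out = {b: set(universe) for b in blocks}  # optimistic start for a must-analysis

    def inflow(b):
        if b == entry or not preds[b]:
            return set()
        return set.intersection(*(out[p] for p in preds[b]))

    changed = True
    while changed:  # iterate to a fixpoint
        changed = False
        for b in blocks:
            new_out = transfer(b, inflow(b))
            if new_out != out[b]:
                out[b] = new_out
                changed = True

    found = set()
    for b in blocks:
        transfer(b, inflow(b), found)
    return found


# Diamond-shaped CFG: entry -> {left, right} -> join.
blocks = {
    "entry": [Op("write", "x")],
    "left": [Op("read", "x")],
    "right": [Op("sync"), Op("read", "x")],
    "join": [Op("read", "x"), Op("write", "x")],
}
succs = {"entry": ["left", "right"], "left": ["join"], "right": ["join"], "join": []}
print(sorted(redundant_checks(blocks, succs, "entry")))
```

In this toy graph the read in `left` is covered by the write in `entry`, and the read in `join` is covered because a read of `x` has been checked on both incoming paths. The read in `right` and the write in `join` keep their checks, since the synchronization in `right` discards the earlier write check. A production pass would work on LLVM IR and reason about aliasing and the specific synchronization involved, which this sketch ignores.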

Related papers

Detecting Overflow in Compressed Token Representations for Retrieval-Augmented Generation [49.48204107529758]
We define token overflow as a regime in which compressed representations no longer contain sufficient information to answer a given query. In this paper, we find that query-agnostic saturation statistics reliably separate compressed from uncompressed token representations. Lightweight probing classifiers over both query and context xRAG representations detect overflow with 0.72 AUC-ROC on average. These results advance from query-independent diagnostics to query-aware detectors, enabling low-cost pre-LLM gating to mitigate compression-induced errors.
arXiv Detail & Related papers (2026-02-12T18:15:08Z)
Fast SAM2 with Text-Driven Token Pruning [52.8350457627401]
Segment Anything Model 2 (SAM2), a vision computation model, has significantly advanced prompt-driven video object segmentation. SAM2 pipelines propagate all visual tokens produced by the image encoder through downstream temporal reasoning modules, regardless of their relevance to the target object. We introduce a text-guided token pruning framework that improves inference efficiency by selectively reducing token density prior to temporal propagation.
arXiv Detail & Related papers (2025-12-24T18:59:05Z)
Sequential Testing for Descriptor-Agnostic LiDAR Loop Closure in Repetitive Environments [12.304166871828777]
We propose a multi-frame loop closure verification method that formulates LiDAR loop closure as a truncated Sequential Probability Ratio Test (SPRT). Instead of deciding from a single descriptor comparison or using fixed thresholds with late-stage Iterative Closest Point (ICP) vetting, the verifier accumulates a short temporal stream of descriptor similarities between a query and each candidate. This precision-first policy is designed to suppress false positives in structurally repetitive indoor environments.
arXiv Detail & Related papers (2025-12-10T09:20:09Z)
Data Race Detection by Digest-Driven Abstract Interpretation (Extended Version) [4.3994959886619185]
Sound static analysis can prove the absence of data races by establishing that no two conflicting memory accesses can occur at the same time. We use digests to capture the conditions under which conflicting accesses may not happen in parallel. We report on our implementation of digest-driven data race detection in the static analyzer Goblint, and evaluate it on the SV-COMP benchmark suite.
arXiv Detail & Related papers (2025-11-14T08:11:31Z)
vCache: Verified Semantic Prompt Caching [95.16654660556975]
This paper proposes vCache, the first verified semantic cache with user-defined error rate guarantees. It employs an online learning algorithm to estimate an optimal threshold for each cached prompt, enabling reliable cache responses without additional training. Our experiments show that vCache consistently meets the specified error bounds while outperforming state-of-the-art static-threshold and fine-tuned embedding baselines.
arXiv Detail & Related papers (2025-02-06T04:16:20Z)
PARIS: A Practical, Adaptive Trace-Fetching and Real-Time Malicious Behavior Detection System [6.068607290592521]
We propose an adaptive trace-fetching, lightweight, real-time malicious behavior detection system. Specifically, we monitor malicious behavior with Event Tracing for Windows (ETW) and learn to selectively collect maliciousness-related APIs or call stacks. As a result, we can monitor a wider range of APIs and detect more intricate attack behavior.
arXiv Detail & Related papers (2024-11-02T14:52:04Z)
Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain. We propose an adversarial algorithm to make the retriever component robust against distribution shift. We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
RelationTrack: Relation-aware Multiple Object Tracking with Decoupled Representation [3.356734463419838]
Existing online multiple object tracking (MOT) algorithms often consist of two subtasks, detection and re-identification (ReID). To enhance inference speed and reduce complexity, current methods commonly integrate these two subtasks into a unified framework. We devise a module named Global Context Disentangling (GCD) that decouples the learned representation into detection-specific and ReID-specific embeddings. To resolve this restriction, we develop a module, referred to as Guided Transformer (GTE), by combining the powerful reasoning ability of Transformer encoder and deformable attention.
arXiv Detail & Related papers (2021-05-10T13:00:40Z)
SegmentMeIfYouCan: A Benchmark for Anomaly Segmentation [111.61261419566908]
Deep neural networks (DNNs) are usually trained on a closed set of semantic classes. They are ill-equipped to handle previously-unseen objects. Detecting and localizing such objects is crucial for safety-critical applications such as perception for automated driving.
arXiv Detail & Related papers (2021-04-30T07:58:19Z)
Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation. We introduce a novel approach for more accurate and efficient unseen-temporal segmentation. We evaluate the proposed approach on DAVIS$_{17}$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
Joint Detection and Tracking in Videos with Identification Features [36.55599286568541]
We propose the first joint optimization of detection, tracking and re-identification features for videos. Our method reaches the state of the art on MOT; it ranks 1st in the UA-DETRAC'18 tracking challenge among online trackers, and 3rd overall.
arXiv Detail & Related papers (2020-05-21T21:06:40Z)
EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system. It can be trained in one shot on both fully and weakly-annotated data. It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.