Related papers: Optimized Disaster Recovery for Distributed Storage Systems: Lightweight Metadata Architectures to Overcome Cryptographic Hashing Bottleneck

URL: http://arxiv.org/abs/2602.22237v1
Date: Mon, 23 Feb 2026 21:34:25 GMT
Title: Optimized Disaster Recovery for Distributed Storage Systems: Lightweight Metadata Architectures to Overcome Cryptographic Hashing Bottleneck
Authors: Prasanna Kumar, Nishank Soni, Gaurang Munje
Abstract summary: This paper characterizes the operational conditions under which full or partial re-hashing becomes unavoidable. The proposed framework assigns globally unique composite identifiers to data blocks at ingestion time, independent of content analysis, enabling instantaneous delta computation during DR without any cryptographic overhead.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Distributed storage architectures are foundational to modern cloud-native infrastructure, yet a critical operational bottleneck persists within disaster recovery (DR) workflows: the dependence on content-based cryptographic hashing for data identification and synchronization. While hash-based deduplication is effective for storage efficiency in steady-state operation, it becomes a systemic liability during failover and failback events when hash indexes are stale, incomplete, or must be rebuilt following a crash. This paper precisely characterizes the operational conditions under which full or partial re-hashing becomes unavoidable. The paper also analyzes the downstream impact of cryptographic re-hashing on Recovery Time Objective (RTO) compliance, and proposes a generalized architectural shift toward deterministic, metadata-driven identification. The proposed framework assigns globally unique composite identifiers to data blocks at ingestion time, independent of content analysis, enabling instantaneous delta computation during DR without any cryptographic overhead.
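
The idea in the abstract's last sentence can be illustrated with a minimal Python sketch. The abstract does not say how the composite identifiers are built, so the `BlockId` fields (site, volume, sequence number), the `Ingester` class, and the `delta` function below are hypothetical names chosen for illustration, not the authors' design. The sketch only shows that once identifiers are assigned at ingestion, the set of blocks a replica is missing can be found by comparing metadata, with no hashing of block contents.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True, order=True)
class BlockId:
    """Hypothetical composite identifier; the field choice is an assumption."""
    site: str      # ingesting site or cluster
    volume: str    # logical volume the block belongs to
    sequence: int  # counter that increases with every ingested block


class Ingester:
    """Assigns an identifier to each block on arrival, without reading its content."""

    def __init__(self, site: str, volume: str) -> None:
        self.site = site
        self.volume = volume
        self._next = count(1)
        self.catalog: dict[BlockId, bytes] = {}

    def ingest(self, payload: bytes) -> BlockId:
        block_id = BlockId(self.site, self.volume, next(self._next))
        self.catalog[block_id] = payload
        return block_id


def delta(primary_ids, replica_ids) -> list[BlockId]:
    """Return identifiers present on the primary but absent on the replica."""
    return sorted(set(primary_ids) - set(replica_ids))


# The primary ingests five blocks; the replica received only the first three.
primary = Ingester("site-a", "vol-1")
ids = [primary.ingest(b"payload") for _ in range(5)]
missing = delta(primary.catalog, ids[:3])
print([b.sequence for b in missing])
```

In this sketch identical payloads receive distinct identifiers, which is the consequence of assigning identifiers independently of content: the delta is cheap to compute, but the identifiers themselves cannot be used to detect duplicate data.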

Related papers

FlashMem: Distilling Intrinsic Latent Memory via Computation Reuse [4.210760734549566]
FlashMem is a framework that distills intrinsic memory directly from transient reasoning states via computation reuse. Experiments demonstrate that FlashMem matches the performance of heavy baselines while reducing inference latency by 5 times.
arXiv Detail & Related papers (2026-01-09T03:27:43Z)
Unleashing Degradation-Carrying Features in Symmetric U-Net: Simpler and Stronger Baselines for All-in-One Image Restoration [52.82397287366076]
All-in-one image restoration aims to handle diverse degradations (e.g., noise, blur, adverse weather) within a unified framework. In this work, we reveal a critical insight: well-crafted feature extraction inherently encodes degradation-carrying information. Our symmetric design preserves intrinsic degradation signals robustly, rendering simple additive fusion in skip connections.
arXiv Detail & Related papers (2025-12-11T12:20:31Z)
Parallel Algorithms for Combined Regularized Support Vector Machines: Application in Music Genre Classification [5.98174311891588]
We develop a distributed parallel alternating direction method of multipliers (ADMM) algorithm to efficiently compute CR-SVMs when data is stored in a distributed manner. Experiments on synthetic and Free Music Archive datasets demonstrate the reliability, stability, and efficiency of the algorithm.
arXiv Detail & Related papers (2025-12-08T11:41:06Z)
Knowledge-Informed Neural Network for Complex-Valued SAR Image Recognition [51.03674130115878]
We introduce the Knowledge-Informed Neural Network (KINN), a lightweight framework built upon a novel "compression-aggregation-compression" architecture. KINN establishes a state-of-the-art in parameter-efficient recognition, offering exceptional generalization in data-scarce and out-of-distribution scenarios.
arXiv Detail & Related papers (2025-10-23T07:12:26Z)
Revisiting the Privacy Risks of Split Inference: A GAN-Based Data Reconstruction Attack via Progressive Feature Optimization [49.32786615205064]
Split Inference (SI) partitions computation between edge devices and the cloud to reduce latency and protect user privacy. Recent advances in Data Reconstruction Attacks (DRAs) reveal that intermediate features exchanged in SI can be exploited to recover sensitive input data. Existing DRAs are typically effective only on shallow models and fail to fully leverage semantic priors. We propose a novel GAN-based DRA framework with Progressive Feature Optimization (PFO), which decomposes the generator into hierarchical blocks and incrementally refines intermediate representations to enhance the semantic fidelity of reconstructed images.
arXiv Detail & Related papers (2025-08-28T10:00:39Z)
HexaMorphHash HMH- Homomorphic Hashing for Secure and Efficient Cryptographic Operations in Data Integrity Verification [0.0]
This paper introduces an innovative approach using a lattice-based homomorphic hash validating HexaHashMorph. Our contributions present a viable solution for frequent update dissemination in expansive distributed systems, safeguarding both data integrity and system performance.
arXiv Detail & Related papers (2025-07-01T18:53:23Z)
Dataset Protection via Watermarked Canaries in Retrieval-Augmented LLMs [67.0310240737424]
We introduce a novel approach to safeguard the ownership of text datasets and effectively detect unauthorized use by RA-LLMs. Our approach preserves the original data completely unchanged while protecting it by inserting specifically designed canary documents into the IP dataset. During the detection process, unauthorized usage is identified by querying the canary documents and analyzing the responses of RA-LLMs.
arXiv Detail & Related papers (2025-02-15T04:56:45Z)
Structured Token Retention and Computational Memory Paths in Large Language Models [0.0]
This paper introduces a probabilistic selection framework that dynamically adjusts token persistence based on contextual significance. It is extended through hierarchical memory allocation, refining retention efficiency through structured reallocation of token embeddings. The integration of STR and CMP into an open-source model illustrates the adaptability of structured memory retention methodologies.
arXiv Detail & Related papers (2025-02-05T11:59:22Z)
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval [67.52910255064762]
We first design a simple dual-stream structure, including a temporal layer and a hash layer. With the help of semantic similarity knowledge obtained from self-supervision, the hash layer learns to capture information for semantic retrieval. In this way, the model naturally preserves the disentangled semantics into binary codes.
arXiv Detail & Related papers (2023-10-12T03:21:12Z)
BGaitR-Net: Occluded Gait Sequence reconstruction with temporally constrained model for gait recognition [1.151614782416873]
We develop novel deep learning-based algorithms to identify occluded frames in an input sequence. We then reconstruct these frames by exploiting next-temporal information present in the gait sequence. Our LSTM-based model reconstructs occlusion and generates frames that are temporally consistent with the periodic pattern of a gait cycle.
arXiv Detail & Related papers (2021-10-18T18:28:18Z)
New advances in enumerative biclustering algorithms with online partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets. The improved algorithm, called RIn-Close_CVC3, keeps those attractive properties of RIn-Close_CVC and is characterized by a drastic reduction in memory usage and a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.