Embedding Trust at Scale: Physics-Aware Neural Watermarking for Secure and Verifiable Data Pipelines
- URL: http://arxiv.org/abs/2506.12032v1
- Date: Thu, 22 May 2025 21:14:45 GMT
- Title: Embedding Trust at Scale: Physics-Aware Neural Watermarking for Secure and Verifiable Data Pipelines
- Authors: Krti Tallam,
- Abstract summary: We present a robust neural watermarking framework for scientific data integrity.<n>Using a convolutional autoencoder, binary messages are invisibly embedded into structured data such as temperature, vorticity, and geopotential.<n>Our approach achieves >98% bit accuracy and visually indistinguishable reconstructions across ERA5 and Navier-Stokes datasets.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a robust neural watermarking framework for scientific data integrity, targeting high-dimensional fields common in climate modeling and fluid simulations. Using a convolutional autoencoder, binary messages are invisibly embedded into structured data such as temperature, vorticity, and geopotential. Our method ensures watermark persistence under lossy transformations (including noise injection, cropping, and compression) while maintaining near-original fidelity (sub-1% MSE). Compared to classical singular value decomposition (SVD)-based watermarking, our approach achieves >98% bit accuracy and visually indistinguishable reconstructions across ERA5 and Navier-Stokes datasets. This system offers a scalable, model-compatible tool for data provenance, auditability, and traceability in high-performance scientific workflows, and contributes to the broader goal of securing AI systems through verifiable, physics-aware watermarking. We evaluate on physically grounded scientific datasets as a representative stress-test; the framework extends naturally to other structured domains such as satellite imagery and autonomous-vehicle perception streams.
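The abstract names classical SVD-based watermarking as the comparison baseline and reports bit accuracy and MSE as metrics, but this page gives no details of either. The sketch below is one possible SVD variant (quantization of the leading singular values), not the paper's baseline or its autoencoder. The function names, the step size `delta`, the synthetic 32x32 field, and the choice to normalize MSE by the mean squared value of the original field are all assumptions made for illustration.

```python
import numpy as np


def embed_bits_svd(field, bits, delta):
    """Hypothetical baseline: hide bits in the leading singular values."""
    u, s, vt = np.linalg.svd(field, full_matrices=False)
    s_marked = s.copy()
    for i, bit in enumerate(bits):
        q = int(np.floor(s[i] / delta))
        if q % 2 != bit:
            q += 1  # move to the neighbouring cell whose parity encodes the bit
        s_marked[i] = (q + 0.5) * delta  # cell centre leaves a margin of delta / 2
    return (u * s_marked) @ vt


def extract_bits_svd(field, n_bits, delta):
    """Read each bit back as the parity of the quantization cell index."""
    s = np.linalg.svd(field, compute_uv=False)
    return np.floor(s[:n_bits] / delta).astype(int) % 2


def bit_accuracy(sent, recovered):
    """Fraction of message bits recovered correctly."""
    return float(np.mean(np.asarray(sent) == np.asarray(recovered)))


def relative_mse(original, marked):
    """MSE normalized by the mean squared original value (an assumed convention)."""
    return float(np.mean((original - marked) ** 2) / np.mean(original ** 2))


# Synthetic stand-in for a 2D field (not ERA5 or Navier-Stokes data),
# built with well-separated leading singular values so that the
# quantization step cannot reorder them.
rng = np.random.default_rng(0)
n = 32
q_left, _ = np.linalg.qr(rng.standard_normal((n, n)))
q_right, _ = np.linalg.qr(rng.standard_normal((n, n)))
spectrum = np.concatenate([np.arange(80.0, 0.0, -10.0), np.full(n - 8, 0.1)])
field = (q_left * spectrum) @ q_right.T

bits = [1, 0, 1, 1, 0, 0, 1, 0]
delta = 0.5
marked = embed_bits_svd(field, bits, delta)

# Mild additive noise: it shifts each singular value by far less than delta / 2.
noisy = marked + 1e-3 * rng.standard_normal((n, n))

recovered_clean = extract_bits_svd(marked, len(bits), delta)
recovered_noisy = extract_bits_svd(noisy, len(bits), delta)

print("bit accuracy (clean):", bit_accuracy(bits, recovered_clean))
print("bit accuracy (noisy):", bit_accuracy(bits, recovered_noisy))
print("relative MSE:", relative_mse(field, marked))
```

In this sketch `delta` sets the trade-off: a larger step tolerates stronger perturbations of the singular values but distorts the field more. The scheme only survives perturbations that move each marked singular value by less than `delta / 2`, and it says nothing about cropping or compression, which the abstract lists among the transformations the neural method is meant to withstand.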
Related papers
- A workflow for generating synthetic LiDAR datasets in simulation environments [0.0]
This paper presents a simulation workflow for generating synthetic LiDAR datasets to support autonomous vehicle perception, robotics research, and sensor security analysis.<n>We integrate time-of-flight LiDAR, image sensors, and two-dimensional scanners onto a simulated vehicle platform operating within an urban scenario.<n>The study examines potential security vulnerabilities in LiDAR data, such as adversarial point injection and spoofing attacks, and demonstrates how synthetic datasets can facilitate the evaluation of defense strategies.
arXiv Detail & Related papers (2025-06-20T17:56:15Z) - Learning Underwater Active Perception in Simulation [51.205673783866146]
Turbidity can jeopardise the whole mission as it may prevent correct visual documentation of the inspected structures.<n>Previous works have introduced methods to adapt to turbidity and backscattering.<n>We propose a simple yet efficient approach to enable high-quality image acquisition of assets in a broad range of water conditions.
arXiv Detail & Related papers (2025-04-23T06:48:38Z) - Efficient Self-Supervised Learning for Earth Observation via Dynamic Dataset Curation [67.23953699167274]
Self-supervised learning (SSL) has enabled the development of vision foundation models for Earth Observation (EO).<n>In EO, this challenge is amplified by the redundancy and heavy-tailed distributions common in satellite imagery.<n>We propose a dynamic dataset pruning strategy designed to improve SSL pre-training by maximizing dataset diversity and balance.
arXiv Detail & Related papers (2025-04-09T15:13:26Z) - Physically Interpretable Representation and Controlled Generation for Turbulence Data [39.42376941186934]
This paper proposes a data-driven approach to encode high-dimensional scientific data into low-dimensional, physically meaningful representations.<n>We validate our approach using 2D Navier-Stokes simulations of flow past a cylinder over a range of Reynolds numbers.
arXiv Detail & Related papers (2025-01-31T17:51:14Z) - Iterative Encoding-Decoding VAEs Anomaly Detection in NOAA's DART Time Series: A Machine Learning Approach for Enhancing Data Integrity for NASA's GRACE-FO Verification and Validation [3.4265828682659705]
This paper introduces an Iterative Encoding-Decoding Variational Autoencoders (Iterative Encoding-Decoding VAEs) model to improve the quality of DART time series.<n>Iterative Encoding-Decoding VAEs progressively remove anomalies while preserving the data's latent structure.<n>This data processing method for tsunami detection underpins future climate modeling with improved interpretability and reliability.
arXiv Detail & Related papers (2024-12-20T22:19:11Z) - PIGUIQA: A Physical Imaging Guided Perceptual Framework for Underwater Image Quality Assessment [59.9103803198087]
We propose a Physical Imaging Guided perceptual framework for Underwater Image Quality Assessment (UIQA).<n>By leveraging underwater radiative transfer theory, we integrate physics-based imaging estimations to establish quantitative metrics for these distortions.<n>The proposed model accurately predicts image quality scores and achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-12-20T03:31:45Z) - Attention-Based Reconstruction of Full-Field Tsunami Waves from Sparse Tsunameter Networks [0.0]
We focus on the Tsunami Data Assimilation Method, which generates forecasts from tsunameter networks.<n>Our model is used to reconstruct high-resolution tsunami wavefields from extremely sparse observations.<n>We demonstrate that our approach significantly outperforms the Linear Interpolation with Huygens-Fresnel Principle in generating dense observation networks.
arXiv Detail & Related papers (2024-11-20T00:42:40Z) - Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems [49.11170948406405]
We propose an unsupervised method to estimate the physical parameters of known, continuous governing equations from single videos.<n>We take the field closer to reality by recording Delfys75: our own real-world dataset of 75 videos for five different types of dynamical systems.
arXiv Detail & Related papers (2024-10-02T09:44:54Z) - Domain Watermark: Effective and Harmless Dataset Copyright Protection is Closed at Hand [96.26251471253823]
Backdoor-based dataset ownership verification (DOV) is currently the only feasible approach to protect the copyright of open-source datasets.
We make watermarked models (trained on the protected dataset) correctly classify some 'hard' samples that will be misclassified by the benign model.
arXiv Detail & Related papers (2023-10-09T11:23:05Z) - Phase2vec: Dynamical systems embedding with a physics-informed convolutional network [1.6058099298620423]
We propose an embedding method that learns high-quality, physically-meaningful representations of 2D dynamical systems without supervision.
Our embeddings encode important physical properties of the underlying data, including the stability of fixed points, conservation of energy, and the incompressibility of flows.
arXiv Detail & Related papers (2022-12-07T18:54:52Z) - SimuShips -- A High Resolution Simulation Dataset for Ship Detection with Precise Annotations [0.0]
State-of-the-art obstacle detection algorithms are based on convolutional neural networks (CNNs).
SimuShips is a publicly available simulation-based dataset for maritime environments.
arXiv Detail & Related papers (2022-09-22T07:33:31Z) - Federated Learning in the Sky: Aerial-Ground Air Quality Sensing Framework with UAV Swarms [53.38353133198842]
Because air quality significantly affects human health, it is increasingly important to predict the Air Quality Index (AQI) accurately and in a timely manner.
This paper proposes a new federated learning-based aerial-ground air quality sensing framework for fine-grained 3D air quality monitoring and forecasting.
For ground sensing systems, we propose a Graph Convolutional neural network-based Long Short-Term Memory (GC-LSTM) model to achieve accurate, real-time and future AQI inference.
arXiv Detail & Related papers (2020-07-23T13:32:47Z) - Why Normalizing Flows Fail to Detect Out-of-Distribution Data [51.552870594221865]
Normalizing flows fail to distinguish between in- and out-of-distribution data.
We demonstrate that flows learn local pixel correlations and generic image-to-latent-space transformations.
We show that by modifying the architecture of flow coupling layers we can bias the flow towards learning the semantic structure of the target data.
arXiv Detail & Related papers (2020-06-15T17:00:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.