MUSE: Model-Agnostic Tabular Watermarking via Multi-Sample Selection
- URL: http://arxiv.org/abs/2505.24267v1
- Date: Fri, 30 May 2025 06:45:31 GMT
- Title: MUSE: Model-Agnostic Tabular Watermarking via Multi-Sample Selection
- Authors: Liancheng Fang, Aiwei Liu, Henry Peng Zou, Yankai Chen, Hengrui Zhang, Zhongfen Deng, Philip S. Yu
- Abstract summary: MUSE is a watermarking algorithm for tabular generative models. It embeds watermarks by generating multiple candidate samples and selecting one based on a specialized scoring function. It achieves state-of-the-art watermark detectability and robustness against various attacks.
- Score: 39.03834482470989
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce MUSE, a watermarking algorithm for tabular generative models. Previous approaches typically leverage DDIM invertibility to watermark tabular diffusion models, but tabular diffusion models exhibit significantly poorer invertibility compared to other modalities, compromising performance. Simultaneously, tabular diffusion models require substantially less computation than other modalities, enabling a multi-sample selection approach to tabular generative model watermarking. MUSE embeds watermarks by generating multiple candidate samples and selecting one based on a specialized scoring function, without relying on model invertibility. Our theoretical analysis establishes the relationship between watermark detectability, candidate count, and dataset size, allowing precise calibration of watermarking strength. Extensive experiments demonstrate that MUSE achieves state-of-the-art watermark detectability and robustness against various attacks while maintaining data quality, and remains compatible with any tabular generative model supporting repeated sampling, effectively addressing key challenges in tabular data watermarking. Specifically, it reduces the distortion rates on fidelity metrics by 81-89%, while achieving a 1.0 TPR@0.1%FPR detection rate. Implementation of MUSE can be found at https://github.com/fangliancheng/MUSE.
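The multi-sample selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the keyed SHA-256 score, the max-score selection rule, the z-test detector, and every name below (`keyed_score`, `watermarked_sample`, `detection_z`, `toy_generator`, `KEY`) are assumptions made for illustration only. The sketch draws several candidate rows per output row, keeps the one with the highest keyed pseudo-random score, and detects the watermark by checking whether the mean score of a table is far above the 0.5 expected for unwatermarked rows.

```python
import hashlib
import random
import statistics


def keyed_score(row, key):
    """Map a row to a pseudo-random score in [0, 1) with a keyed hash (assumed scoring rule)."""
    payload = (key + "|" + "|".join(map(str, row))).encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def watermarked_sample(generate_row, key, num_candidates):
    """Draw num_candidates rows from the generator and keep the highest-scoring one."""
    candidates = [generate_row() for _ in range(num_candidates)]
    return max(candidates, key=lambda row: keyed_score(row, key))


def detection_z(rows, key):
    """z-statistic of the mean score against a Uniform(0, 1) null (mean 1/2, variance 1/12)."""
    mean = statistics.fmean(keyed_score(row, key) for row in rows)
    return (mean - 0.5) * (12 * len(rows)) ** 0.5


# Hypothetical stand-in for any tabular generator that supports repeated sampling.
rng = random.Random(0)


def toy_generator():
    return (round(rng.gauss(40, 10), 2), rng.choice(["a", "b", "c"]), rng.randint(0, 100))


KEY = "demo-secret-key"  # hypothetical watermark key
plain = [toy_generator() for _ in range(500)]
marked = [watermarked_sample(toy_generator, KEY, num_candidates=4) for _ in range(500)]

z_plain = detection_z(plain, KEY)    # should stay near 0 for unwatermarked rows
z_marked = detection_z(marked, KEY)  # should be large for watermarked rows
print(f"z (plain) = {z_plain:.2f}, z (marked) = {z_marked:.2f}")
```

Under these assumptions, detection needs only the key and the table, not the generator, and the statistic grows with both the number of candidates per row and the number of rows, which mirrors the relationship between detectability, candidate count, and dataset size that the abstract describes.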
Related papers
- Self-Navigated Residual Mamba for Universal Industrial Anomaly Detection [42.42739543127113]
Self-Navigated Residual Mamba (SNARM) is a novel framework for universal industrial anomaly detection. SNARM iteratively refines anomaly detection by comparing test patches against adaptively selected in-image references. Experiments on MVTec AD, MVTec 3D, and VisA benchmarks demonstrate that SNARM achieves state-of-the-art (SOTA) performance.
arXiv Detail & Related papers (2025-08-03T05:07:38Z)
- TraceMark-LDM: Authenticatable Watermarking for Latent Diffusion Models via Binary-Guided Rearrangement [21.94988216476109]
We introduce TraceMark-LDM, an algorithm that integrates watermarking to attribute generated images while guaranteeing non-destructive performance. Images synthesized using TraceMark-LDM exhibit superior quality and attribution accuracy compared to state-of-the-art (SOTA) techniques.
arXiv Detail & Related papers (2025-03-30T06:23:53Z)
- Improved Unbiased Watermark for Large Language Models [59.00698153097887]
We introduce MCmark, a family of unbiased, Multi-Channel-based watermarks. MCmark preserves the original distribution of the language model. It offers significant improvements in detectability and robustness over existing unbiased watermarks.
arXiv Detail & Related papers (2025-02-16T21:02:36Z)
- TabularMark: Watermarking Tabular Datasets for Machine Learning [20.978995194849297]
We propose a hypothesis testing-based watermarking scheme, TabularMark.
Data noise partitioning is utilized for data perturbation during embedding.
Experiments on real-world and synthetic datasets demonstrate the superiority of TabularMark in detectability, non-intrusiveness, and robustness.
arXiv Detail & Related papers (2024-06-21T02:58:45Z)
- MAPL: Memory Augmentation and Pseudo-Labeling for Semi-Supervised Anomaly Detection [0.0]
A new methodology for detecting surface defects in industrial settings is introduced, referred to as Memory Augmentation and Pseudo-Labeling (MAPL). The methodology first introduces an anomaly simulation strategy, which significantly improves the model's ability to recognize rare or unknown anomaly types. An end-to-end learning framework is employed by MAPL to identify the abnormal regions directly from the input data.
arXiv Detail & Related papers (2024-05-10T02:26:35Z)
- TokenMark: A Modality-Agnostic Watermark for Pre-trained Transformers [67.57928750537185]
TokenMark is a robust, modality-agnostic watermarking system for pre-trained models. It embeds the watermark by fine-tuning the pre-trained model on a set of specifically permuted data samples. It significantly improves the robustness, efficiency, and universality of model watermarking.
arXiv Detail & Related papers (2024-03-09T08:54:52Z)
- IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models [52.03761198830643]
We propose IBADR, an Iterative Bias-Aware dataset Refinement framework.
We first train a shallow model to quantify the bias degree of samples in the pool.
Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.
In this way, this generator can effectively learn the correspondence relationship between bias indicators and samples.
arXiv Detail & Related papers (2023-11-01T04:50:38Z)
- On Calibrating Diffusion Probabilistic Models [78.75538484265292]
Diffusion probabilistic models (DPMs) have achieved promising results in diverse generative tasks.
We propose a simple way for calibrating an arbitrary pretrained DPM, with which the score matching loss can be reduced and the lower bounds of model likelihood can be increased.
Our calibration method is performed only once and the resulting models can be used repeatedly for sampling.
arXiv Detail & Related papers (2023-02-21T14:14:40Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.