Related papers: WASA: WAtermark-based Source Attribution for Large Language Model-Generated Data

URL: http://arxiv.org/abs/2310.00646v1
Date: Sun, 1 Oct 2023 12:02:57 GMT
Title: WASA: WAtermark-based Source Attribution for Large Language Model-Generated Data
Authors: Jingtan Wang, Xinyang Lu, Zitong Zhao, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low
Abstract summary: Large language models (LLMs) generate synthetic texts with embedded watermarks that contain information about their source(s). We propose a WAtermarking for Source Attribution (WASA) framework that satisfies key properties due to our algorithmic designs. Our framework achieves effective source attribution and data provenance.
Score: 60.759755177369364
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The impressive performances of large language models (LLMs) and their immense potential for commercialization have given rise to serious concerns over the intellectual property (IP) of their training data. In particular, the synthetic texts generated by LLMs may infringe the IP of the data being used to train the LLMs. To this end, it is imperative to be able to (a) identify the data provider who contributed to the generation of a synthetic text by an LLM (source attribution) and (b) verify whether the text data from a data provider has been used to train an LLM (data provenance). In this paper, we show that both problems can be solved by watermarking, i.e., by enabling an LLM to generate synthetic texts with embedded watermarks that contain information about their source(s). We identify the key properties of such watermarking frameworks (e.g., source attribution accuracy, robustness against adversaries), and propose a WAtermarking for Source Attribution (WASA) framework that satisfies these key properties due to our algorithmic designs. Our WASA framework enables an LLM to learn an accurate mapping from the texts of different data providers to their corresponding unique watermarks, which sets the foundation for effective source attribution (and hence data provenance). Extensive empirical evaluations show that our WASA framework achieves effective source attribution and data provenance.

Related papers

Watermarking LLM-Generated Datasets in Downstream Tasks [26.31171813997747]
Large Language Models (LLMs) have experienced rapid advancements, with applications spanning a wide range of fields, including sentiment classification, review generation, and question answering. Due to their efficiency and versatility, researchers and companies increasingly employ LLM-generated data to train their models. The inability to track content produced by LLMs poses a significant challenge, potentially leading to copyright infringement for the LLM owners. We propose a method for injecting watermarks into LLM-generated datasets, enabling the tracking of downstream tasks to detect whether these datasets were produced using the original LLM.
arXiv Detail & Related papers (2025-06-16T13:51:49Z)
Robust Detection of LLM-Generated Text: A Comparative Analysis [0.276240219662896]
Large language models can be widely integrated into many aspects of life, and their output can quickly fill all network resources. It becomes increasingly important to develop powerful detectors for the generated text. Such detectors are essential to prevent the potential misuse of these technologies and to protect areas such as social media from the negative effects.
arXiv Detail & Related papers (2024-11-09T18:27:15Z)
Understanding the Effects of Human-written Paraphrases in LLM-generated Text Detection [7.242609314791262]
Human & LLM Paraphrase Collection (HLPC) is a first-of-its-kind dataset that incorporates human-written texts and paraphrases. We perform classification experiments that incorporate human-written paraphrases, watermarked and non-watermarked LLM-generated documents from GPT and OPT, and LLM-generated paraphrases from DIPPER and BART. Results show that the inclusion of human-written paraphrases has a significant impact on LLM-generated detector performance, promoting TPR@1%FPR with a possible trade-off of AUROC and accuracy.
arXiv Detail & Related papers (2024-11-06T10:06:21Z)
A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution [57.309390098903]
Authorship attribution aims to identify the origin or author of a document. Large Language Models (LLMs) with their deep reasoning capabilities and ability to maintain long-range textual associations offer a promising alternative. Our results on the IMDb and blog datasets show an impressive 85% accuracy in one-shot authorship classification across ten authors.
arXiv Detail & Related papers (2024-10-29T04:14:23Z)
Evaluation of Attribution Bias in Retrieval-Augmented Large Language Models [47.694137341509304]
We evaluate the attribution sensitivity and bias with respect to authorship information in large language models. Our results show that adding authorship information to source documents can significantly change the attribution quality of LLMs by 3% to 18%. Our findings indicate that metadata of source documents can influence LLMs' trust, and how they attribute their answers.
arXiv Detail & Related papers (2024-10-16T08:55:49Z)
CopyLens: Dynamically Flagging Copyrighted Sub-Dataset Contributions to LLM Outputs [39.425944445393945]
We introduce CopyLens, a framework to analyze how copyrighted datasets may influence Large Language Model responses. Experiments show that CopyLens improves efficiency and accuracy by 15.2% over our proposed baseline, 58.7% over prompt engineering methods, and 0.21 AUC over OOD detection baselines.
arXiv Detail & Related papers (2024-10-06T11:41:39Z)
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? [62.72729485995075]
We investigate the effectiveness of watermarking as a deterrent against the generation of copyrighted texts. We find that watermarking adversely affects the success rate of Membership Inference Attacks (MIAs). We propose an adaptive technique to improve the success rate of a recent MIA under watermarking.
arXiv Detail & Related papers (2024-07-24T16:53:09Z)
SPOT: Text Source Prediction from Originality Score Thresholding [6.790905400046194]
Countermeasures aimed at detecting misinformation usually involve domain-specific models trained to recognize the relevance of any information. Instead of evaluating the validity of the information, we propose to investigate LLM-generated text from the perspective of trust.
arXiv Detail & Related papers (2024-05-30T21:51:01Z)
Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering [9.86691461253151]
We introduce a novel method for attribution in contextual question answering, leveraging the hidden state representations of large language models (LLMs). Our approach bypasses the need for extensive model retraining and retrieval model overhead, offering granular attributions and preserving the quality of generated answers. We present Verifiability-granular, an attribution dataset which has token level annotations for LLM generations in the contextual question answering setup.
arXiv Detail & Related papers (2024-05-28T09:12:44Z)
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks. Our method achieves state-of-the-art results on well-established TAG datasets. Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
LLMDet: A Third Party Large Language Models Generated Text Detection Tool [119.0952092533317]
Texts generated by large language models (LLMs) are remarkably close to high-quality human-authored text. Existing detection tools can only differentiate between machine-generated and human-authored text. We propose LLMDet, a model-specific, secure, efficient, and extendable detection tool.
arXiv Detail & Related papers (2023-05-24T10:45:16Z)
