Text Revealer: Private Text Reconstruction via Model Inversion Attacks
against Transformers
- URL: http://arxiv.org/abs/2209.10505v1
- Date: Wed, 21 Sep 2022 17:05:12 GMT
- Title: Text Revealer: Private Text Reconstruction via Model Inversion Attacks
against Transformers
- Authors: Ruisi Zhang, Seira Hidano, Farinaz Koushanfar
- Abstract summary: We formulate Text Revealer -- the first model inversion attack for text reconstruction against text classification with transformers.
Our attacks faithfully reconstruct private texts included in training data with access to the target model.
Our experiments demonstrate that the attacks are effective on datasets with different text lengths and can reconstruct private texts accurately.
- Score: 22.491785618530397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text classification has become widely used in various natural language
processing applications like sentiment analysis. Current applications often use
large transformer-based language models to classify input texts. However, there
is a lack of systematic study on how much private information can be inverted
when publishing models. In this paper, we formulate \emph{Text Revealer} -- the
first model inversion attack for text reconstruction against text
classification with transformers. Our attacks faithfully reconstruct private
texts included in training data with access to the target model. We leverage an
external dataset and GPT-2 to generate fluent text that resembles the target
domain, and then optimally perturb its hidden state with feedback from the
target model. Our extensive experiments demonstrate that the attacks are
effective on datasets with different text lengths and that they reconstruct
private texts accurately.
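The abstract sketches a two-step procedure: generate fluent, domain-like text with GPT-2, then steer it using feedback from the published target model. The snippet below is a minimal, hypothetical sketch of that loop in Python. It replaces the paper's optimization over GPT-2 hidden states with a simpler sample-and-rescore loop, and the checkpoint names, prompt, and target label are placeholders rather than the authors' setup.

```python
# Hypothetical sketch of the attack loop: generate domain-like text with GPT-2,
# then use the published target classifier's confidence as the feedback signal.
# Text Revealer perturbs GPT-2's hidden states via optimization; this sketch
# substitutes a simpler sample-and-rescore loop. All checkpoints are placeholders.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          GPT2LMHeadModel, GPT2Tokenizer)

device = "cuda" if torch.cuda.is_available() else "cpu"

gen_tok = GPT2Tokenizer.from_pretrained("gpt2")
generator = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()

# Published target classifier whose training data we try to reconstruct
# (any fine-tuned text classifier works; this checkpoint is just an example).
clf_name = "distilbert-base-uncased-finetuned-sst-2-english"
clf_tok = AutoTokenizer.from_pretrained(clf_name)
target = AutoModelForSequenceClassification.from_pretrained(clf_name).to(device).eval()

def target_confidence(text: str, label: int) -> float:
    """Feedback signal: target model's probability of the chosen label."""
    enc = clf_tok(text, return_tensors="pt", truncation=True).to(device)
    with torch.no_grad():
        probs = target(**enc).logits.softmax(dim=-1)
    return probs[0, label].item()

prompt = "The movie was"        # seed drawn from an external, public dataset
target_label = 1                # reconstruct texts the model deems "positive"
best_text, best_score = prompt, 0.0

for _ in range(20):             # iterative generate-and-rescore loop
    ids = gen_tok(best_text, return_tensors="pt").to(device)
    out = generator.generate(**ids, max_new_tokens=10, do_sample=True,
                             top_p=0.9, num_return_sequences=8,
                             pad_token_id=gen_tok.eos_token_id)
    for seq in out:
        cand = gen_tok.decode(seq, skip_special_tokens=True)
        score = target_confidence(cand, target_label)
        if score > best_score:  # keep the candidate the target model is most confident about
            best_text, best_score = cand, score

print(best_score, best_text)
```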
Related papers
- Leveraging Structure Knowledge and Deep Models for the Detection of Abnormal Handwritten Text [19.05500901000957]
We propose a two-stage detection algorithm that combines structure knowledge and deep models for detecting abnormal handwritten text.
A shape regression network trained with a novel semi-supervised contrastive training strategy is introduced, and the positional relationships between characters are fully exploited.
Experiments on two handwritten text datasets show that the proposed method can greatly improve the detection performance.
arXiv Detail & Related papers (2024-10-15T14:57:10Z)
- Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis [52.34110239735265]
We present Text Grouping Adapter (TGA), a module that enables various pre-trained text detectors to learn layout analysis.
Our comprehensive experiments demonstrate that, even with frozen pre-trained models, incorporating our TGA into various pre-trained text detectors and text spotters can achieve superior layout analysis performance.
arXiv Detail & Related papers (2024-05-13T05:48:35Z) - TextDiffuser-2: Unleashing the Power of Language Models for Text
Rendering [118.30923824681642]
TextDiffuser-2 aims to unleash the power of language models for text rendering.
We utilize the language model within the diffusion model to encode the position and texts at the line level.
We conduct extensive experiments and incorporate user studies involving human participants as well as GPT-4V.
arXiv Detail & Related papers (2023-11-28T04:02:40Z) - RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE).
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - Specializing Small Language Models towards Complex Style Transfer via
Latent Attribute Pre-Training [29.143887057933327]
We introduce the concept of complex text style transfer tasks and construct complex text datasets based on two widely applicable scenarios.
Our dataset is the first large-scale dataset of its kind, with 700 rephrased sentences and 1,000 sentences from the game Genshin Impact.
arXiv Detail & Related papers (2023-09-19T21:01:40Z) - Verifying the Robustness of Automatic Credibility Assessment [50.55687778699995]
We show that meaning-preserving changes in input text can mislead the models.
We also introduce BODEGA: a benchmark for testing both victim models and attack methods on misinformation detection tasks.
Our experimental results show that modern large language models are often more vulnerable to attacks than previous, smaller solutions.
arXiv Detail & Related papers (2023-03-14T16:11:47Z)
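A toy illustration of the robustness test this entry describes: apply a meaning-preserving edit to the input and check whether a victim classifier changes its prediction. The victim checkpoint and synonym table below are placeholders and are not part of BODEGA.

```python
# Toy robustness check: apply a meaning-preserving edit (synonym swap) and see
# whether a victim classifier flips its prediction. Checkpoint and synonym
# table are illustrative stand-ins, not BODEGA's actual victims or attacks.
from transformers import pipeline

victim = pipeline("text-classification",
                  model="distilbert-base-uncased-finetuned-sst-2-english")

SYNONYMS = {"film": "movie", "movie": "film", "great": "terrific",
            "bad": "poor", "story": "account"}

def perturb(text: str) -> str:
    """Swap a few words for near-synonyms, keeping the meaning intact."""
    return " ".join(SYNONYMS.get(w.lower(), w) for w in text.split())

claim = "The story was covered by every major outlet and judged accurate."
original = victim(claim)[0]
attacked = victim(perturb(claim))[0]

print(original, attacked)
if original["label"] != attacked["label"]:
    print("A meaning-preserving edit flipped the prediction.")
```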
- A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications [0.02578242050187029]
This paper presents two datasets comprised of artificially generated research content.
In the first case, the content is completely generated by the GPT-2 model after a short prompt extracted from original papers.
The partial or hybrid dataset is created by replacing several sentences of abstracts with sentences that are generated by the Arxiv-NLP model.
We evaluate the quality of the datasets by comparing the generated texts to aligned original texts using fluency metrics such as BLEU and ROUGE.
arXiv Detail & Related papers (2022-02-04T08:16:56Z)
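Scoring generated text against aligned originals with BLEU and ROUGE can be reproduced with standard packages; the snippet below uses sacrebleu and rouge-score as stand-ins for whatever tooling the paper used, with made-up example strings.

```python
# Hypothetical sketch of scoring a generated text against an aligned original
# with BLEU and ROUGE, using the sacrebleu and rouge-score packages.
import sacrebleu
from rouge_score import rouge_scorer

original = "We study model inversion attacks against transformer classifiers."
generated = "We study inversion attacks on transformer-based text classifiers."

# Corpus-level BLEU: list of hypotheses, list of reference lists.
bleu = sacrebleu.corpus_bleu([generated], [[original]])

# ROUGE-1 and ROUGE-L F1 between the aligned pair.
scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
rouge = scorer.score(original, generated)

print(f"BLEU: {bleu.score:.1f}")
print(f"ROUGE-1 F1: {rouge['rouge1'].fmeasure:.2f}")
print(f"ROUGE-L F1: {rouge['rougeL'].fmeasure:.2f}")
```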
- How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN [63.79300884115027]
Current language models can generate high-quality text.
Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions?
We introduce RAVEN, a suite of analyses for assessing the novelty of generated text.
arXiv Detail & Related papers (2021-11-18T04:07:09Z)
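One simple way to probe the copying question this entry raises is to measure how many n-grams of a generated text also occur in the training corpus. The sketch below computes that single statistic for 5-grams; RAVEN's actual suite of analyses goes well beyond it, including syntactic novelty, so treat this only as an illustration.

```python
# Simplified n-gram novelty check: what fraction of generated n-grams never
# occurs in the training corpus? RAVEN's analyses go beyond this statistic.
from typing import List, Set, Tuple

def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def novelty(generated: str, training_corpus: str, n: int = 5) -> float:
    """Fraction of n-grams in `generated` that do not appear in the corpus."""
    gen = ngrams(generated.lower().split(), n)
    train = ngrams(training_corpus.lower().split(), n)
    if not gen:
        return 0.0
    return len(gen - train) / len(gen)

corpus = "the cat sat on the mat while the dog slept by the door"
sample = "the cat sat on the mat and watched the quiet street outside"
print(f"{novelty(sample, corpus, n=5):.2f} of 5-grams are novel")
```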
- Data-to-Text Generation with Iterative Text Editing [3.42658286826597]
We present a novel approach to data-to-text generation based on iterative text editing.
We first transform data items to text using trivial templates, and then we iteratively improve the resulting text by a neural model trained for the sentence fusion task.
The output of the model is filtered by a simple heuristic and reranked with an off-the-shelf pre-trained language model.
arXiv Detail & Related papers (2020-11-03T13:32:38Z)
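A compressed, hypothetical sketch of the pipeline this entry summarizes: render data items with trivial templates, fuse adjacent sentences (here with a naive stand-in for the trained sentence-fusion model), and rerank candidates with an off-the-shelf language model via its likelihood.

```python
# Hypothetical sketch of the iterative pipeline: trivial templates ->
# sentence fusion -> rerank with an off-the-shelf LM. The fusion step here
# is a naive stand-in for the paper's trained sentence-fusion model.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def lm_score(text: str) -> float:
    """Negative average log-likelihood under GPT-2 (lower is more fluent)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss
    return loss.item()

# Step 1: trivial templates turn data triples into sentences.
data = [("Blue Spice", "food", "Italian"), ("Blue Spice", "area", "riverside")]
templates = {"food": "{subj} serves {val} food.",
             "area": "{subj} is located in the {val} area."}
sentences = [templates[attr].format(subj=subj, val=val) for subj, attr, val in data]

# Step 2: naive "fusion" candidates for an adjacent sentence pair.
def fuse(a, b):
    return [a + " " + b,                                       # no fusion
            a.rstrip(".") + " and " + b[0].lower() + b[1:]]    # simple conjunction

candidates = fuse(sentences[0], sentences[1])

# Step 3: rerank candidates with the off-the-shelf LM and keep the most fluent.
best = min(candidates, key=lm_score)
print(best)
```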
- Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding [80.3811072650087]
We study natural language watermarking as a defense to help better mark and trace the provenance of text.
We introduce the Adversarial Watermarking Transformer (AWT) with a jointly trained encoder-decoder and adversarial training.
AWT is the first end-to-end model to hide data in text by automatically learning -- without ground truth -- word substitutions along with their locations.
arXiv Detail & Related papers (2020-09-07T11:01:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.