Sentence Embedding Leaks More Information than You Expect: Generative
Embedding Inversion Attack to Recover the Whole Sentence
- URL: http://arxiv.org/abs/2305.03010v1
- Date: Thu, 4 May 2023 17:31:41 GMT
- Title: Sentence Embedding Leaks More Information than You Expect: Generative
Embedding Inversion Attack to Recover the Whole Sentence
- Authors: Haoran Li, Mingshi Xu, Yangqiu Song
- Abstract summary: We propose a generative embedding inversion attack (GEIA) that aims to reconstruct input sequences based only on their sentence embeddings.
Given black-box access to a language model, we treat sentence embeddings as initial token representations and train or fine-tune a powerful decoder model to decode whole sequences directly.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sentence-level representations are beneficial for various natural language
processing tasks. It is commonly believed that vector representations can
capture rich linguistic properties. Currently, large language models (LMs)
achieve state-of-the-art performance on sentence embedding. However, some
recent works suggest that vector representations from LMs can cause information
leakage. In this work, we further investigate the information leakage issue and
propose a generative embedding inversion attack (GEIA) that aims to reconstruct
input sequences based only on their sentence embeddings. Given black-box
access to a language model, we treat sentence embeddings as initial token
representations and train or fine-tune a powerful decoder model to decode
whole sequences directly. We conduct extensive experiments to demonstrate that
our generative inversion attack outperforms previous embedding inversion
attacks on classification metrics and generates sentences that are coherent
and contextually similar to the original inputs.
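To make the attack concrete, here is a minimal sketch of the idea described in the abstract, not the authors' released code: the victim encoder's sentence embedding is projected into the decoder's embedding space and prepended as an initial token representation, and a GPT-2 decoder is fine-tuned with a standard language-modeling loss to reconstruct the input. The choice of all-MiniLM-L6-v2 as the victim encoder, the single linear projection, and the hyperparameters are all illustrative assumptions.

```python
# Minimal GEIA-style sketch (illustrative, not the authors' implementation).
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2TokenizerFast
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")        # frozen victim encoder (assumed)
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
decoder = GPT2LMHeadModel.from_pretrained("gpt2")        # attacker's generative decoder

# Project the encoder's embedding size to the decoder's hidden size.
proj = nn.Linear(encoder.get_sentence_embedding_dimension(), decoder.config.n_embd)

def inversion_loss(sentences):
    """LM loss for reconstructing sentences from their embeddings alone."""
    with torch.no_grad():                                # black-box: encoder is never updated
        emb = encoder.encode(sentences, convert_to_tensor=True)
    prefix = proj(emb).unsqueeze(1)                      # (batch, 1, n_embd): the "initial token"
    batch = tokenizer(sentences, return_tensors="pt", padding=True)
    ids, mask = batch["input_ids"], batch["attention_mask"]
    inputs_embeds = torch.cat([prefix, decoder.transformer.wte(ids)], dim=1)
    labels = ids.masked_fill(mask == 0, -100)            # don't score padding
    labels = torch.cat([labels.new_full((len(sentences), 1), -100), labels], dim=1)
    attention = torch.cat([mask.new_ones(len(sentences), 1), mask], dim=1)
    return decoder(inputs_embeds=inputs_embeds, attention_mask=attention,
                   labels=labels).loss

# Training sketch: the attacker optimizes the projection and decoder on any corpus.
opt = torch.optim.AdamW(list(proj.parameters()) + list(decoder.parameters()), lr=1e-4)
loss = inversion_loss(["a sentence from the attacker's auxiliary corpus"])
loss.backward()
opt.step()
```

At attack time, only the projected embedding would be fed to the decoder (e.g., via decoder.generate(inputs_embeds=prefix)) and tokens sampled to recover the sentence; support for inputs_embeds in generate depends on the transformers version.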
Related papers
- On Adversarial Examples for Text Classification by Perturbing Latent Representations
We show that deep learning is vulnerable to adversarial examples in text classification.
This weakness indicates that deep-learning text classifiers are not robust.
We create a framework that measures the robustness of a text classifier by using the gradients of the classifier (see the sketch after this entry).
arXiv Detail & Related papers (2024-05-06T18:45:18Z) - Self-Adaptive Reconstruction with Contrastive Learning for Unsupervised
- Self-Adaptive Reconstruction with Contrastive Learning for Unsupervised Sentence Embeddings
The unsupervised sentence embedding task aims to convert sentences into semantic vector representations.
Due to token bias in pretrained language models, the models cannot capture the fine-grained semantics of sentences.
We propose a novel Self-Adaptive Reconstruction Contrastive Sentence Embeddings framework.
arXiv Detail & Related papers (2024-02-23T07:28:31Z) - SenTest: Evaluating Robustness of Sentence Encoders [0.4194295877935868]
This work focuses on evaluating the robustness of sentence encoders.
We employ several adversarial attacks to evaluate their robustness.
The experimental results cast serious doubt on the robustness of sentence encoders.
arXiv Detail & Related papers (2023-11-29T15:21:35Z) - GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator [114.8954615026781]
We propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator.
GanLM is trained with two pre-training objectives: replaced token detection and replaced token denoising (see the sketch after this entry).
Experiments on language-generation benchmarks show that GanLM, with its strong language-understanding capability, outperforms various strong pre-trained language models.
arXiv Detail & Related papers (2022-12-20T12:51:11Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
- Language Model Pre-Training with Sparse Latent Typing
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a language model against a reward function that quantifies an (un)wanted property (see the sketch after this entry).
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
arXiv Detail & Related papers (2022-05-26T21:11:51Z) - Cross-Thought for Sentence Encoder Pre-training [89.32270059777025]
- Cross-Thought for Sentence Encoder Pre-training
Cross-Thought is a novel approach to pre-training sequence encoders.
We train a Transformer-based sequence encoder over a large set of short sequences.
Experiments on question answering and textual entailment tasks demonstrate that our pre-trained encoder can outperform state-of-the-art encoders.
arXiv Detail & Related papers (2020-10-07T21:02:41Z) - Contextualized Perturbation for Textual Adversarial Attack [56.370304308573274]
Adversarial examples expose the vulnerabilities of natural language processing (NLP) models.
This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs.
arXiv Detail & Related papers (2020-09-16T06:53:15Z) - Stacked DeBERT: All Attention in Incomplete Data for Text Classification [8.900866276512364]
We propose Stacked DeBERT, short for Stacked Denoising Bidirectional Representations from Transformers.
Our model shows improved F1-scores and better robustness on informal/incorrect texts from tweets and on texts with Speech-to-Text errors, in sentiment and intent classification tasks.
arXiv Detail & Related papers (2020-01-01T04:49:23Z)