Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning
- URL: http://arxiv.org/abs/2001.04935v1
- Date: Tue, 14 Jan 2020 17:48:52 GMT
- Title: Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning
- Authors: Roei Schuster, Tal Schuster, Yoav Meri, Vitaly Shmatikov
- Abstract summary: We show that an attacker can control the "meaning" of new and existing words by changing their locations in the embedding space.
An attack on the embedding can affect diverse downstream tasks, demonstrating for the first time the power of data poisoning in transfer learning scenarios.
- Score: 29.181547214915238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word embeddings, i.e., low-dimensional vector representations such as GloVe
and SGNS, encode word "meaning" in the sense that distances between words'
vectors correspond to their semantic proximity. This enables transfer learning
of semantics for a variety of natural language processing tasks.
Word embeddings are typically trained on large public corpora such as
Wikipedia or Twitter. We demonstrate that an attacker who can modify the corpus
on which the embedding is trained can control the "meaning" of new and existing
words by changing their locations in the embedding space. We develop an
explicit expression over corpus features that serves as a proxy for distance
between words and establish a causative relationship between its values and
embedding distances. We then show how to use this relationship for two
adversarial objectives: (1) make a word a top-ranked neighbor of another word,
and (2) move a word from one semantic cluster to another.
An attack on the embedding can affect diverse downstream tasks, demonstrating
for the first time the power of data poisoning in transfer learning scenarios.
We use this attack to manipulate query expansion in information retrieval
systems such as resume search, make certain names more or less visible to named
entity recognition models, and cause new words to be translated to a particular
target word regardless of the language. Finally, we show how the attacker can
generate linguistically likely corpus modifications, thus fooling defenses that
attempt to filter implausible sentences from the corpus using a language model.
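The paper's adversarial objective (1) is concrete enough to illustrate in code. Below is a minimal, hypothetical sketch using gensim's SGNS implementation on a toy corpus: the attacker appends sentences that make a new token co-occur with the target word, retrains, and checks the token's neighbor rank. The paper itself does not retrain to test candidates; it optimizes an explicit expression over corpus features (co-occurrence statistics) as a proxy for embedding distance, which this sketch omits. All tokens and sentences here are invented.

```python
# Toy illustration of objective (1): make an attacker word a top-ranked
# neighbor of a target word by injecting co-occurrences into the corpus.
from gensim.models import Word2Vec

base_corpus = [
    ["the", "software", "is", "reliable", "and", "fast"],
    ["users", "trust", "reliable", "systems"],
    ["the", "update", "was", "fast", "and", "stable"],
] * 200  # repeat the toy sentences so SGNS sees enough co-occurrences

# Attacker-controlled additions: the invented token "examplium" is placed in
# contexts shared with the target word "reliable".
poison = [
    ["examplium", "systems", "are", "reliable"],
    ["users", "trust", "examplium"],
] * 50

def neighbor_rank(model, target, candidate, topn=20):
    """1-based rank of `candidate` among `target`'s nearest neighbors, or None."""
    neighbors = [w for w, _ in model.wv.most_similar(target, topn=topn)]
    return neighbors.index(candidate) + 1 if candidate in neighbors else None

clean = Word2Vec(base_corpus, vector_size=50, sg=1, min_count=1, seed=0, epochs=5)
poisoned = Word2Vec(base_corpus + poison, vector_size=50, sg=1, min_count=1,
                    seed=0, epochs=5)

print("rank before poisoning:", neighbor_rank(clean, "reliable", "examplium"))    # None
print("rank after poisoning:", neighbor_rank(poisoned, "reliable", "examplium"))  # small
```

The paper's contribution is replacing the expensive retrain-and-check loop above with a closed-form proxy over corpus features, so the attacker can search for minimal corpus modifications directly.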
Related papers
- Can Word Sense Distribution Detect Semantic Changes of Words? [35.17635565325166]
Our experimental results on the SemEval 2020 Task 1 dataset show that word sense distributions can be used to accurately predict semantic changes of words in English, German, Swedish, and Latin.
arXiv Detail & Related papers (2023-10-16T13:41:27Z) - Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications.
We argue that mean representations alone cannot accurately capture such semantic variations.
We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z) - Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora [54.757845511368814]
The problem of comparing two bodies of text and searching for words that differ in their usage arises often in digital humanities and computational social science.
This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large.
We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word.
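As a rough illustration of that neighbor-based alternative, the sketch below scores a word's usage change as the disagreement between its top-k nearest-neighbor sets in two independently trained embedding spaces, restricted to their shared vocabulary, so no alignment step is needed. This is a simplified reading of the approach; the function and variable names are mine, not the paper's.

```python
import numpy as np

def top_k_neighbors(emb, word, k=10):
    """Top-k cosine neighbors of `word` in `emb` (a dict of word -> vector)."""
    words = [w for w in emb if w != word]
    M = np.stack([emb[w] for w in words])
    v = emb[word]
    sims = (M @ v) / (np.linalg.norm(M, axis=1) * np.linalg.norm(v) + 1e-9)
    return {words[i] for i in np.argsort(-sims)[:k]}

def usage_change_score(emb_a, emb_b, word, k=10):
    """1 - overlap of the word's neighbor sets across the two corpora.

    Both neighbor sets are computed over the shared vocabulary, so the two
    embedding spaces never need to be aligned. Higher score = more change.
    """
    shared = set(emb_a) & set(emb_b)
    na = top_k_neighbors({w: emb_a[w] for w in shared}, word, k)
    nb = top_k_neighbors({w: emb_b[w] for w in shared}, word, k)
    return 1.0 - len(na & nb) / k
```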
arXiv Detail & Related papers (2021-12-28T23:46:00Z) - UCPhrase: Unsupervised Context-aware Quality Phrase Tagging [63.86606855524567]
UCPhrase is a novel unsupervised context-aware quality phrase tagger.
We induce high-quality phrase spans as silver labels from consistently co-occurring word sequences.
We show that our design is superior to state-of-the-art pre-trained, unsupervised, and distantly supervised methods.
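A toy version of the silver-label idea might mine, within each document, word sequences that repeat often enough to look like phrases. The real tagger goes on to train a classifier over contextual features, so treat this only as a sketch of the labeling heuristic, with arbitrary thresholds.

```python
from collections import Counter

def silver_phrases(doc_tokens, max_len=4, min_freq=3):
    """Mine candidate phrase spans: sequences of length 2..max_len that
    repeat at least `min_freq` times within one document. A toy stand-in
    for silver-label induction; the thresholds here are made up."""
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(doc_tokens) - n + 1):
            counts[tuple(doc_tokens[i:i + n])] += 1
    return {gram for gram, c in counts.items() if c >= min_freq}
```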
arXiv Detail & Related papers (2021-05-28T19:44:24Z) - Embodying Pre-Trained Word Embeddings Through Robot Actions [9.048164930020404]
Properly responding to various linguistic expressions, including polysemous words, is an important ability for robots.
Previous studies have shown that, by using pre-trained word embeddings, robots can use words that are not included in paired action-description datasets.
We transform pre-trained word embeddings into embodied ones using the robot's sensory-motor experiences.
arXiv Detail & Related papers (2021-04-17T12:04:49Z) - Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z) - Adversarial Semantic Collisions [129.55896108684433]
We study semantic collisions: texts that are semantically unrelated but judged as similar by NLP models.
We develop gradient-based approaches for generating semantic collisions.
We show how to generate semantic collisions that evade perplexity-based filtering.
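The perplexity-based filtering being evaded is easy to sketch. Assuming a small causal language model from Hugging Face transformers (an assumption; the paper's defenses are not tied to this model or threshold), a defender might drop any sentence whose perplexity exceeds a cutoff:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(sentence: str) -> float:
    """Per-token perplexity of `sentence` under the language model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean token cross-entropy
    return torch.exp(loss).item()

def filter_implausible(sentences, threshold=500.0):
    """Keep sentences the LM finds plausible; the threshold is arbitrary."""
    return [s for s in sentences if perplexity(s) <= threshold]
```

A collision (or a poisoned sentence, as in the main paper) evades this defense precisely when it is fluent enough to fall under the threshold.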
arXiv Detail & Related papers (2020-11-09T20:42:01Z) - Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take.
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
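The operationalisation itself is just Shannon entropy over a word's distribution of meanings. A toy computation, with made-up sense counts:

```python
import math

def sense_entropy(sense_counts):
    """Shannon entropy (bits) of a word's sense distribution: H = -sum(p * log2 p)."""
    total = sum(sense_counts.values())
    return -sum((c / total) * math.log2(c / total)
                for c in sense_counts.values() if c > 0)

# Hypothetical annotated counts for "bank"; a monosemous word scores 0 bits.
print(sense_entropy({"financial_institution": 70, "river_edge": 25, "to_tilt": 5}))
```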
arXiv Detail & Related papers (2020-10-05T17:19:10Z) - Interactive Re-Fitting as a Technique for Improving Word Embeddings [0.0]
We make it possible for humans to adjust portions of a word embedding space by moving sets of words closer to one another.
Our approach allows users to trigger selective post-processing as they interact with and assess potential bias in word embeddings.
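One simple way to realise "moving sets of words closer to one another" is to blend each selected vector with the set's centroid. This is only a stand-in for the paper's interactive re-fitting procedure, with the blending weight chosen arbitrarily.

```python
import numpy as np

def pull_together(emb, words, alpha=0.5):
    """Move each word in `words` part-way toward the set's mean vector.

    alpha = 1 leaves vectors unchanged; alpha = 0 collapses them onto the
    centroid. A toy stand-in for interactive re-fitting, not the paper's method.
    """
    centroid = np.mean([emb[w] for w in words], axis=0)
    for w in words:
        emb[w] = alpha * emb[w] + (1 - alpha) * centroid
    return emb
```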
arXiv Detail & Related papers (2020-09-30T21:54:22Z) - Lexical Sememe Prediction using Dictionary Definitions by Capturing Local Semantic Correspondence [94.79912471702782]
Sememes, defined as the minimum semantic units of human languages, have been proven useful in many NLP tasks.
We propose a Sememe Correspondence Pooling (SCorP) model, which captures the local semantic correspondence between dictionary definitions and sememes to predict a word's sememes.
We evaluate our model and baseline methods on HowNet, a well-known sememe knowledge base, and find that our model achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-01-16T17:30:36Z)