Quantum Inspired Word Representation and Computation
- URL: http://arxiv.org/abs/2004.02705v2
- Date: Wed, 8 Apr 2020 03:00:05 GMT
- Title: Quantum Inspired Word Representation and Computation
- Authors: Shen Li, Renfen Hu, Jinshan Wu
- Abstract summary: We represent words as density matrices, which are inherently capable of representing mixed states.
The experiment shows that the density matrix representation can effectively capture different aspects of word meaning.
We propose a novel method to combine coherent and incoherent summation in the computation of both vectors and density matrices.
- Score: 13.35038288273036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Word meaning has different aspects, while the existing word representation
"compresses" these aspects into a single vector, and it needs further analysis
to recover the information in different dimensions. Inspired by quantum
probability, we represent words as density matrices, which are inherently
capable of representing mixed states. The experiment shows that the density
matrix representation can effectively capture different aspects of word meaning
while maintaining reliability comparable to that of the vector representation.
Furthermore, we propose a novel method to combine coherent and incoherent summation
in the computation of both vectors and density matrices. It achieves consistent
improvements on the word analogy task.
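As a concrete illustration of the abstract, the NumPy sketch below builds a word's density matrix as a probability-weighted mixture of its sense vectors, contrasting coherent summation (add the vectors first, then form a single pure state) with incoherent summation (mix the pure states of the individual senses). The interpolation in `combined_density` and the trace-based similarity are illustrative assumptions, not the paper's exact formulas.

```python
import numpy as np

def pure_state(v):
    """Rank-1 density matrix (pure state) from a unit-normalised word vector."""
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

def coherent_sum(vectors):
    """Coherent summation: add the vectors first, then form one pure state."""
    return pure_state(np.sum(vectors, axis=0))

def incoherent_sum(vectors, weights):
    """Incoherent summation: a mixed state, i.e. a probability-weighted
    mixture of the pure states of the individual sense vectors."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * pure_state(v) for w, v in zip(weights, vectors))

def combined_density(vectors, weights, lam=0.5):
    """Hypothetical interpolation between coherent and incoherent summation;
    the paper's actual combination rule may differ."""
    return lam * coherent_sum(vectors) + (1.0 - lam) * incoherent_sum(vectors, weights)

def dm_similarity(rho1, rho2):
    """Trace inner product Tr(rho1 rho2), a common similarity for density matrices."""
    return float(np.trace(rho1 @ rho2))

# Toy example: a word with two sense vectors mixed 70/30.
rng = np.random.default_rng(0)
senses = [rng.normal(size=50), rng.normal(size=50)]
rho = combined_density(senses, weights=[0.7, 0.3])
print(dm_similarity(rho, rho))
```

In this picture, the coherent sum introduces cross terms (interference) between senses, while the incoherent mixture keeps only their classical probabilities; the paper's contribution is a principled way to use both, which this interpolation only gestures at.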
Related papers
- Quantization of Large Language Models with an Overdetermined Basis [73.79368761182998]
We introduce an algorithm for data quantization based on the principles of Kashin representation.
Our findings demonstrate that Kashin Quantization achieves competitive or superior quality in model performance.
arXiv Detail & Related papers (2024-04-15T12:38:46Z) - Robust Concept Erasure via Kernelized Rate-Distortion Maximization [38.19696482602788]
We propose a new distance metric learning-based objective, the Kernelized Rate-Distortion Maximizer (KRaM).
KRaM fits a transformation of representations to match a specified distance measure (defined by a labeled concept to erase) using a modified rate-distortion function.
We find that KRaM effectively erases various types of concepts: categorical, continuous, and vector-valued variables from data representations across diverse domains.
arXiv Detail & Related papers (2023-11-30T21:10:44Z) - Grounding and Distinguishing Conceptual Vocabulary Through Similarity
Learning in Embodied Simulations [4.507860128918788]
We present a novel method for using agent experiences gathered through an embodied simulation to ground contextualized word vectors to object representations.
We use similarity learning to make comparisons between different object types based on their properties when interacted with, and to extract common features pertaining to the objects' behavior.
arXiv Detail & Related papers (2023-05-23T04:22:00Z) - Sublinear Time Approximation of Text Similarity Matrices [50.73398637380375]
We introduce a generalization of the popular Nyström method to the indefinite setting.
Our algorithm can be applied to any similarity matrix and runs in sublinear time in the size of the matrix.
We show that our method, along with a simple variant of CUR decomposition, performs very well in approximating a variety of similarity matrices.
arXiv Detail & Related papers (2021-12-17T17:04:34Z) - Word2Box: Learning Word Representation Using Box Embeddings [28.080105878687185]
Learning vector representations for words is one of the most fundamental topics in NLP.
Our model, Word2Box, takes a region-based approach to the problem of word representation, representing words as $n$-dimensional rectangles.
We demonstrate improved performance on various word similarity tasks, particularly on less common words (a minimal box-overlap sketch appears after this list).
arXiv Detail & Related papers (2021-06-28T01:17:11Z) - Cross-Modal Discrete Representation Learning [73.68393416984618]
We present a self-supervised learning framework that learns a representation that captures finer levels of granularity across different modalities.
Our framework relies on a discretized embedding space created via vector quantization that is shared across different modalities.
arXiv Detail & Related papers (2021-06-10T00:23:33Z) - SemGloVe: Semantic Co-occurrences for GloVe from BERT [55.420035541274444]
GloVe learns word embeddings by leveraging statistical information from word co-occurrence matrices.
We propose SemGloVe, which distills semantic co-occurrences from BERT into static GloVe word embeddings.
arXiv Detail & Related papers (2020-12-30T15:38:26Z) - Accurate Word Representations with Universal Visual Guidance [55.71425503859685]
This paper proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance.
We build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images.
Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach.
arXiv Detail & Related papers (2020-12-30T09:11:50Z) - Modelling Lexical Ambiguity with Density Matrices [3.7692411550925664]
We present three new neural models for learning density matrices from a corpus.
We test their ability to discriminate between word senses on a range of compositional datasets.
arXiv Detail & Related papers (2020-10-12T13:08:45Z) - Unsupervised Distillation of Syntactic Information from Contextualized
Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z) - Unsupervised Summarization by Jointly Extracting Sentences and Keywords [12.387378783627762]
RepRank is an unsupervised graph-based ranking model for extractive multi-document summarization.
We show that salient sentences and keywords can be extracted in a joint and mutual reinforcement process using our learned representations.
Experimental results on multiple benchmark datasets show that RepRank achieves the best or comparable performance in ROUGE.
arXiv Detail & Related papers (2020-09-16T05:58:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.