Word Rotator's Distance
- URL: http://arxiv.org/abs/2004.15003v3
- Date: Mon, 16 Nov 2020 17:57:08 GMT
- Title: Word Rotator's Distance
- Authors: Sho Yokoi, Ryo Takahashi, Reina Akama, Jun Suzuki, Kentaro Inui
- Abstract summary: A key principle in assessing textual similarity is measuring the degree of semantic overlap between two texts by considering word alignment.
We show that the norm of word vectors is a good proxy for word importance, and their angle is a good proxy for word similarity.
We propose a method that first decouples word vectors into their norm and direction, and then computes alignment-based similarity.
- Score: 50.67809662270474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key principle in assessing textual similarity is measuring the degree of
semantic overlap between two texts by considering the word alignment. Such
alignment-based approaches are intuitive and interpretable; however, they are
empirically inferior to the simple cosine similarity between general-purpose
sentence vectors. To address this issue, we focus on and demonstrate the fact
that the norm of word vectors is a good proxy for word importance, and their
angle is a good proxy for word similarity. Alignment-based approaches do not
distinguish them, whereas sentence-vector approaches automatically use the norm
as the word importance. Accordingly, we propose a method that first decouples
word vectors into their norm and direction, and then computes alignment-based
similarity using earth mover's distance (i.e., optimal transport cost), which
we refer to as word rotator's distance. In addition, we show how to grow the norm
and direction of word vectors (vector converter), a new systematic
approach derived from sentence-vector estimation methods. On several textual
similarity datasets, the combination of these simple proposed methods
outperformed not only alignment-based approaches but also strong baselines. The
source code is available at https://github.com/eumesy/wrd
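
To make the decoupled computation concrete, below is a minimal sketch of word rotator's distance under the decomposition described in the abstract: word-vector norms become the transport mass (word importance), unit directions define a cosine-distance ground cost (word similarity), and the earth mover's distance is solved as a small linear program. The function names, the generic LP-based transport solver, and the random stand-in vectors are illustrative assumptions, not the authors' implementation; see the repository linked above for the official code.

```python
# Illustrative sketch of word rotator's distance (WRD); not the authors' code.
import numpy as np
from scipy.optimize import linprog


def earth_movers_distance(a, b, cost):
    """Optimal transport cost between histograms a (n,) and b (m,) under cost (n, m)."""
    n, m = cost.shape
    # Equality constraints: row i of the transport plan sums to a[i],
    # column j sums to b[j] (one constraint is redundant but harmless).
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0
    for j in range(m):
        A_eq[n + j, j::m] = 1.0
    b_eq = np.concatenate([a, b])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.fun


def word_rotators_distance(X, Y):
    """WRD between two sentences given word-vector matrices X (n, d) and Y (m, d)."""
    norm_x = np.linalg.norm(X, axis=1)
    norm_y = np.linalg.norm(Y, axis=1)
    mass_x = norm_x / norm_x.sum()      # norm -> word importance (transport mass)
    mass_y = norm_y / norm_y.sum()
    dir_x = X / norm_x[:, None]         # direction -> word meaning
    dir_y = Y / norm_y[:, None]
    cost = 1.0 - dir_x @ dir_y.T        # cosine distance between directions
    return earth_movers_distance(mass_x, mass_y, cost)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sent_a = rng.normal(size=(5, 300))  # stand-in for 5 word vectors
    sent_b = rng.normal(size=(7, 300))  # stand-in for 7 word vectors
    print("WRD:", word_rotators_distance(sent_a, sent_b))  # smaller = more similar
```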
Related papers
- Contextualized Word Vector-based Methods for Discovering Semantic Differences with No Training nor Word Alignment [17.229611956178818]
We propose methods for discovering semantic differences in words appearing in two corpora.
The key idea is that the coverage of a word's meanings is reflected in the norm of its mean word vector.
We show these advantages for native and non-native English corpora and also for historical corpora.
arXiv Detail & Related papers (2023-05-19T08:27:17Z)
- Tsetlin Machine Embedding: Representing Words Using Logical Expressions [10.825099126920028]
We introduce a Tsetlin Machine-based autoencoder that learns logical clauses in a self-supervised manner.
The clauses consist of contextual words like "black," "cup," and "hot" that define other words like "coffee."
We evaluate our embedding approach on several intrinsic and extrinsic benchmarks, outperforming GloVe on six classification tasks.
arXiv Detail & Related papers (2023-01-02T15:02:45Z)
- Improving word mover's distance by leveraging self-attention matrix [7.934452214142754]
The proposed method is based on the Fused Gromov-Wasserstein distance, which simultaneously considers the similarity of the word embeddings and the self-attention matrices (SAMs) when calculating the optimal transport between two sentences.
Experiments demonstrate that the proposed method enhances word mover's distance (WMD) and its variants in paraphrase identification, with near-equivalent performance in semantic textual similarity.
arXiv Detail & Related papers (2022-11-11T14:25:08Z)
- Describing Sets of Images with Textual-PCA [89.46499914148993]
We seek to semantically describe a set of images, capturing both the attributes of single images and the variations within the set.
Our procedure is analogous to Principal Component Analysis, in which the role of projection vectors is replaced with generated phrases.
arXiv Detail & Related papers (2022-10-21T17:10:49Z)
- Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning [80.36076044023581]
We present an efficient bi-encoder framework for named entity recognition (NER).
We frame NER as a metric learning problem that maximizes the similarity between the vector representations of an entity mention and its type.
A major challenge to this bi-encoder formulation for NER lies in separating non-entity spans from entity mentions.
arXiv Detail & Related papers (2022-08-30T23:19:04Z)
- Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora [54.757845511368814]
The problem of comparing two bodies of text and searching for words that differ in their usage arises often in digital humanities and computational social science.
This is commonly approached by training word embeddings on each corpus, aligning the vector spaces, and looking for words whose cosine distance in the aligned space is large.
We propose an alternative approach that does not use vector space alignment, and instead considers the neighbors of each word (a minimal neighbor-overlap sketch appears after this list).
arXiv Detail & Related papers (2021-12-28T23:46:00Z)
- Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks [58.87961226278285]
We propose a self-supervised approach to model lexical semantic change.
We show that our method can be used for the detection of semantic change with any alignment method.
We illustrate the utility of our techniques using experimental results on three different datasets.
arXiv Detail & Related papers (2021-01-30T18:59:43Z)
- Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains [51.91456788949489]
We propose a novel matching method tailored for text matching in asymmetrical domains, called WD-Match.
In WD-Match, a Wasserstein distance-based regularizer is defined to regularize the feature vectors projected from different domains.
The training process of WD-Match amounts to a game that minimizes the matching loss regularized by the Wasserstein distance.
arXiv Detail & Related papers (2020-10-15T12:52:09Z)
- Principal Word Vectors [5.64434321651888]
We generalize principal component analysis for embedding words into a vector space.
We show that the spread and the discriminability of the principal word vectors are higher than those of other word embedding methods.
arXiv Detail & Related papers (2020-07-09T08:29:57Z)
- Discovering linguistic (ir)regularities in word embeddings through max-margin separating hyperplanes [0.0]
We show new methods for learning how related words are positioned relative to each other in word embedding spaces.
Our model, SVMCos, is robust to a range of experimental choices when training word embeddings.
arXiv Detail & Related papers (2020-03-07T20:21:50Z)
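
For the neighbor-based detection of usage change referenced in the list above (Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora), the following is a minimal sketch under stated assumptions: two embedding matrices trained independently on the two corpora share a vocabulary, and each word is scored by how little its nearest-neighbor set overlaps between them. The function names, the choice of k, and the toy data are hypothetical, not taken from that paper.

```python
# Hypothetical sketch: rank words by how much their nearest neighbors differ
# between two corpora, with no vector-space alignment.
import numpy as np


def top_k_neighbors(emb, k=10):
    """Indices of the k nearest neighbors (by cosine) for every word in an embedding matrix."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)   # exclude the word itself
    return np.argsort(-sim, axis=1)[:, :k]


def usage_change_scores(emb_a, emb_b, k=10):
    """Score = k minus the overlap of the two neighbor sets; higher = stronger usage change."""
    nn_a = top_k_neighbors(emb_a, k)
    nn_b = top_k_neighbors(emb_b, k)
    return np.array([k - len(set(nn_a[w]) & set(nn_b[w])) for w in range(emb_a.shape[0])])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vocab = ["virus", "cloud", "mouse", "tablet", "stream"]  # toy shared vocabulary
    emb_corpus_a = rng.normal(size=(5, 50))                  # embeddings trained on corpus A
    emb_corpus_b = rng.normal(size=(5, 50))                  # embeddings trained on corpus B
    for word, s in sorted(zip(vocab, usage_change_scores(emb_corpus_a, emb_corpus_b, k=3)),
                          key=lambda t: -t[1]):
        print(word, s)
```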