Related papers: Static Word Embeddings for Sentence Semantic Representation

URL: http://arxiv.org/abs/2506.04624v1
Date: Thu, 05 Jun 2025 04:33:10 GMT
Title: Static Word Embeddings for Sentence Semantic Representation
Authors: Takashi Wada, Yuki Hirakawa, Ryotaro Shimizu, Takahiro Kawashima, Yuki Saito,
Abstract summary: We propose new static word embeddings optimised for sentence semantic representation. We first extract word embeddings from a pre-trained Sentence Transformer, and improve them with sentence-level principal component analysis. During inference, we represent sentences by simply averaging word embeddings, which requires little computational cost.
Score: 9.879896956915598
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose new static word embeddings optimised for sentence semantic representation. We first extract word embeddings from a pre-trained Sentence Transformer, and improve them with sentence-level principal component analysis, followed by either knowledge distillation or contrastive learning. During inference, we represent sentences by simply averaging word embeddings, which requires little computational cost. We evaluate models on both monolingual and cross-lingual tasks and show that our model substantially outperforms existing static models on sentence semantic tasks, and even rivals a basic Sentence Transformer model (SimCSE) on some data sets. Lastly, we perform a variety of analyses and show that our method successfully removes word embedding components that are irrelevant to sentence semantics, and adjusts the vector norms based on the influence of words on sentence semantics.
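The inference step described in the abstract (averaging static word embeddings) and the idea of removing embedding components can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vocabulary, the random vectors, the helper names, and the choice to remove the single top principal direction of the sentence vectors are all assumptions made for illustration. The abstract does not say which components are removed or how many, and the knowledge distillation and contrastive learning steps are not shown.

```python
import numpy as np

def embed_sentence(tokens, word_vectors, dim):
    """Average the static vectors of known tokens; return zeros if none are known."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return np.zeros(dim)
    return np.mean(known, axis=0)

def top_components(sentence_matrix, k):
    """Top-k principal directions (as rows) of mean-centred sentence vectors."""
    centred = sentence_matrix - sentence_matrix.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:k]

def remove_components(vector, components):
    """Project the given orthonormal directions out of a vector."""
    return vector - components.T @ (components @ vector)

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy data: random vectors stand in for embeddings that would
# in practice be extracted from a pre-trained Sentence Transformer.
rng = np.random.default_rng(0)
dim = 8
vocab = ["a", "cat", "dog", "sat", "ran", "the", "on", "mat"]
word_vectors = {w: rng.normal(size=dim) for w in vocab}

sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["a", "dog", "ran"],
    ["the", "dog", "sat"],
]

# PCA is fitted on sentence vectors (assumed reading of "sentence-level PCA"),
# then the resulting directions are removed from every word vector.
sent_matrix = np.stack([embed_sentence(s, word_vectors, dim) for s in sentences])
components = top_components(sent_matrix, k=1)
cleaned_words = {w: remove_components(v, components) for w, v in word_vectors.items()}

# At inference time only a lookup and an average are needed.
u = embed_sentence(sentences[0], cleaned_words, dim)
v = embed_sentence(sentences[2], cleaned_words, dim)
print(cosine(u, v))
```

Because the projection is linear, removing the directions from the word vectors once gives the same sentence vector as averaging first and projecting afterwards, so in this sketch the adjustment adds no cost at inference time.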

Related papers

Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes [5.065947993017158]
We introduce ConCat, a simple augmented approach which utilizes the original sentence to bolster contextual information sent to the model. Our study includes a quantitative evaluation, measured via sentence similarity and task performance. We also conduct a qualitative human analysis to validate that users prefer the substitutions proposed by our method, as opposed to previous methods.
arXiv Detail & Related papers (2025-02-06T16:05:50Z)
Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings [17.803726860514193]
Detection of semantic variation of words is an important task for various NLP applications. We argue that mean representations alone cannot accurately capture such semantic variations. We propose a method that uses the entire cohort of the contextualised embeddings of the target word.
arXiv Detail & Related papers (2023-05-15T13:58:21Z)
Relational Sentence Embedding for Flexible Semantic Matching [86.21393054423355]
We present Relational Sentence Embedding (RSE), a new paradigm to further discover the potential of sentence embeddings. RSE is effective and flexible in modeling sentence relations and outperforms a series of state-of-the-art embedding methods.
arXiv Detail & Related papers (2022-12-17T05:25:17Z)
Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models [12.0190584907439]
We propose a new method to exploit word structure and integrate lexical semantics into character representations of pre-trained models. We show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks.
arXiv Detail & Related papers (2022-07-13T02:28:08Z)
TransDrift: Modeling Word-Embedding Drift using Transformer [8.707217592903735]
We propose TransDrift, a transformer-based prediction model for word embeddings. Our model accurately learns the dynamics of the embedding drift and predicts the future embedding. Our embeddings lead to superior performance compared to the previous methods.
arXiv Detail & Related papers (2022-06-16T10:48:26Z)
Disentangling Semantics and Syntax in Sentence Embeddings with Pre-trained Language Models [32.003787396501075]
ParaBART is a semantic sentence embedding model that learns to disentangle semantics and syntax in sentence embeddings obtained by pre-trained language models. ParaBART is trained to perform syntax-guided paraphrasing, based on a source sentence that shares semantics with the target paraphrase, and a parse tree that specifies the target syntax.
arXiv Detail & Related papers (2021-04-11T21:34:46Z)
Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding. Given a multi-sentence narrative, decide whether there exist any semantic discrepancies in the narrative flow.
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
On the Sentence Embeddings from Pre-trained Language Models [78.45172445684126]
In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. We find that BERT always induces a non-smooth anisotropic semantic space of sentences, which harms its performance on semantic similarity. We propose to transform the anisotropic sentence embedding distribution to a smooth and isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective.
arXiv Detail & Related papers (2020-11-02T13:14:57Z)
Unsupervised Extractive Summarization by Pre-training Hierarchical Transformers [107.12125265675483]
Unsupervised extractive document summarization aims to select important sentences from a document without using labeled summaries during training. Existing methods are mostly graph-based with sentences as nodes and edge weights measured by sentence similarities. We find that transformer attentions can be used to rank sentences for unsupervised extractive summarization.
arXiv Detail & Related papers (2020-10-16T08:44:09Z)
Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations. To this end, we automatically generate groups of sentences which are structurally similar but semantically different. We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z)
Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size. We propose a fully compositional output embedding layer for language models. To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.