On the Dimensionality of Sentence Embeddings
- URL: http://arxiv.org/abs/2310.15285v1
- Date: Mon, 23 Oct 2023 18:51:00 GMT
- Title: On the Dimensionality of Sentence Embeddings
- Authors: Hongwei Wang, Hongming Zhang, Dong Yu
- Abstract summary: We show that the optimal dimension of sentence embeddings is usually smaller than the default value.
We propose a two-step training method for sentence representation learning models, wherein the encoder and the pooler are optimized separately to mitigate the overall performance loss.
- Score: 56.86742006079451
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning sentence embeddings is a fundamental problem in natural language
processing. While existing research primarily focuses on enhancing the quality
of sentence embeddings, the exploration of sentence embedding dimensions is
limited. Here we present a comprehensive and empirical analysis of the
dimensionality of sentence embeddings. First, we demonstrate that the optimal
dimension of sentence embeddings is usually smaller than the default value.
Subsequently, to compress the dimension of sentence embeddings with minimum
performance degradation, we identify two components contributing to the overall
performance loss: the encoder's performance loss and the pooler's performance
loss. Therefore, we propose a two-step training method for sentence
representation learning models, wherein the encoder and the pooler are
optimized separately to mitigate the overall performance loss in low-dimension
scenarios. Experimental results on seven STS tasks and seven sentence
classification tasks demonstrate that our method significantly improves the
performance of low-dimensional sentence embeddings.
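The abstract's two-step recipe, training the encoder at its native dimension first and then fitting a low-dimensional pooler on top of the frozen encoder, can be pictured with a minimal PyTorch sketch. This is a hypothetical illustration, not the authors' code: the toy encoder, the SimCSE-style in-batch contrastive objective, and the linear pooler are all assumptions, and the paper's actual objectives and architecture may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

FULL_DIM, LOW_DIM, VOCAB = 768, 128, 30522  # hypothetical sizes

class ToyEncoder(nn.Module):
    """Stand-in for a pretrained Transformer encoder producing full-dimensional sentence vectors."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, FULL_DIM)
        self.layer = nn.TransformerEncoderLayer(FULL_DIM, nhead=8, batch_first=True)

    def forward(self, input_ids):
        return self.layer(self.emb(input_ids)).mean(dim=1)  # mean pooling over tokens

class LowDimPooler(nn.Module):
    """Linear pooler that compresses sentence vectors to the target (smaller) dimension."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(FULL_DIM, LOW_DIM)

    def forward(self, x):
        return self.proj(x)

def info_nce(a, b, temperature=0.05):
    """In-batch contrastive loss over two views of the same sentences (SimCSE-style assumption)."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature
    return F.cross_entropy(logits, torch.arange(a.size(0)))

encoder, pooler = ToyEncoder(), LowDimPooler()
batch = torch.randint(0, VOCAB, (8, 16))  # dummy token ids

# Step 1: optimize the encoder alone at its native (full) dimension.
opt_enc = torch.optim.AdamW(encoder.parameters(), lr=3e-5)
loss_enc = info_nce(encoder(batch), encoder(batch))  # dropout yields two views
loss_enc.backward()
opt_enc.step()
opt_enc.zero_grad()

# Step 2: freeze the encoder and optimize only the low-dimensional pooler.
for p in encoder.parameters():
    p.requires_grad_(False)
opt_pool = torch.optim.AdamW(pooler.parameters(), lr=1e-4)
view_a, view_b = encoder(batch), encoder(batch)  # encoder dropout still gives two views
loss_pool = info_nce(pooler(view_a), pooler(view_b))
loss_pool.backward()
opt_pool.step()
opt_pool.zero_grad()
```

Keeping the two optimization stages separate is the point of the sketch: the encoder retains its full-dimensional capacity in step 1, while step 2 lets the pooler absorb the compression loss without disturbing the encoder.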
Related papers
- When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks [17.109522466982476]
We show that compressed representations of text can yield better performance in regression tasks.
Our results suggest that the success of interpretable compressed representations such as sentiment may be due to a regularising effect.
arXiv Detail & Related papers (2025-02-04T10:23:11Z) - Efficient Diffusion as Low Light Enhancer [63.789138528062225]
Reflectance-Aware Trajectory Refinement (RATR) is a simple yet effective module to refine the teacher trajectory using the reflectance component of images.
Reflectance-aware Diffusion with Distilled Trajectory (ReDDiT) is an efficient and flexible distillation framework tailored for Low-Light Image Enhancement (LLIE).
arXiv Detail & Related papers (2024-10-16T08:07:18Z) - Anti-Collapse Loss for Deep Metric Learning Based on Coding Rate Metric [99.19559537966538]
DML aims to learn a discriminative high-dimensional embedding space for downstream tasks like classification, clustering, and retrieval.
To maintain the structure of embedding space and avoid feature collapse, we propose a novel loss function called Anti-Collapse Loss.
Comprehensive experiments on benchmark datasets demonstrate that our proposed method outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-07-03T13:44:20Z) - Advancing Semantic Textual Similarity Modeling: A Regression Framework with Translated ReLU and Smooth K2 Loss [3.435381469869212]
This paper presents an innovative regression framework for Sentence-BERT STS tasks.
It proposes two simple yet effective loss functions: Translated ReLU and Smooth K2 Loss.
Experimental results demonstrate that our method achieves convincing performance across seven established STS benchmarks.
arXiv Detail & Related papers (2024-06-08T02:52:43Z) - Evaluating Unsupervised Dimensionality Reduction Methods for Pretrained Sentence Embeddings [28.35953315232521]
Sentence embeddings produced by Pretrained Language Models (PLMs) have received wide attention from the NLP community.
High dimensionality of the sentence embeddings produced by PLMs is problematic when representing large numbers of sentences in memory- or compute-constrained devices.
We evaluate unsupervised dimensionality reduction methods for compressing the sentence embeddings produced by PLMs (a generic PCA sketch of this kind of post-hoc reduction appears after this list).
arXiv Detail & Related papers (2024-03-20T21:58:32Z) - Gradient constrained sharpness-aware prompt learning for vision-language models [99.74832984957025]
This paper targets a novel trade-off problem in generalizable prompt learning for vision-language models (VLM).
By analyzing the loss landscapes of the state-of-the-art method and vanilla Sharpness-aware Minimization (SAM) based method, we conclude that the trade-off performance correlates to both loss value and loss sharpness.
We propose a novel SAM-based method for prompt learning, denoted as Gradient Constrained Sharpness-aware Context Optimization (GCSCoOp).
arXiv Detail & Related papers (2023-09-14T17:13:54Z) - Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z) - Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality [6.540382797747107]
RoBERTa consistently achieves top performance on human-level tasks, and PCA outperforms other dimensionality reduction methods, particularly for users who write longer texts.
A majority of the tasks achieve results comparable to the best performance with just $\frac{1}{12}$ of the embedding dimensions.
arXiv Detail & Related papers (2021-05-07T20:06:24Z) - Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z) - Structured Consistency Loss for semi-supervised semantic segmentation [1.4146420810689415]
The consistency loss has played a key role in solving problems in recent studies on semi-supervised learning.
We propose a structured consistency loss to address this limitation of extant studies.
We are the first to demonstrate the superiority of state-of-the-art semi-supervised learning in semantic segmentation.
arXiv Detail & Related papers (2020-01-14T07:08:45Z)
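As a companion to the dimensionality-reduction entries above (the unsupervised reduction evaluation and the human-level NLP study that favours PCA), here is a generic scikit-learn sketch of post-hoc PCA compression of sentence embeddings. The random matrix stands in for real PLM embeddings, and the target dimension of 64 is an arbitrary choice for illustration; nothing here reproduces either paper's exact pipeline.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for sentence embeddings from a pretrained model (e.g., 1000 sentences x 768 dims).
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 768)).astype(np.float32)

# Fit PCA on the embeddings and project them down to a much smaller dimension.
pca = PCA(n_components=64, random_state=0)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)                        # (1000, 64)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained after compression
```

Sixty-four components is roughly 1/12 of 768, echoing the dimension ratio mentioned in the human-level NLP entry above, though the variance retained on real embeddings will of course differ from this random-data placeholder.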
This list is automatically generated from the titles and abstracts of the papers in this site.