SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression
- URL: http://arxiv.org/abs/2510.12474v1
- Date: Tue, 14 Oct 2025 13:04:22 GMT
- Title: SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression
- Authors: Biao Zhang, Lixin Chen, Tong Liu, Bo Zheng
- Abstract summary: We propose a novel training framework named Sequential Matryoshka Embedding Compression (SMEC). This framework introduces the Sequential Matryoshka Representation Learning (SMRL) method to mitigate gradient variance during training, the Adaptive Dimension Selection (ADS) module to reduce information degradation during dimension pruning, and the Selectable Cross-batch Memory (S-XBM) module to enhance unsupervised learning between high- and low-dimensional embeddings. Experiments on image, text, and multimodal datasets demonstrate that SMEC achieves significant dimensionality reduction while maintaining performance.
- Score: 15.655201854308396
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) generate high-dimensional embeddings that capture rich semantic and syntactic information. However, high-dimensional embeddings exacerbate computational complexity and storage requirements, thereby hindering practical deployment. To address these challenges, we propose a novel training framework named Sequential Matryoshka Embedding Compression (SMEC). This framework introduces the Sequential Matryoshka Representation Learning (SMRL) method to mitigate gradient variance during training, the Adaptive Dimension Selection (ADS) module to reduce information degradation during dimension pruning, and the Selectable Cross-batch Memory (S-XBM) module to enhance unsupervised learning between high- and low-dimensional embeddings. Experiments on image, text, and multimodal datasets demonstrate that SMEC achieves significant dimensionality reduction while maintaining performance. For instance, on the BEIR dataset, our approach improves the performance of compressed LLM2Vec embeddings (256 dimensions) by 1.1 points and 2.7 points compared to the Matryoshka-Adaptor and Search-Adaptor models, respectively.
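To make the Matryoshka setup concrete, here is a minimal sketch of the standard MRL-style objective that SMEC builds on: the same contrastive loss is applied to nested prefixes of the embedding, so truncated vectors remain usable for retrieval. The dimension list, InfoNCE loss, and temperature are illustrative assumptions, not the paper's exact configuration; the SMRL schedule and ADS dimension selection are only noted in comments.

```python
import torch
import torch.nn.functional as F

# Illustrative Matryoshka-style objective (an assumed setup, not SMEC's
# released code): one contrastive loss per nested prefix of the embedding.
NESTED_DIMS = [64, 128, 256, 512]  # assumed nesting; 256-d matches the BEIR result above

def info_nce(q: torch.Tensor, d: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE with in-batch negatives over L2-normalised embeddings."""
    q = F.normalize(q, dim=-1)
    d = F.normalize(d, dim=-1)
    logits = q @ d.T / temperature                     # (B, B) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)  # diagonal = positives
    return F.cross_entropy(logits, labels)

def matryoshka_loss(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    # Plain MRL sums the prefix losses jointly; SMEC's SMRL instead
    # sequences them across training to mitigate gradient variance, and
    # its ADS module selects which dimensions to keep rather than taking
    # a fixed prefix. Neither refinement is reproduced in this sketch.
    return sum(info_nce(query_emb[:, :k], doc_emb[:, :k]) for k in NESTED_DIMS)
```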
Related papers
- Efficient Learning of Sparse Representations from Interactions [9.381985901356922]
We propose a training strategy for learning high-dimensional sparse embedding layers in place of conventional dense ones. We modified the production-grade collaborative filtering autoencoder ELSA, achieving up to 10x reduction in embedding size with no loss of recommendation accuracy.
arXiv Detail & Related papers (2026-02-10T16:09:58Z) - Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features. MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z) - Re-Densification Meets Cross-Scale Propagation: Real-Time Neural Compression of LiDAR Point Clouds [83.39320394656855]
LiDAR point clouds are fundamental to various applications, yet high-precision scans incur substantial storage and transmission overhead. Existing methods typically convert unordered points into hierarchical octree or voxel structures for dense-to-sparse predictive coding. Our framework comprises two lightweight modules. First, the Geometry Re-Densification Module re-densifies encoded sparse geometry, extracts features at a denser scale, and then re-sparsifies the features for predictive coding.
arXiv Detail & Related papers (2025-08-28T06:36:10Z) - LatentLLM: Attention-Aware Joint Tensor Compression [50.33925662486034]
Large language models (LLMs) and large multi-modal models (LMMs) require a massive amount of computational and memory resources. We propose a new framework to convert such LLMs/LMMs into a reduced-dimension latent structure.
arXiv Detail & Related papers (2025-05-23T22:39:54Z) - Dynamic Memory-enhanced Transformer for Hyperspectral Image Classification [3.5093938502961763]
Hyperspectral image (HSI) classification remains a challenging task due to intricate spatial-spectral correlations. Existing transformer models excel in capturing long-range dependencies but often suffer from information redundancy and attention inefficiencies. MemFormer introduces a memory-enhanced multi-head attention mechanism that iteratively refines a dynamic memory module. A dynamic memory enrichment strategy progressively captures complex spatial and spectral dependencies, leading to more expressive feature representations.
arXiv Detail & Related papers (2025-04-17T17:43:34Z) - Towards Scalable Semantic Representation for Recommendation [65.06144407288127]
Mixture-of-Codes is proposed to construct semantic IDs based on large language models (LLMs).
Our method achieves superior discriminability and dimension robustness, leading to the best scale-up performance in recommendations.
arXiv Detail & Related papers (2024-10-12T15:10:56Z) - Maximum Manifold Capacity Representations in State Representation Learning [8.938418994111716]
Manifold-based self-supervised learning (SSL) builds on the manifold hypothesis.
DeepInfomax with an unbalanced atlas (DIM-UA) has emerged as a powerful tool.
MMCR presents a new frontier for SSL by optimizing class separability via manifold compression.
We present an innovative integration of MMCR into existing SSL methods, incorporating a discerning regularization strategy.
arXiv Detail & Related papers (2024-05-22T17:19:30Z) - Unleashing Network Potentials for Semantic Scene Completion [50.95486458217653]
This paper proposes a novel semantic scene completion (SSC) framework, the Adversarial Modality Modulation Network (AMMNet).
AMMNet introduces two core modules: a cross-modal modulation enabling the interdependence of gradient flows between modalities, and a customized adversarial training scheme leveraging dynamic gradient competition.
Extensive experimental results demonstrate that AMMNet outperforms state-of-the-art SSC methods by a large margin.
arXiv Detail & Related papers (2024-03-12T11:48:49Z) - 2D Matryoshka Sentence Embeddings [11.682642816354418]
We introduce a novel sentence embedding model called Two-dimensional Matryoshka Sentence Embedding (2DMSE). It supports elastic settings for both embedding sizes and Transformer layers, offering greater flexibility and efficiency than MRL. The experimental results demonstrate the effectiveness of our proposed model in dynamically supporting different embedding sizes and Transformer layers.
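As a rough illustration of this two-dimensional elasticity, the sketch below reads a sentence embedding out of an arbitrary Transformer layer and truncates it to an arbitrary prefix length. The HuggingFace-style encoder interface, masked mean pooling, and normalisation are assumptions for illustration, not the exact 2DMSE recipe.

```python
import torch.nn.functional as F

def elastic_embedding(encoder, input_ids, attention_mask, layer: int, dim: int):
    """Embed at a chosen Transformer depth `layer`, truncated to `dim` dims.

    Assumes a HuggingFace-style encoder that returns `hidden_states`.
    """
    out = encoder(input_ids=input_ids, attention_mask=attention_mask,
                  output_hidden_states=True)
    hidden = out.hidden_states[layer]              # (B, T, D) states at chosen depth
    mask = attention_mask.unsqueeze(-1).float()    # exclude padding from pooling
    pooled = (hidden * mask).sum(1) / mask.sum(1)  # masked mean pooling (assumed)
    return F.normalize(pooled[:, :dim], dim=-1)    # Matryoshka-style truncation
```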
arXiv Detail & Related papers (2024-02-22T18:35:05Z) - DLME: Deep Local-flatness Manifold Embedding [41.86924171938867]
Deep Local-flatness Manifold Embedding (DLME) is a novel manifold learning (ML) framework that obtains reliable manifold embeddings by reducing distortion.
Experiments on downstream classification, clustering, and visualization tasks show that DLME outperforms state-of-the-art ML and contrastive learning (CL) methods.
arXiv Detail & Related papers (2022-07-07T08:46:17Z) - Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-based Super-Resolution [48.093500219958834]
We propose an Accelerated Multi-Scale Aggregation network (AMSA) for Reference-based Super-Resolution.
The proposed AMSA achieves superior performance over state-of-the-art approaches on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2022-01-12T08:40:23Z)