Related papers: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

URL: http://arxiv.org/abs/2503.01776v3
Date: Sat, 19 Apr 2025 07:14:13 GMT
Title: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
Authors: Tiansheng Wen, Yifei Wang, Zequn Zeng, Zhong Peng, Yudi Su, Xinyang Liu, Bo Chen, Hongwei Liu, Stefanie Jegelka, Chenyu You,
Abstract summary: Matryoshka Representation Learning (MRL) recently emerged as a solution for adaptive embedding lengths.<n>We show that sparse coding offers a compelling alternative for achieving adaptive representation with minimal overhead and higher fidelity.
Score: 42.590255022001145
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Many large-scale systems rely on high-quality deep representations (embeddings) to facilitate tasks like retrieval, search, and generative modeling. Matryoshka Representation Learning (MRL) recently emerged as a solution for adaptive embedding lengths, but it requires full model retraining and suffers from noticeable performance degradations at short lengths. In this paper, we show that sparse coding offers a compelling alternative for achieving adaptive representation with minimal overhead and higher fidelity. We propose Contrastive Sparse Representation (CSR), a method that sparsifies pre-trained embeddings into a high-dimensional but selectively activated feature space. By leveraging lightweight autoencoding and task-aware contrastive objectives, CSR preserves semantic quality while allowing flexible, cost-effective inference at different sparsity levels. Extensive experiments on image, text, and multimodal benchmarks demonstrate that CSR consistently outperforms MRL in terms of both accuracy and retrieval speed-often by large margins-while also cutting training time to a fraction of that required by MRL. Our results establish sparse coding as a powerful paradigm for adaptive representation learning in real-world applications where efficiency and fidelity are both paramount. Code is available at https://github.com/neilwen987/CSR_Adaptive_Rep

Related papers

Prompt-Tuned LLM-Augmented DRL for Dynamic O-RAN Network Slicing [5.62872273155603]
Large Language Models (LLMs) structure unorganized network feedback into meaningful latent representations.<n>In O-RAN slicing, concepts like SNR, power levels and throughput are semantically related.<n>We introduce a contextualization-based adaptation method that integrates learnable prompts into an LLM-augmented DRL framework.
arXiv Detail & Related papers (2025-05-31T14:12:56Z)
Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps [16.84310001807895]
This paper introduces a model-agnostic approach that can be applied to A-RAG methods.<n>Specifically, we use cache access and parallel generation to speed up the prefilling and decoding stages respectively.
arXiv Detail & Related papers (2025-05-19T05:39:38Z)
MMRL++: Parameter-Efficient and Interaction-Aware Representation Learning for Vision-Language Models [4.828668077793944]
Multi-Modal Representation Learning generates space tokens projected into both text and image encoders as representation tokens.<n>MML++ is a parameter-efficient and interaction-aware extension that significantly reduces trainable parameters.<n> experiments on 15 datasets demonstrate that MMRL and MMRL++ consistently outperform state-of-the-art methods.
arXiv Detail & Related papers (2025-05-15T08:43:53Z)
Process Supervision-Guided Policy Optimization for Code Generation [15.943210767010045]
Reinforcement learning (RL) with unit test feedback has enhanced large language models' (LLMs) code generation, but relies on sparse rewards provided only after complete code evaluation. We propose a Process Reward Model (PRM) that delivers dense, line-level feedback on code correctness during generation, mimicking human code refinement.
arXiv Detail & Related papers (2024-10-23T07:22:33Z)
In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality.<n>To handle these challenges, a direct solution is to generate high-confidence'' data from unsupervised downstream tasks.<n>We propose a novel approach, pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and. Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting. LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
Efficient and Versatile Robust Fine-Tuning of Zero-shot Models [34.27380518351181]
We introduce Robust Adapter (R-Adapter), a novel method for fine-tuning zero-shot models to downstream tasks. Our method integrates lightweight modules into the pre-trained model and employs novel self-ensemble techniques to boost OOD robustness and reduce storage expenses substantially. Our experiments demonstrate that R-Adapter achieves state-of-the-art performance across a diverse set of tasks, tuning only 13% of the parameters of the CLIP encoders.
arXiv Detail & Related papers (2024-08-11T11:37:43Z)
REP: Resource-Efficient Prompting for Rehearsal-Free Continual Learning [23.92661395403251]
Recent rehearsal-free methods, guided by prompts, excel in vision-related continual learning (CL) with drifting data but lack resource efficiency.<n>We introduce Resource-Efficient Prompting (REP), which improves the computational and memory efficiency of prompt-based rehearsal-free methods.<n>Our approach employs swift prompt selection to refine input data using a carefully provisioned model.
arXiv Detail & Related papers (2024-06-07T09:17:33Z)
Accelerating Inference of Retrieval-Augmented Generation via Sparse Context Selection [28.15184715270483]
Large language models (LLMs) augmented with retrieval exhibit robust performance and extensive versatility. We propose a novel paradigm named Sparse RAG, which seeks to cut costs through sparsity. Sparse RAG encodes retrieved documents in parallel, which eliminates latency introduced by long-range attention of retrieved documents.
arXiv Detail & Related papers (2024-05-25T11:10:04Z)
Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control [66.78146440275093]
Learned retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors. We explore the application of LSR to the multi-modal domain, with a focus on text-image retrieval. Current approaches like LexLIP and STAIR require complex multi-step training on massive datasets. Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
arXiv Detail & Related papers (2024-02-27T14:21:56Z)
Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach [58.57026686186709]
We introduce the Convolutional Transformer layer (ConvFormer) and propose a ConvFormer-based Super-Resolution network (CFSR) CFSR inherits the advantages of both convolution-based and transformer-based approaches. Experiments demonstrate that CFSR strikes an optimal balance between computational cost and performance.
arXiv Detail & Related papers (2024-01-11T03:08:00Z)
ViR: Towards Efficient Vision Retention Backbones [97.93707844681893]
We propose a new class of computer vision models, dubbed Vision Retention Networks (ViR) ViR has dual parallel and recurrent formulations, which strike an optimal balance between fast inference and parallel training with competitive performance. We have validated the effectiveness of ViR through extensive experiments with different dataset sizes and various image resolutions.
arXiv Detail & Related papers (2023-10-30T16:55:50Z)
Low-Resolution Self-Attention for Semantic Segmentation [93.30597515880079]
We introduce the Low-Resolution Self-Attention (LRSA) mechanism to capture global context at a significantly reduced computational cost.<n>Our approach involves computing self-attention in a fixed low-resolution space regardless of the input image's resolution.<n>We demonstrate the effectiveness of our LRSA approach by building the LRFormer, a vision transformer with an encoder-decoder structure.
arXiv Detail & Related papers (2023-10-08T06:10:09Z)
Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning [3.308743964406687]
$k$-Step Latent (KSL) is a representation learning method that enforces temporal consistency of representations. KSL produces encoders that generalize better to new tasks unseen during training.
arXiv Detail & Related papers (2021-10-11T00:16:43Z)
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL [84.14947307790361]
We propose an efficient algorithm, called ReLEX, for representation learning in both online and offline reinforcement learning. We show that the online version of ReLEX, called Re-UCB, always performs no worse than the state-of-the-art algorithm without representation selection. For the offline counterpart, ReLEX-LCB, we show that the algorithm can find the optimal policy if the representation class can cover the state-action space.
arXiv Detail & Related papers (2021-06-22T17:16:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.