Embedding Compression for Efficient Re-Identification
- URL: http://arxiv.org/abs/2405.14730v1
- Date: Thu, 23 May 2024 15:57:11 GMT
- Title: Embedding Compression for Efficient Re-Identification
- Authors: Luke McDermott
- Abstract summary: ReID algorithms aim to map new observations of an object to previously recorded instances.
We benchmark quantization-aware training along with three different dimension reduction methods.
We find that ReID embeddings can be compressed by up to 96x with minimal drop in performance.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world re-identification (ReID) algorithms aim to map new observations of an object to previously recorded instances. These systems are often constrained by the quantity and size of the stored embeddings. To combat this scaling problem, we attempt to shrink the size of these vectors by using a variety of compression techniques. In this paper, we benchmark quantization-aware training along with three different dimension reduction methods: iterative structured pruning, slicing the embeddings at initialization, and using low-rank embeddings. We find that ReID embeddings can be compressed by up to 96x with minimal drop in performance. This implies that modern re-identification paradigms do not fully leverage the high-dimensional latent space, opening up further research to increase the capabilities of these systems.
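To make the two simplest dimension-reduction methods concrete, here is a minimal PyTorch sketch of slicing an embedding at initialization and of a learned low-rank projection head. The class names, backbone, and dimensions are illustrative assumptions, not the paper's code; quantization-aware training and iterative structured pruning are omitted.

```python
# Illustrative sketch of two of the benchmarked reduction methods: slicing the
# embedding at initialization, and a learned low-rank projection head.
import torch
import torch.nn as nn

class SlicedEmbedding(nn.Module):
    """Keep only the first `keep_dims` components of the backbone embedding."""
    def __init__(self, backbone: nn.Module, keep_dims: int):
        super().__init__()
        self.backbone = backbone
        self.keep_dims = keep_dims

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.backbone(x)           # (batch, full_dim)
        return z[:, :self.keep_dims]   # (batch, keep_dims)

class LowRankEmbedding(nn.Module):
    """Project the backbone embedding through a learned rank-r bottleneck."""
    def __init__(self, backbone: nn.Module, full_dim: int, rank: int):
        super().__init__()
        self.backbone = backbone
        self.proj = nn.Linear(full_dim, rank, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(self.backbone(x))  # (batch, rank)
```

Wrapping a 2048-dimensional backbone with `keep_dims=32`, for example, would give a 64x dimension reduction before any quantization of the stored vectors is applied.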
Related papers
- High-Fidelity and Generalizable Neural Surface Reconstruction with Sparse Feature Volumes [50.83282258807327]
Generalizable neural surface reconstruction has become a compelling technique for reconstructing from few images without per-scene optimization. We present a sparse representation method that maximizes memory efficiency and enables significantly higher-resolution reconstructions on standard hardware.
arXiv Detail & Related papers (2025-07-08T12:50:39Z)
- Redundancy, Isotropy, and Intrinsic Dimensionality of Prompt-based Text Embeddings [9.879314903531286]
Prompt-based text embedding models generate task-specific embeddings upon receiving tailored prompts. Our experiments show that even a naive dimensionality reduction, which keeps only the first 25% of the dimensions of the embeddings, results in very slight performance degradation. For classification and clustering, even when embeddings are reduced to less than 0.5% of the original dimensionality, the performance degradation is very small.
arXiv Detail & Related papers (2025-06-02T08:50:38Z)
- Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding [56.066799081747845]
The ever-growing size of neural networks poses serious challenges on resource-constrained devices. We propose a novel post-training compression framework that combines rate-aware quantization with entropy coding. Our method allows for very fast decoding and is compatible with arbitrary quantization grids.
arXiv Detail & Related papers (2025-05-24T15:52:49Z)
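As a toy illustration of the general recipe (not this paper's actual framework), the sketch below quantizes a weight tensor on a uniform grid and estimates storage cost from the empirical entropy of the quantized symbols, which is the rate an ideal entropy coder would approach. All names and values are illustrative.

```python
# Toy sketch: uniform quantization plus an entropy-based rate estimate.
import numpy as np

def quantize_and_rate(weights: np.ndarray, step: float):
    symbols = np.round(weights / step).astype(np.int64)  # uniform grid
    dequantized = symbols * step                         # reconstruction
    _, counts = np.unique(symbols, return_counts=True)
    probs = counts / counts.sum()
    bits_per_weight = -(probs * np.log2(probs)).sum()    # entropy lower bound
    distortion = float(np.mean((weights - dequantized) ** 2))
    return dequantized, bits_per_weight, distortion

w = np.random.randn(10_000).astype(np.float32)
for step in (0.5, 0.1, 0.02):  # coarser grid -> fewer bits, more distortion
    _, rate, mse = quantize_and_rate(w, step)
    print(f"step={step}: ~{rate:.2f} bits/weight, MSE={mse:.5f}")
```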
- A Universal Framework for Compressing Embeddings in CTR Prediction [68.27582084015044]
We introduce a Model-agnostic Embedding Compression (MEC) framework that compresses embedding tables by quantizing pre-trained embeddings.
Our approach consists of two stages: first, we apply popularity-weighted regularization to balance code distribution between high- and low-frequency features.
Experiments on three datasets reveal that our method reduces memory usage by over 50x while maintaining or improving recommendation performance.
arXiv Detail & Related papers (2025-02-21T10:12:34Z)
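In a similar spirit, a pretrained embedding table can be quantized against a small shared codebook. The sketch below uses k-means as a stand-in compressor; the popularity-weighted regularization stage and other MEC-specific details are omitted, and all sizes are illustrative assumptions.

```python
# Hedged sketch: quantize a pretrained embedding table with a 256-entry
# codebook, storing one byte of codes per row plus the codebook itself.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
table = rng.normal(size=(100_000, 64)).astype(np.float32)  # pretrained embeddings

kmeans = MiniBatchKMeans(n_clusters=256, random_state=0).fit(table)
codes = kmeans.predict(table).astype(np.uint8)         # 1 byte per row
codebook = kmeans.cluster_centers_.astype(np.float32)  # 256 x 64 floats

original_bytes = table.nbytes
compressed_bytes = codes.nbytes + codebook.nbytes
print(f"compression: {original_bytes / compressed_bytes:.0f}x")

reconstructed = codebook[codes]  # table lookup at serving time
```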
- Sparser Training for On-Device Recommendation Systems [50.74019319100728]
We propose SparseRec, a lightweight embedding method based on Dynamic Sparse Training (DST).
It avoids dense gradients during backpropagation by sampling a subset of important vectors.
arXiv Detail & Related papers (2024-11-19T03:48:48Z)
- Scalable Dynamic Embedding Size Search for Streaming Recommendation [54.28404337601801]
Real-world recommender systems often operate in streaming recommendation scenarios.
The number of users and items continues to grow, leading to substantial storage resource consumption.
We learn lightweight embeddings for streaming recommendation with SCALL, a method that can adaptively adjust the embedding sizes of users and items.
arXiv Detail & Related papers (2024-07-22T06:37:24Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations applied to the coordinates of the output depth, warping the depth map back to the original reference frame.
arXiv Detail & Related papers (2023-10-15T05:15:45Z)
- Error Feedback Can Accurately Compress Preconditioners [43.60787513716217]
Leveraging second-order information about the loss at the scale of deep networks is one of the main lines of approach for improving the performance of current optimizers for deep learning.
Yet, existing approaches for accurate full-matrix preconditioning, such as Full-Matrix Adagrad (GGT) or Matrix-Free Approximate Curvature (M-FAC), suffer from massive storage costs when applied even to small-scale models.
In this paper, we address this issue via a novel and efficient error-feedback technique that can be applied to compress preconditioners by up to two orders of magnitude in practice, without loss of convergence.
arXiv Detail & Related papers (2023-06-09T17:58:47Z)
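The sketch below shows generic error feedback with top-k sparsification as a stand-in compressor (not the paper's GGT/M-FAC setup): the part of each update that compression drops is accumulated and re-injected into the next step, which is why convergence can survive aggressive compression.

```python
# Minimal error-feedback loop: compress aggressively, carry the residual forward.
import numpy as np

def topk_compress(x: np.ndarray, k: int) -> np.ndarray:
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]  # keep the k largest magnitudes
    out[idx] = x[idx]
    return out

rng = np.random.default_rng(0)
error = np.zeros(1000)
for step in range(100):
    update = rng.normal(size=1000)              # stand-in for a preconditioner update
    corrected = update + error                  # add back what was previously dropped
    compressed = topk_compress(corrected, k=10) # ~100x fewer entries stored
    error = corrected - compressed              # residual feeds into the next step
    # `compressed` is what would be stored/applied in place of `update`
```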
- Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z)
- DimenFix: A novel meta-dimensionality reduction method for feature preservation [64.0476282000118]
We propose a novel meta-method, DimenFix, which can be applied on top of any base dimensionality reduction method that involves a gradient-descent-like process.
By allowing users to define the importance of different features, which is then taken into account during dimensionality reduction, DimenFix creates new possibilities to visualize and understand a given dataset.
arXiv Detail & Related papers (2022-11-30T05:35:22Z)
- Meta-Learning Sparse Compression Networks [44.30642520752235]
Recent work in Deep Learning has re-imagined the representation of data as functions mapping from a coordinate space to an underlying continuous signal.
Recent work on such Implicit Neural Representations (INRs) has shown that, following careful architecture search, INRs can outperform established compression methods.
arXiv Detail & Related papers (2022-05-18T14:31:43Z)
- On Geodesic Distances and Contextual Embedding Compression for Text Classification [0.0]
In some memory-constrained settings, it can be advantageous to have smaller contextual embeddings.
We investigate the efficacy of projecting contextual embedding data onto a manifold, and using nonlinear dimensionality reduction techniques to compress these embeddings.
In particular, we propose a novel post-processing approach, applying a combination of Isomap and PCA.
arXiv Detail & Related papers (2021-04-22T19:30:06Z)
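A minimal sketch of such a post-processing pipeline with scikit-learn, assuming Isomap runs before PCA and using random stand-in embeddings; the ordering, dimensions, and hyperparameters are illustrative assumptions rather than the paper's settings.

```python
# Hedged sketch: nonlinear reduction with Isomap followed by PCA, applied to
# stand-in "contextual embeddings" (random data here).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(500, 768)).astype(np.float32)  # e.g. BERT-sized

compressor = make_pipeline(
    Isomap(n_neighbors=10, n_components=64),  # manifold projection
    PCA(n_components=32),                     # linear refinement
)
compressed = compressor.fit_transform(embeddings)
print(compressed.shape)  # (500, 32) -> 24x fewer dimensions than 768
```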
- The Effectiveness of Memory Replay in Large Scale Continual Learning [42.67483945072039]
We study continual learning in the large scale setting where tasks in the input sequence are not limited to classification, and the outputs can be of high dimension.
Existing methods usually replay only the input-output pairs.
We propose to replay the activation of the intermediate layers in addition to the input-output pairs.
arXiv Detail & Related papers (2020-10-06T01:23:12Z)
- A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition [9.414818018857316]
We propose a method to effectively compress Recurrent Neural Networks (RNNs) used for Human Action Recognition (HAR).
We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset.
We combine our pruning method with a specific group-lasso regularization technique that significantly improves compression.
It is shown that our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy for the task of action recognition on UCF11.
arXiv Detail & Related papers (2020-10-03T12:41:51Z)
- OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression [77.8842824702423]
We present a novel deep compression algorithm to reduce the memory footprint of LiDAR point clouds.
Our method exploits the sparsity and structural redundancy between points to reduce the memory footprint.
Our algorithm can be used to reduce the onboard and offboard storage of LiDAR points for applications such as self-driving cars.
arXiv Detail & Related papers (2020-05-14T17:48:49Z)