Towards Efficient Deep Hashing Retrieval: Condensing Your Data via
Feature-Embedding Matching
- URL: http://arxiv.org/abs/2305.18076v1
- Date: Mon, 29 May 2023 13:23:55 GMT
- Title: Towards Efficient Deep Hashing Retrieval: Condensing Your Data via
Feature-Embedding Matching
- Authors: Tao Feng, Jie Zhang, Peizheng Wang, Zhijie Wang
- Abstract summary: The expenses involved in training state-of-the-art deep hashing retrieval models have increased.
State-of-the-art dataset distillation methods do not extend to all deep hashing retrieval methods.
We propose an efficient condensation framework that addresses these limitations by matching the feature embeddings between the synthetic set and the real set.
- Score: 7.908244841289913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The expenses involved in training state-of-the-art deep hashing
retrieval models have increased due to the adoption of more sophisticated
models and large-scale datasets. Dataset Distillation (DD), or Dataset
Condensation (DC), focuses on generating a smaller synthetic dataset that
retains the information of the original one. Nevertheless, existing DD methods
struggle to maintain a trade-off between accuracy and efficiency, and
state-of-the-art dataset distillation methods do not extend to all deep
hashing retrieval methods. In this paper, we propose an efficient condensation
framework that addresses these limitations by matching the feature embeddings
between the synthetic set and the real set. Furthermore, we enhance the
diversity of features by incorporating the strategies of early-stage augmented
models and multi-formation. Extensive experiments provide compelling evidence
of the remarkable superiority of our approach, in terms of both performance
and efficiency, compared to state-of-the-art baseline methods.
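To make the core idea concrete, here is a minimal sketch of feature-embedding matching for dataset condensation, assuming a generic PyTorch encoder that maps images to embedding vectors; the per-class mean-embedding L2 loss, the learning rate, and the single matching layer are illustrative simplifications, not the paper's exact recipe.
```python
# Minimal sketch: update synthetic images so their per-class mean feature
# embedding matches that of a real batch. `encoder` is any model mapping
# images [N, C, H, W] to embeddings [N, D] (e.g., an early-stage,
# augmented model, in the spirit of the paper's diversity strategies).
import torch

def condense_step(syn_images, syn_labels, real_images, real_labels,
                  encoder, lr=0.1):
    syn_images = syn_images.detach().requires_grad_(True)
    loss = 0.0
    for c in syn_labels.unique():
        syn_feat = encoder(syn_images[syn_labels == c]).mean(dim=0)
        with torch.no_grad():
            real_feat = encoder(real_images[real_labels == c]).mean(dim=0)
        loss = loss + (syn_feat - real_feat).pow(2).sum()
    grad, = torch.autograd.grad(loss, syn_images)
    return (syn_images - lr * grad).detach(), float(loss)
```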
Related papers
- KALAHash: Knowledge-Anchored Low-Resource Adaptation for Deep Hashing [19.667480064079083]
Existing deep hashing methods rely on abundant training data, leaving the more challenging scenario of low-resource adaptation relatively underexplored.
We introduce Class-Calibration LoRA, a novel plug-and-play approach that dynamically constructs low-rank adaptation by leveraging class-level textual knowledge embeddings.
Our proposed method, Knowledge-Anchored Low-Resource Adaptation Hashing (KALAHash), significantly boosts retrieval performance and achieves 4x data efficiency in low-resource scenarios.
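As a rough, hypothetical illustration of the plug-and-play idea, the sketch below anchors a LoRA-style adapter's down-projection on class-level text embeddings; the shapes, the zero-initialized up-projection, and `text_embeds` itself are assumptions, not KALAHash's actual construction.
```python
# Hypothetical sketch: a low-rank adapter whose down-projection is built
# from class-level text embeddings (assumed shape [num_classes, d_in]).
import torch
import torch.nn as nn

class ClassCalibratedLoRA(nn.Module):
    def __init__(self, frozen_linear, text_embeds, rank=4):
        super().__init__()
        self.base = frozen_linear.requires_grad_(False)  # frozen backbone layer
        d_in, d_out = frozen_linear.in_features, frozen_linear.out_features
        # Anchor the down-projection on class text knowledge; only the
        # small adapter matrices are trainable.
        self.down = nn.Parameter(text_embeds[:rank, :d_in].detach().clone())
        self.up = nn.Parameter(torch.zeros(d_out, rank))  # zero-init: no-op at start

    def forward(self, x):                      # x: [N, d_in]
        return self.base(x) + x @ self.down.t() @ self.up.t()
```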
arXiv Detail & Related papers (2024-12-27T03:04:54Z) - Deep learning-based shot-domain seismic deblending [1.6411821807321063]
We make use of unblended shot gathers acquired at the end of each sail line.
By manually blending these data, we obtain training data with good control of the ground truth.
We train a deep neural network using multi-channel inputs that include adjacent blended shot gathers.
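A toy version of this manual-blending step might look as follows; the array layout, dither range, and continuous-recording simplification are placeholders rather than the paper's acquisition details.
```python
# Toy "manual blending": sum clean shot gathers with random firing-time
# dithers, so the blended input has a known unblended ground truth.
import numpy as np

def blend_shots(shots, max_dither=200, seed=0):
    """shots: [n_shots, n_time, n_receivers] unblended gathers."""
    rng = np.random.default_rng(seed)
    n_shots, n_time, n_rec = shots.shape
    blended = np.zeros((n_time + max_dither, n_rec))
    for s in range(n_shots):
        t0 = rng.integers(0, max_dither + 1)  # random firing-time dither
        blended[t0:t0 + n_time] += shots[s]
    return blended  # network input; `shots` remains the ground truth
```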
arXiv Detail & Related papers (2024-09-13T07:32:31Z) - RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval [32.06421737874828]
Reconstruction Relations Embedded Hashing (RREH) is designed for semi-paired cross-modal retrieval tasks.
RREH assumes that multi-modal data share a common subspace.
Anchors are sampled from paired data, which improves the efficiency of hash learning.
arXiv Detail & Related papers (2024-05-28T03:12:54Z) - Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on a distribution-aware diffusion model.
DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z) - Importance-Aware Adaptive Dataset Distillation [53.79746115426363]
The development of deep learning models is enabled by the availability of large-scale datasets.
Dataset distillation aims to synthesize a compact dataset that retains the essential information from the large original dataset.
We propose an importance-aware adaptive dataset distillation (IADD) method that can improve distillation performance.
arXiv Detail & Related papers (2024-01-29T03:29:39Z) - Dataset Distillation via the Wasserstein Metric [35.32856617593164]
We introduce the Wasserstein distance, a metric grounded in optimal transport theory, to enhance distribution matching in dataset distillation.
Our method achieves new state-of-the-art performance across a range of high-resolution datasets.
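For intuition, the sketch below scores real-versus-synthetic feature sets with a sliced Wasserstein distance, which reduces the problem to 1D projections so SciPy's univariate wasserstein_distance applies; the paper's actual optimal-transport formulation may differ.
```python
# Sliced Wasserstein distance between two feature sets: average the 1D
# Wasserstein distance over random unit projections.
import numpy as np
from scipy.stats import wasserstein_distance

def sliced_wasserstein(real_feats, syn_feats, n_proj=64, seed=0):
    rng = np.random.default_rng(seed)
    d = real_feats.shape[1]
    total = 0.0
    for _ in range(n_proj):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)        # random unit direction
        total += wasserstein_distance(real_feats @ theta, syn_feats @ theta)
    return total / n_proj
```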
arXiv Detail & Related papers (2023-11-30T13:15:28Z) - Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with far fewer computational resources.
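Distribution matching is commonly instantiated as a maximum mean discrepancy (MMD) between embeddings of real and synthetic batches; the minimal Gaussian-kernel version below is a baseline sketch, and this paper's specific refinements to it are not shown.
```python
# Minimal Gaussian-kernel MMD between real and synthetic embeddings;
# minimizing it w.r.t. the synthetic images matches the two distributions.
import torch

def mmd_loss(real, syn, sigma=1.0):
    """real, syn: [N, D] feature embeddings."""
    def gram(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return gram(real, real).mean() + gram(syn, syn).mean() \
        - 2 * gram(real, syn).mean()
```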
arXiv Detail & Related papers (2023-07-19T04:07:33Z) - Learning Better with Less: Effective Augmentation for Sample-Efficient
Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z) - Dataset Distillation: A Comprehensive Review [76.26276286545284]
Dataset distillation (DD) aims to derive a much smaller dataset of synthetic samples, such that models trained on it yield performance comparable to models trained on the original dataset.
This paper gives a comprehensive review and summary of recent advances in DD and its application.
arXiv Detail & Related papers (2023-01-17T17:03:28Z) - Accelerating Dataset Distillation via Model Augmentation [41.3027484667024]
We propose two model augmentation techniques, i.e. using early-stage models and parameter perturbation, to learn an informative synthetic set with significantly reduced training cost.
Our method achieves up to 20x speedup with performance on par with state-of-the-art methods.
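A hedged sketch of the parameter-perturbation half of the recipe: copy an early-stage model and add small weight noise so matching sees a more diverse pool of feature extractors; the noise form and scale are assumptions.
```python
# Diversify an early-stage model by perturbing its parameters with small
# Gaussian noise scaled to each tensor's magnitude (scale is an assumption).
import copy
import torch

def perturb_parameters(model, scale=0.01):
    noisy = copy.deepcopy(model)
    with torch.no_grad():
        for p in noisy.parameters():
            p.add_(scale * (p.abs().mean() + 1e-8) * torch.randn_like(p))
    return noisy
```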
arXiv Detail & Related papers (2022-12-12T07:36:05Z) - Segmentation-guided Domain Adaptation for Efficient Depth Completion [3.441021278275805]
We propose an efficient depth completion model based on a vgg05-like CNN architecture and a semi-supervised domain adaptation approach.
In order to boost spatial coherence, we guide the learning process using segmentations as an additional source of information.
Our approach improves on previous efficient, low-parameter state-of-the-art approaches while having a noticeably lower computational footprint.
arXiv Detail & Related papers (2022-10-14T13:01:25Z) - DC-BENCH: Dataset Condensation Benchmark [79.18718490863908]
This work provides the first large-scale standardized benchmark on dataset condensation.
It consists of a suite of evaluations to comprehensively reflect the generalizability and effectiveness of condensation methods.
The benchmark library is open-sourced to facilitate future research and application.
arXiv Detail & Related papers (2022-07-20T03:54:05Z) - CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
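The feature-alignment idea can be pictured as matching mean activations of real and synthetic data at several network depths; the plain L2 objective and the choice of layers below are illustrative, not CAFE's exact losses.
```python
# Align real and synthetic features across scales: sum an L2 penalty on
# mean activations at each hooked layer.
import torch

def multi_scale_alignment(real_feats, syn_feats):
    """real_feats, syn_feats: lists of [N, D_l] activations, one per layer."""
    loss = 0.0
    for r, s in zip(real_feats, syn_feats):
        loss = loss + (r.mean(dim=0) - s.mean(dim=0)).pow(2).sum()
    return loss
```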
arXiv Detail & Related papers (2022-03-03T05:58:49Z) - DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference [86.03382625531951]
DANCE is an automated simultaneous data-network co-optimization for efficient segmentation model training and inference.
It integrates automated data slimming, which adaptively downsamples or drops input images and controls their contribution to the training loss, guided by the images' spatial complexity.
Experiments and ablation studies demonstrate that DANCE can achieve "all-win" towards efficient segmentation.
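A toy rendering of the data-slimming idea: score each image's spatial complexity (here, a mean gradient-magnitude proxy), downsample the simple ones, and reuse the score to weight per-sample loss; the threshold and the proxy are made up for illustration.
```python
# Toy data slimming: keep complex images at full resolution, halve the
# simple ones, and return the complexity scores for loss weighting.
import torch
import torch.nn.functional as F

def slim_batch(images, threshold=0.05):
    """images: [N, C, H, W] in [0, 1]."""
    grad = (images[..., :, 1:] - images[..., :, :-1]).abs().mean(dim=(1, 2, 3))
    keep_full = grad > threshold
    half = F.interpolate(images[~keep_full], scale_factor=0.5,
                         mode="bilinear", align_corners=False)
    return images[keep_full], half, grad  # grad can weight per-sample loss
```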
arXiv Detail & Related papers (2021-07-16T04:58:58Z) - Making Online Sketching Hashing Even Faster [63.16042585506435]
We present a FasteR Online Sketching Hashing (FROSH) algorithm to sketch the data in a more compact form via an independent transformation.
We provide theoretical justification to guarantee that our proposed FROSH consumes less time and achieves comparable sketching precision.
We also extend FROSH to its distributed implementation, namely DFROSH, to further reduce the training time cost of FROSH.
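For context, a classic "independent transformation" for compact sketching is the subsampled randomized Hadamard transform; the generic construction below illustrates the flavor, not FROSH's specific algorithm or its distributed variant.
```python
# Subsampled randomized Hadamard transform: random signs (D), Hadamard
# mixing (H), then row subsampling (S) yields a compact sketch of X.
import numpy as np
from scipy.linalg import hadamard

def srht_sketch(X, sketch_size, seed=0):
    """X: [n, d] with n a power of two; returns [sketch_size, d]."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    HX = hadamard(n) @ (signs[:, None] * X)
    rows = rng.choice(n, size=sketch_size, replace=False)
    return HX[rows] / np.sqrt(sketch_size)
```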
arXiv Detail & Related papers (2020-10-10T08:50:53Z)