Online Enhanced Semantic Hashing: Towards Effective and Efficient
Retrieval for Streaming Multi-Modal Data
- URL: http://arxiv.org/abs/2109.04260v1
- Date: Thu, 9 Sep 2021 13:30:31 GMT
- Authors: Xiao-Ming Wu, Xin Luo, Yu-Wei Zhan, Chen-Lu Ding, Zhen-Duo Chen,
Xin-Shun Xu
- Abstract summary: We propose a new model, termed Online enhAnced SemantIc haShing (OASIS).
We design a novel semantic-enhanced data representation that helps handle newly arriving classes.
Our method outperforms state-of-the-art models.
- Score: 21.157717777481572
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the vigorous development of multimedia devices and applications,
efficient retrieval of large-scale multi-modal data has become a prominent
research topic. Among these techniques, hashing has become a prevalent choice
due to its retrieval efficiency and low storage cost. Although multi-modal
hashing has drawn much attention in recent years, some problems remain. First,
existing methods are mainly designed in batch mode and cannot efficiently
handle streaming multi-modal data. Second, existing online multi-modal hashing
methods fail to effectively handle unseen classes that arrive continuously with
streaming data chunks. In this paper, we propose a new model, termed Online
enhAnced SemantIc haShing (OASIS). We design a novel semantic-enhanced data
representation, which helps handle newly arriving classes, and thereby
construct an enhanced semantic objective function. We further propose an
efficient and effective discrete online optimization algorithm for OASIS.
Extensive experiments show that our method outperforms state-of-the-art models.
For reproducibility and to benefit the community, our code and data are
available in the supplementary material and will be made publicly available.
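To illustrate why hashing yields the retrieval efficiency and low storage cost the abstract mentions, here is a minimal sketch of ranking database items by Hamming distance over binary hash codes. This is illustrative only, not the OASIS algorithm; the function name and toy codes are made up for the example.

```python
import numpy as np

def hamming_rank(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Rank database items by Hamming distance to a query hash code.

    query_code: (n_bits,) array of {0, 1}
    db_codes:   (n_items, n_bits) array of {0, 1}
    Returns database indices, nearest first.
    """
    # XOR marks the differing bits; summing them gives the Hamming distance.
    dists = np.bitwise_xor(db_codes, query_code).sum(axis=1)
    return np.argsort(dists, kind="stable")

# Toy example: 4 database items with 8-bit codes.
db = np.array([
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1, 1, 1, 1],
])
q = db[0].copy()            # query identical to item 0
order = hamming_rank(q, db)
print(order)                # [0 3 2 1]
```

Because the codes are short binary vectors, distance computation reduces to XOR-and-popcount, which is why hash-based retrieval scales to large collections with small memory footprints.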
Related papers
- High-Dimensional Distributed Sparse Classification with Scalable Communication-Efficient Global Updates [50.406127962933915]
We develop methods that enable learning a communication-efficient distributed logistic regression model.
Our experiments demonstrate a large improvement in accuracy over distributed algorithms, with only a few distributed update steps needed.
arXiv Detail & Related papers (2024-07-08T19:34:39Z)
- CLIP Multi-modal Hashing: A new baseline CLIPMH [4.057431980018267]
We propose a new baseline, CLIP Multi-modal Hashing (CLIPMH).
It uses the CLIP model to extract text and image features, which are then fused to generate hash codes.
In comparison to state-of-the-art unsupervised and supervised multi-modal hashing methods, experiments reveal that the proposed CLIPMH can significantly enhance performance.
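As a rough sketch of the pipeline this entry describes (extract per-modality features, fuse, binarize), the snippet below averages two pre-extracted feature vectors and applies a sign function. The projection `proj` and helper `fuse_and_hash` are hypothetical stand-ins, not CLIPMH's learned hash layer.

```python
import numpy as np

def fuse_and_hash(img_feat: np.ndarray, txt_feat: np.ndarray,
                  proj: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: fuse two modality features, then binarize.

    img_feat, txt_feat: (d,) unit-normalized feature vectors
    proj: (d, n_bits) random projection standing in for a learned hash layer
    """
    fused = (img_feat + txt_feat) / 2.0        # simple average fusion
    code = np.sign(fused @ proj)               # continuous -> {-1, +1}
    code[code == 0] = 1                        # break exact-zero ties
    return ((code + 1) // 2).astype(np.int8)   # map to {0, 1}

rng = np.random.default_rng(42)
d, n_bits = 512, 64
img = rng.standard_normal(d); img /= np.linalg.norm(img)
txt = rng.standard_normal(d); txt /= np.linalg.norm(txt)
P = rng.standard_normal((d, n_bits))
code = fuse_and_hash(img, txt, P)
print(code.shape)                              # (64,)
```

In a learned system the projection would be trained so that semantically similar image-text pairs map to nearby binary codes; a random projection is used here only to keep the sketch self-contained.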
arXiv Detail & Related papers (2023-08-22T21:29:55Z)
- Improved Distribution Matching for Dataset Condensation [91.55972945798531]
We propose a novel dataset condensation method based on distribution matching.
Our simple yet effective method outperforms most previous optimization-oriented methods with much fewer computational resources.
arXiv Detail & Related papers (2023-07-19T04:07:33Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Asymmetric Scalable Cross-modal Hashing [51.309905690367835]
Cross-modal hashing is a successful approach to the large-scale multimedia retrieval problem.
We propose a novel Asymmetric Scalable Cross-Modal Hashing (ASCMH) to address these issues.
Our ASCMH outperforms the state-of-the-art cross-modal hashing methods in terms of accuracy and efficiency.
arXiv Detail & Related papers (2022-07-26T04:38:47Z)
- Dynamic Multimodal Fusion [8.530680502975095]
Dynamic multimodal fusion (DynMM) is a new approach that adaptively fuses multimodal data and generates data-dependent forward paths during inference.
Results on various multimodal tasks demonstrate the efficiency and wide applicability of our approach.
arXiv Detail & Related papers (2022-03-31T21:35:13Z)
- Fast Class-wise Updating for Online Hashing [196.14748396106955]
This paper presents a novel supervised online hashing scheme, termed Fast Class-wise Updating for Online Hashing (FCOH).
A class-wise updating method is developed to decompose the binary code learning and alternately renew the hash functions in a class-wise fashion, which eases the burden of large numbers of training batches.
To further achieve online efficiency, we propose a semi-relaxation optimization, which accelerates the online training by treating different binary constraints independently.
arXiv Detail & Related papers (2020-12-01T07:41:54Z)
- Creating Something from Nothing: Unsupervised Knowledge Distillation for Cross-Modal Hashing [132.22315429623575]
Cross-modal hashing (CMH) can map contents from different modalities, especially in vision and language, into the same space.
There are two main frameworks for CMH, differing from each other in whether semantic supervision is required.
In this paper, we propose a novel approach that enables guiding a supervised method using outputs produced by an unsupervised method.
arXiv Detail & Related papers (2020-04-01T08:32:15Z)
- A Novel Incremental Cross-Modal Hashing Approach [21.99741793652628]
We propose a novel incremental cross-modal hashing algorithm termed "iCMH".
The proposed approach consists of two sequential stages, namely, learning the hash codes and training the hash functions.
Experiments across a variety of cross-modal datasets and comparisons with state-of-the-art cross-modal algorithms show the usefulness of our approach.
arXiv Detail & Related papers (2020-02-03T12:34:56Z)
- Deep Multi-View Enhancement Hashing for Image Retrieval [40.974719473643724]
This paper proposes a supervised multi-view hash model which can enhance the multi-view information through neural networks.
The proposed method is systematically evaluated on the CIFAR-10, NUS-WIDE and MS-COCO datasets.
arXiv Detail & Related papers (2020-02-01T08:32:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.