Related papers: Cross Modal Retrieval with Querybank Normalisation

URL: http://arxiv.org/abs/2112.12777v1
Date: Thu, 23 Dec 2021 18:51:58 GMT
Title: Cross Modal Retrieval with Querybank Normalisation
Authors: Simion-Vlad Bogolin, Ioana Croitoru, Hailin Jin, Yang Liu, Samuel Albanie
Abstract summary: We show that state-of-the-art joint embeddings suffer from the longstanding hubness problem. We formulate a simple but effective framework that re-normalises query similarities to account for hubs in the embedding space. We show that QB-Norm works effectively without concurrent access to any test set queries.
Score: 41.877255953069074
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Profiting from large-scale training datasets, advances in neural architecture design and efficient inference, joint embeddings have become the dominant approach for tackling cross-modal retrieval. In this work we first show that, despite their effectiveness, state-of-the-art joint embeddings suffer significantly from the longstanding hubness problem in which a small number of gallery embeddings form the nearest neighbours of many queries. Drawing inspiration from the NLP literature, we formulate a simple but effective framework called Querybank Normalisation (QB-Norm) that re-normalises query similarities to account for hubs in the embedding space. QB-Norm improves retrieval performance without requiring retraining. Differently from prior work, we show that QB-Norm works effectively without concurrent access to any test set queries. Within the QB-Norm framework, we also propose a novel similarity normalisation method, the Dynamic Inverted Softmax, that is significantly more robust than existing approaches. We showcase QB-Norm across a range of cross modal retrieval models and benchmarks where it consistently enhances strong baselines beyond the state of the art. Code is available at https://vladbogo.github.io/QB-Norm/.
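
The re-normalisation idea in the abstract can be illustrated with a small sketch. This is not the authors' code (see the linked repository for that); it is a minimal pure-Python illustration of an inverted-softmax style rescaling against a querybank, with a gating step in the spirit of the Dynamic Inverted Softmax. The function names, the temperature `beta=20.0`, the top-1 gating rule, and the toy similarity values are all assumptions made for illustration.

```python
import math


def inverted_softmax(query_sims, bank_sims, beta=20.0):
    """Rescale one query's similarities using a querybank (illustrative).

    query_sims: similarities between the query and each gallery item.
    bank_sims:  one row per querybank query, each row holding that
                query's similarity to every gallery item.
    beta:       assumed temperature; not a value taken from the paper.
    """
    rescaled = []
    for j, sim in enumerate(query_sims):
        # A hub gallery item is similar to many bank queries, so its
        # denominator is large and its rescaled score is pushed down.
        denom = sum(math.exp(beta * row[j]) for row in bank_sims)
        rescaled.append(math.exp(beta * sim) / denom)
    return rescaled


def dynamic_inverted_softmax(query_sims, bank_sims, beta=20.0):
    """Apply the rescaling only when the query's top match looks like a hub.

    Assumption: a gallery item counts as a potential hub if it is the
    top-1 match of at least one querybank query.
    """
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)

    activation_set = {argmax(row) for row in bank_sims}
    if argmax(query_sims) in activation_set:
        return inverted_softmax(query_sims, bank_sims, beta)
    # Otherwise leave the raw similarities untouched.
    return list(query_sims)


# Toy data: gallery item 0 is a hub (close to every querybank query).
bank = [
    [0.90, 0.20, 0.10],
    [0.80, 0.30, 0.20],
    [0.85, 0.10, 0.40],
]
hub_affected_query = [0.60, 0.55, 0.10]   # raw top match is the hub
unaffected_query = [0.10, 0.20, 0.90]     # raw top match is not a hub

rescaled = dynamic_inverted_softmax(hub_affected_query, bank)
untouched = dynamic_inverted_softmax(unaffected_query, bank)
```

In this toy setup the first query's ranking changes because its raw top match is the hub, while the second query is returned unchanged. No retraining is involved: only the similarity scores are post-processed.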

Related papers

Hubness Reduction with Dual Bank Sinkhorn Normalization for Cross-Modal Retrieval [12.329352187335312]
Hubness is a phenomenon where a small number of targets frequently appear as nearest neighbors to numerous queries. Despite several proposed methods to reduce hubness, their underlying mechanisms remain poorly understood. We propose a probability-balancing framework for more effective hubness reduction.
arXiv Detail & Related papers (2025-08-04T15:45:48Z)
Tree-Based Text Retrieval via Hierarchical Clustering in RAGFrameworks: Application on Taiwanese Regulations [0.0]
We propose a hierarchical clustering-based retrieval method that eliminates the need to predefine k. Our approach maintains the accuracy and relevance of system responses while adaptively selecting semantically relevant content. Our framework is simple to implement and easily integrates with existing RAG pipelines, making it a practical solution for real-world applications under limited resources.
arXiv Detail & Related papers (2025-06-16T15:34:29Z)
Reinforced Model Merging [53.84354455400038]
We present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and agent tailored for merging tasks. By utilizing data subsets during the evaluation process, we addressed the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times.
arXiv Detail & Related papers (2025-03-27T08:52:41Z)
NeighborRetr: Balancing Hub Centrality in Cross-Modal Retrieval [15.409022911063241]
NeighborRetr is a novel method that balances the learning of hubs and adaptively adjusts the relations of various kinds of neighbors. We show that NeighborRetr achieves state-of-the-art results on multiple cross-modal retrieval benchmarks.
arXiv Detail & Related papers (2025-03-13T16:33:55Z)
Nearest Neighbor Normalization Improves Multimodal Retrieval [30.076028359751614]
We present a method for correcting errors in trained contrastive image-text retrieval models with no additional training, called Nearest Neighbor Normalization (NNN). NNN requires a reference database, but does not require any training on this database, and can even increase the retrieval accuracy of a model after finetuning.
arXiv Detail & Related papers (2024-10-31T16:44:10Z)
RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search [11.069814476661827]
Cross-modal ANNS aims to use the data vector from one modality to retrieve the most similar items from another. State-of-the-art ANNS approaches suffer poor performance for OOD workloads. We propose pRojected bipartite Graph (RoarGraph), an efficient ANNS graph index built under the guidance of query distribution.
arXiv Detail & Related papers (2024-08-16T06:48:16Z)
Balance Act: Mitigating Hubness in Cross-Modal Retrieval with Query and Gallery Banks [5.164924773752648]
Hubness is a phenomenon where a small number of gallery data points are frequently retrieved, resulting in a decline in retrieval performance. We show the necessity of incorporating both the gallery and query data for addressing hubness as hubs always exhibit high similarity with gallery and query data. We present extensive experimental results on diverse language-grounded benchmarks, including text-image, text-video, and text-audio.
arXiv Detail & Related papers (2023-10-17T22:10:17Z)
Learnable Pillar-based Re-ranking for Image-Text Retrieval [119.9979224297237]
Image-text retrieval aims to bridge the modality gap and retrieve cross-modal content based on semantic similarities. Re-ranking, a popular post-processing practice, has revealed the superiority of capturing neighbor relations in single-modality retrieval tasks. We propose a novel learnable pillar-based re-ranking paradigm for image-text retrieval.
arXiv Detail & Related papers (2023-04-25T04:33:27Z)
Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank. Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers. Previous work has explored ways to partition the search space into hierarchical structures. In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
Improved Branch and Bound for Neural Network Verification via Lagrangian Decomposition [161.09660864941603]
We improve the scalability of Branch and Bound (BaB) algorithms for formally proving input-output properties of neural networks. We present a novel activation-based branching strategy and a BaB framework, named Branch and Dual Network Bound (BaDNB). BaDNB outperforms previous complete verification systems by a large margin, cutting average verification times by factors up to 50 on adversarial properties.
arXiv Detail & Related papers (2021-04-14T09:22:42Z)
CM-NAS: Cross-Modality Neural Architecture Search for Visible-Infrared Person Re-Identification [102.89434996930387]
VI-ReID aims to match cross-modality pedestrian images, breaking through the limitation of single-modality person ReID in dark environments. Existing works manually design various two-stream architectures to separately learn modality-specific and modality-sharable representations. We propose a novel method, named Cross-Modality Neural Architecture Search (CM-NAS).
arXiv Detail & Related papers (2021-01-21T07:07:00Z)
Multi-task Retrieval for Knowledge-Intensive Tasks [21.725935960568027]
We propose a multi-task trained model for neural retrieval. Our approach not only outperforms previous methods in the few-shot setting, but also rivals specialised neural retrievers. With the help of our retriever, we improve existing models for downstream tasks and closely match or improve the state of the art on multiple benchmarks.
arXiv Detail & Related papers (2021-01-01T00:16:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.