DistilVPR: Cross-Modal Knowledge Distillation for Visual Place
Recognition
- URL: http://arxiv.org/abs/2312.10616v1
- Date: Sun, 17 Dec 2023 05:59:06 GMT
- Title: DistilVPR: Cross-Modal Knowledge Distillation for Visual Place
Recognition
- Authors: Sijie Wang, Rui She, Qiyu Kang, Xingchao Jian, Kai Zhao, Yang Song,
Wee Peng Tay
- Abstract summary: DistilVPR is a novel distillation pipeline for visual place recognition.
We propose leveraging feature relationships from multiple agents, including self-agents and cross-agents for teacher and student neural networks.
The experiments demonstrate that our proposed pipeline achieves state-of-the-art performance compared to other distillation baselines.
- Score: 27.742693995915808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The utilization of multi-modal sensor data in visual place recognition (VPR)
has demonstrated enhanced performance compared to single-modal counterparts.
Nonetheless, integrating additional sensors comes with elevated costs and may
not be feasible for systems that demand lightweight operation, thereby
impacting the practical deployment of VPR. To address this issue, we resort to
knowledge distillation, which empowers single-modal students to learn from
cross-modal teachers without introducing additional sensors during inference.
Despite the notable advancements achieved by current distillation approaches,
feature relationships remain under-explored. To tackle the challenge of
cross-modal distillation in VPR, we present
DistilVPR, a novel distillation pipeline for VPR. We propose leveraging feature
relationships from multiple agents, including self-agents and cross-agents for
teacher and student neural networks. Furthermore, we integrate various
manifolds, characterized by different space curvatures, to explore feature
relationships. The resulting Euclidean, spherical, and hyperbolic relationship
modules diversify the feature relationships and thereby enhance the overall
representational capacity. The experiments demonstrate
that our proposed pipeline achieves state-of-the-art performance compared to
other distillation baselines. We also conduct necessary ablation studies to
show design effectiveness. The code is released at:
https://github.com/sijieaaa/DistilVPR
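
The abstract's core idea of matching pairwise feature relationships under geometries of different curvature can be illustrated with a minimal sketch. This is not the authors' released implementation (see the GitHub link above); the function names and the rescaling into the Poincaré ball are illustrative assumptions:

```python
import numpy as np

def euclidean_relation(x):
    # Pairwise Euclidean distance matrix between feature vectors (B x D -> B x B).
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def spherical_relation(x):
    # Cosine similarities after projecting features onto the unit sphere.
    x = x / np.linalg.norm(x, axis=-1, keepdims=True)
    return x @ x.T

def hyperbolic_relation(x, eps=1e-5):
    # Pairwise distances in the Poincare-ball model of hyperbolic space.
    # Features are first rescaled into the open unit ball (a simple, ad hoc map).
    x = x / (1.0 + np.linalg.norm(x, axis=-1, keepdims=True))
    sq = (x ** 2).sum(-1)
    d2 = euclidean_relation(x) ** 2
    denom = np.maximum((1.0 - sq)[:, None] * (1.0 - sq)[None, :], eps)
    return np.arccosh(np.maximum(1.0 + 2.0 * d2 / denom, 1.0))

def relational_kd_loss(f_teacher, f_student):
    # Sum of mean-squared gaps between the teacher's and student's
    # relation matrices under the three geometries.
    loss = 0.0
    for rel in (euclidean_relation, spherical_relation, hyperbolic_relation):
        loss += np.mean((rel(f_student) - rel(f_teacher)) ** 2)
    return loss
```

Here the single-modal student is penalized whenever its intra-batch feature geometry diverges from the cross-modal teacher's, in any of the three curvature regimes; self-agent vs. cross-agent relation pairs would be additional terms of the same form.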
Related papers
- Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z) - TSCM: A Teacher-Student Model for Vision Place Recognition Using Cross-Metric Knowledge Distillation [6.856317526681759]
Visual place recognition plays a pivotal role in autonomous exploration and navigation of mobile robots.
Existing methods address this task by exploiting powerful yet large networks.
We propose a high-performance teacher and lightweight student distillation framework called TSCM.
arXiv Detail & Related papers (2024-04-02T02:29:41Z) - Accelerating exploration and representation learning with offline
pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z) - EmbedDistill: A Geometric Knowledge Distillation for Information
Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z) - LightVessel: Exploring Lightweight Coronary Artery Vessel Segmentation
via Similarity Knowledge Distillation [6.544757635738911]
We propose LightVessel, a similarity knowledge distillation framework for lightweight coronary artery vessel segmentation.
It comprises an FSD module for semantic-shift modeling and an Adversarial Similarity Distillation (ASD) module that encourages the student model to learn more pixel-wise semantic information.
Experiments conducted on Clinical Coronary Artery Vessel dataset demonstrate that LightVessel outperforms various knowledge distillation counterparts.
arXiv Detail & Related papers (2022-11-02T05:49:19Z) - Exploring Inter-Channel Correlation for Diversity-preserved
Knowledge Distillation [91.56643684860062]
Inter-Channel Correlation for Knowledge Distillation (ICKD) is developed.
ICKD captures the intrinsic distribution of the feature space and sufficient diversity properties of features in the teacher network.
It is the first knowledge-distillation-based method to boost ResNet18 beyond 72% Top-1 accuracy on ImageNet classification.
arXiv Detail & Related papers (2022-02-08T07:01:56Z) - Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision
Action Recognition [131.6328804788164]
We propose a framework, named Semantics-aware Adaptive Knowledge Distillation Networks (SAKDN), to enhance action recognition in vision-sensor modality (videos).
The SAKDN uses multiple wearable-sensors as teacher modalities and uses RGB videos as student modality.
arXiv Detail & Related papers (2020-09-01T03:38:31Z) - Differentiable Feature Aggregation Search for Knowledge Distillation [47.94874193183427]
We introduce the feature aggregation to imitate the multi-teacher distillation in the single-teacher distillation framework.
DFA is a two-stage Differentiable Feature Aggregation search method motivated by DARTS in neural architecture search.
Experimental results show that DFA outperforms existing methods on CIFAR-100 and CINIC-10 datasets.
arXiv Detail & Related papers (2020-08-02T15:42:29Z) - Learning Multiplicative Interactions with Bayesian Neural Networks for
Visual-Inertial Odometry [44.209301916028124]
This paper presents an end-to-end multi-modal learning approach for Visual-Inertial Odometry (VIO).
It is specifically designed to exploit sensor complementarity in the light of sensor degradation scenarios.
arXiv Detail & Related papers (2020-07-15T11:39:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.