EvDistill: Asynchronous Events to End-task Learning via Bidirectional
Reconstruction-guided Cross-modal Knowledge Distillation
- URL: http://arxiv.org/abs/2111.12341v1
- Date: Wed, 24 Nov 2021 08:48:16 GMT
- Title: EvDistill: Asynchronous Events to End-task Learning via Bidirectional
Reconstruction-guided Cross-modal Knowledge Distillation
- Authors: Lin Wang, Yujeong Chae, Sung-Hoon Yoon, Tae-Kyun Kim and Kuk-Jin Yoon
- Abstract summary: Event cameras sense per-pixel intensity changes and produce asynchronous event streams with high dynamic range and less motion blur.
We propose a novel approach, called EvDistill, to learn a student network on unlabeled and unpaired event data.
We show that EvDistill achieves significantly better results than prior works and than KD using only events and APS frames.
- Score: 61.33010904301476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Event cameras sense per-pixel intensity changes and produce asynchronous
event streams with high dynamic range and less motion blur, showing advantages
over conventional cameras. A hurdle in training event-based models is the lack
of large-scale, high-quality labeled data. Prior works that learn end-tasks mostly rely
on labeled or pseudo-labeled datasets obtained from the active pixel sensor
(APS) frames; however, the quality of such datasets is far from rivaling that of
datasets based on canonical images. In this paper, we propose a novel approach, called
EvDistill, to learn a student network on unlabeled and unpaired
event data (target modality) via knowledge distillation (KD) from a teacher
network trained with large-scale, labeled image data (source modality). To
enable KD across the unpaired modalities, we first propose a bidirectional
modality reconstruction (BMR) module to bridge both modalities and
simultaneously exploit them to distill knowledge via the crafted pairs, incurring
no extra computation at inference. The BMR is improved by the end-task and
KD losses in an end-to-end manner. Second, we leverage the structural
similarities of both modalities and adapt the knowledge by matching their
distributions. Moreover, as most prior feature KD methods are uni-modal and
less applicable to our problem, we propose an affinity graph KD
loss to boost the distillation. Our extensive experiments on semantic
segmentation and object recognition demonstrate that EvDistill achieves
significantly better results than prior works and than KD using only events and
APS frames.
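To make the distillation components concrete, the sketch below illustrates an affinity-graph KD loss of the kind the abstract describes. It is a minimal PyTorch sketch with assumed tensor shapes and function names (not the authors' released code): pairwise cosine affinities over spatial locations are computed for teacher and student features, and the student is trained to match the teacher's affinity graph rather than its raw activations.

```python
import torch
import torch.nn.functional as F

def affinity_graph(feat: torch.Tensor) -> torch.Tensor:
    """(B, C, H, W) feature map -> (B, N, N) cosine-affinity matrix, N = H*W."""
    f = feat.flatten(2).transpose(1, 2)   # (B, N, C): one vector per location
    f = F.normalize(f, dim=-1)            # unit-norm so dot products are cosines
    return f @ f.transpose(1, 2)          # pairwise similarities between locations

def affinity_kd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    """Match the student's affinity graph to the frozen teacher's graph."""
    a_s = affinity_graph(student_feat)
    a_t = affinity_graph(teacher_feat.detach())  # no gradient into the teacher
    return F.mse_loss(a_s, a_t)

# Example with assumed shapes: in the EvDistill setting the teacher would see the
# image reconstructed from events by the BMR module and the student the event
# representation; the two feature maps are assumed to share spatial resolution.
loss_kd = affinity_kd_loss(torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32))
```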
Related papers
- Relational Representation Distillation [6.24302896438145]
We introduce Relational Representation Distillation (RRD) to explore and reinforce relationships between teacher and student models.
Inspired by self-supervised learning principles, it uses a relaxed contrastive loss that focuses on similarity rather than exact replication.
Our approach demonstrates superior performance on CIFAR-100 and ImageNet ILSVRC-2012 and sometimes even outperforms the teacher network when combined with KD.
arXiv Detail & Related papers (2024-07-16T14:56:13Z)
- Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z)
- Robustness-Reinforced Knowledge Distillation with Correlation Distance and Network Pruning [3.1423836318272773]
Knowledge distillation (KD) improves the performance of efficient and lightweight models.
Most existing KD techniques rely on Kullback-Leibler (KL) divergence.
We propose a Robustness-Reinforced Knowledge Distillation (R2KD) that leverages correlation distance and network pruning.
arXiv Detail & Related papers (2023-11-23T11:34:48Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls within the metric learning paradigm, yet directly optimizes the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- Dense Depth Distillation with Out-of-Distribution Simulated Images [30.79756881887895]
We study data-free knowledge distillation (KD) for monocular depth estimation (MDE).
Data-free KD learns a lightweight model for real-world depth perception by compressing a trained teacher model without access to training data in the target domain.
We show that our method outperforms the baseline KD by a good margin and achieves slightly better performance with as few as 1/6 of the training images.
arXiv Detail & Related papers (2022-08-26T07:10:01Z)
- CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation [130.08432609780374]
In 3D action recognition, there exists rich complementary information between skeleton modalities.
We propose a new Cross-modal Mutual Distillation (CMD) framework with several dedicated designs.
Our approach outperforms existing self-supervised methods and sets a series of new records.
arXiv Detail & Related papers (2022-08-26T06:06:09Z)
- Activation to Saliency: Forming High-Quality Labels for Unsupervised Salient Object Detection [54.92703325989853]
We propose a two-stage Activation-to-Saliency (A2S) framework that effectively generates high-quality saliency cues.
No human annotations are involved in our framework during the whole training process.
Our framework achieves significant performance gains compared with existing USOD methods.
arXiv Detail & Related papers (2021-12-07T11:54:06Z)
- Dual Transfer Learning for Event-based End-task Prediction via Pluggable Event to Image Translation [33.28163268182018]
Event cameras perceive per-pixel intensity changes and output asynchronous event streams with high dynamic range and less motion blur.
It has been shown that events alone can be used for end-task learning, e.g., semantic segmentation, based on encoder-decoder-like networks.
We propose a simple yet flexible two-stream framework named Dual Transfer Learning (DTL) to effectively enhance the performance on the end-tasks.
arXiv Detail & Related papers (2021-09-04T06:49:09Z)
- Knowledge Distillation Thrives on Data Augmentation [65.58705111863814]
Knowledge distillation (KD) is a general deep neural network training framework that uses a teacher model to guide a student model.
Many works have explored the rationale behind its success; however, its interplay with data augmentation (DA) has not been well recognized so far.
In this paper, we are motivated by an interesting observation in classification: KD loss can benefit from extended training iterations while the cross-entropy loss does not.
We show this disparity arises because of data augmentation: KD loss can tap into the extra information from different input views brought by DA.
arXiv Detail & Related papers (2020-12-05T00:32:04Z)
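As a concrete illustration of the observation in the last entry above (the KD loss can exploit fresh views supplied by data augmentation), here is a minimal, assumed sketch of a standard temperature-scaled KD objective; names and hyperparameters are illustrative and not taken from the paper. Teacher and student score the same freshly augmented view at every iteration, so the soft targets keep changing and carry extra information over long training schedules.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Soft KL term against the teacher plus a small hard-label cross-entropy term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                  # usual T^2 gradient rescaling
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Per iteration: sample a NEW augmented view x_aug and feed the SAME view to both
# networks, e.g.
#   with torch.no_grad():
#       t_logits = teacher(x_aug)
#   loss = kd_loss(student(x_aug), t_logits, labels)
```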
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.