Crowd Counting with Online Knowledge Learning
- URL: http://arxiv.org/abs/2303.10318v1
- Date: Sat, 18 Mar 2023 03:27:57 GMT
- Title: Crowd Counting with Online Knowledge Learning
- Authors: Shengqin Jiang, Bowen Li, Fengna Cheng, Qingshan Liu
- Abstract summary: We propose an online knowledge learning method for crowd counting.
Our method builds an end-to-end training framework that integrates two independent networks into a single architecture.
Our method achieves comparable performance to state-of-the-art methods despite using far fewer parameters.
- Score: 23.602652841154164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Efficient crowd counting models are urgently required for the applications in
scenarios with limited computing resources, such as edge computing and mobile
devices. A straightforward method to achieve this is knowledge distillation
(KD), which involves using a trained teacher network to guide the training of a
student network. However, this traditional two-phase training method can be
time-consuming, particularly for large datasets, and it is also challenging for
the student network to mimic the learning process of the teacher network. To
overcome these challenges, we propose an online knowledge learning method for
crowd counting. Our method builds an end-to-end training framework that
integrates two independent networks into a single architecture, which consists
of a shared shallow module, a teacher branch, and a student branch. This
approach is more efficient than the two-stage training technique of traditional
KD. Moreover, we propose a feature relation distillation method which allows
the student branch to more effectively comprehend the evolution of inter-layer
features by constructing a new inter-layer relationship matrix. It is combined
with response distillation and feature internal distillation to enhance the
transfer of mutually complementary information from the teacher branch to the
student branch. Extensive experiments on four challenging crowd counting
datasets demonstrate the effectiveness of our method which achieves comparable
performance to state-of-the-art methods despite using far fewer parameters.
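The abstract describes standard distillation ingredients: response distillation (matching the teacher's softened outputs) and an inter-layer relationship matrix for feature relation distillation. As a minimal illustrative sketch, and not the paper's exact formulation, the snippet below implements classical temperature-scaled KL response distillation and one generic way to build an inter-layer relation matrix via cosine similarity between flattened layer features; the function names and the choice of cosine similarity are assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def response_distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs,
    scaled by T^2 as is conventional in knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def interlayer_relation_matrix(layer_features):
    """Pairwise cosine-similarity matrix over flattened per-layer feature
    vectors: one generic way to encode how features evolve across layers."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)
    return [[cosine(f, g) for g in layer_features] for f in layer_features]
```

In a feature relation distillation setup, the matrix would be computed for both branches and the student penalized for deviating from the teacher's matrix, e.g. via an elementwise squared difference.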
Related papers
- Ensemble Learning via Knowledge Transfer for CTR Prediction [9.891226177252653]
In this paper, we investigate larger ensemble networks and find three inherent limitations in commonly used ensemble learning methods.
We propose a novel model-agnostic Ensemble Knowledge Transfer Framework (EKTF).
Experimental results on five real-world datasets demonstrate the effectiveness and compatibility of EKTF.
arXiv Detail & Related papers (2024-11-25T06:14:20Z) - A Unified Framework for Continual Learning and Machine Unlearning [9.538733681436836]
Continual learning and machine unlearning are crucial challenges in machine learning, typically addressed separately.
We introduce a novel framework that jointly tackles both tasks by leveraging controlled knowledge distillation.
Our approach enables efficient learning with minimal forgetting and effective targeted unlearning.
arXiv Detail & Related papers (2024-08-21T06:49:59Z) - Direct Distillation between Different Domains [97.39470334253163]
We propose a new one-stage method dubbed "Direct Distillation between Different Domains" (4Ds).
We first design a learnable adapter based on the Fourier transform to separate the domain-invariant knowledge from the domain-specific knowledge.
We then build a fusion-activation mechanism to transfer the valuable domain-invariant knowledge to the student network.
arXiv Detail & Related papers (2024-01-12T02:48:51Z) - Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation [22.87106703794863]
It motivates distilling knowledge from the ensemble teacher into a smaller student network.
We propose a weight averaging technique where a student with multiple subnetworks is trained to absorb the functional diversity of ensemble teachers.
We also propose a perturbation strategy that seeks inputs from which the diversities of teachers can be better transferred to the student.
arXiv Detail & Related papers (2022-06-30T06:23:03Z) - Augmenting Knowledge Distillation With Peer-To-Peer Mutual Learning For
Model Compression [2.538209532048867]
Mutual Learning (ML) provides an alternative strategy where multiple simple student networks benefit from sharing knowledge.
We propose a single-teacher, multi-student framework that leverages both KD and ML to achieve better performance.
arXiv Detail & Related papers (2021-10-21T09:59:31Z) - Collaborative Teacher-Student Learning via Multiple Knowledge Transfer [79.45526596053728]
We propose a collaborative teacher-student learning via multiple knowledge transfer (CTSL-MKT).
It allows multiple students to learn knowledge from both individual instances and instance relations in a collaborative way.
The experiments and ablation studies on four image datasets demonstrate that the proposed CTSL-MKT significantly outperforms the state-of-the-art KD methods.
arXiv Detail & Related papers (2021-01-21T07:17:04Z) - Interactive Knowledge Distillation [79.12866404907506]
We propose an InterActive Knowledge Distillation scheme to leverage the interactive teaching strategy for efficient knowledge distillation.
In the distillation process, the interaction between teacher and student networks is implemented by a swapping-in operation.
Experiments with typical settings of teacher-student networks demonstrate that the student networks trained by our IAKD achieve better performance than those trained by conventional knowledge distillation methods.
arXiv Detail & Related papers (2020-07-03T03:22:04Z) - Peer Collaborative Learning for Online Knowledge Distillation [69.29602103582782]
Peer Collaborative Learning method integrates online ensembling and network collaboration into a unified framework.
Experiments on CIFAR-10, CIFAR-100 and ImageNet show that the proposed method significantly improves the generalisation of various backbone networks.
arXiv Detail & Related papers (2020-06-07T13:21:52Z) - Heterogeneous Knowledge Distillation using Information Flow Modeling [82.83891707250926]
We propose a novel KD method that works by modeling the information flow through the various layers of the teacher model.
The proposed method is capable of overcoming the aforementioned limitations by using an appropriate supervision scheme during the different phases of the training process.
arXiv Detail & Related papers (2020-05-02T06:56:56Z) - Efficient Crowd Counting via Structured Knowledge Transfer [122.30417437707759]
Crowd counting is an application-oriented task and its inference efficiency is crucial for real-world applications.
We propose a novel Structured Knowledge Transfer framework to generate a lightweight but still highly effective student network.
Our models obtain at least 6.5$\times$ speed-up on an Nvidia 1080 GPU and even achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-03-23T08:05:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.