A Comparative Study of Knowledge Transfer Methods for Misaligned Urban
Building Labels
- URL: http://arxiv.org/abs/2311.03867v1
- Date: Tue, 7 Nov 2023 10:31:41 GMT
- Title: A Comparative Study of Knowledge Transfer Methods for Misaligned Urban
Building Labels
- Authors: Bipul Neupane, Jagannath Aryal, Abbas Rajabifard
- Abstract summary: Misalignment between Earth observation (EO) images and building labels impacts the training of accurate convolutional neural networks (CNNs) for semantic segmentation of building footprints.
Recently, three Teacher-Student knowledge transfer methods have been introduced to address this issue.
We present a workflow for the systematic comparative study of the three methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Misalignment between Earth observation (EO) images and building labels impacts the
training of accurate convolutional neural networks (CNNs) for semantic
segmentation of building footprints. Recently, three Teacher-Student knowledge
transfer methods have been introduced to address this issue: supervised domain
adaptation (SDA), knowledge distillation (KD), and deep mutual learning (DML).
However, these methods have not been systematically studied across the different urban building
types (low-rise, mid-rise, high-rise, and skyscrapers), where misalignment increases
with building height and spatial resolution. In this study, we present a
workflow for the systematic comparative study of the three methods. The
workflow first identifies the best-performing (highest evaluation scores) hyperparameters,
lightweight CNNs for the Student (among 43 CNNs from the computer vision literature),
and encoder-decoder networks (EDNs) for both Teachers and Students.
Secondly, three building footprint datasets are developed to train and evaluate
the identified Teachers and Students in the three transfer methods. The results
show that U-Net with VGG19 (U-VGG19) is the best Teacher, and
U-EfficientNetv2B3 and U-EfficientNet-lite0 are among the best Students. With
these Teacher-Student pairs, SDA yields F1 scores of up to 0.943, 0.868, 0.912, and
0.697 on the low-rise, mid-rise, high-rise, and skyscraper datasets,
respectively. KD and DML provide model compression of up to 82%, with only a
marginal loss in performance. This new comparison concludes that SDA is the
most effective method to address the misalignment problem, while KD and DML can
efficiently compress network size without significant loss in performance. The
158 experiments and datasets developed in this study will be valuable for
minimising the impact of misaligned labels.
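To make the Teacher-Student transfer concrete, the sketch below shows a minimal knowledge distillation (KD) training step for binary building-footprint segmentation in PyTorch. The placeholder networks stand in for the paper's encoder-decoder networks (e.g., a U-VGG19 Teacher and a U-EfficientNet-lite0 Student); the temperature, loss weighting, and tile sizes are illustrative assumptions rather than the authors' configuration. In DML, the same soft-target term would be applied symmetrically between two jointly trained Students.
```python
# Minimal KD sketch for binary building-footprint segmentation.
# Placeholder ConvNets stand in for the paper's EDNs; T and alpha are assumed values.
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiny_edn(width: int) -> nn.Sequential:
    """Placeholder fully-convolutional net producing 1-channel per-pixel logits."""
    return nn.Sequential(
        nn.Conv2d(3, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
        nn.Conv2d(width, 1, 1),
    )

teacher = tiny_edn(width=64)   # stands in for the large Teacher EDN (e.g., U-VGG19)
student = tiny_edn(width=16)   # stands in for the lightweight Student EDN
teacher.eval()                 # in KD, the Teacher is pre-trained and frozen

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 2.0, 0.5            # distillation temperature and loss mix (assumptions)

def kd_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One KD update: supervised BCE on the masks + soft-target loss from the Teacher."""
    with torch.no_grad():
        t_logits = teacher(images)
    s_logits = student(images)

    # Hard-label supervision against the (possibly misaligned) building masks.
    sup_loss = F.binary_cross_entropy_with_logits(s_logits, labels)

    # Soft-target term: per-pixel binary KL divergence between the Teacher's and
    # Student's temperature-scaled probabilities.
    t_prob = torch.sigmoid(t_logits / T)
    s_prob = torch.sigmoid(s_logits / T)
    eps = 1e-6
    kd_loss = (t_prob * torch.log((t_prob + eps) / (s_prob + eps))
               + (1 - t_prob) * torch.log((1 - t_prob + eps) / (1 - s_prob + eps))
               ).mean()

    loss = alpha * sup_loss + (1 - alpha) * kd_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch: 2 RGB tiles of 64x64 pixels with binary footprint masks.
imgs = torch.rand(2, 3, 64, 64)
masks = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(kd_step(imgs, masks))
```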
Related papers
- OpenCodeReasoning: Advancing Data Distillation for Competitive Coding [61.15402517835137]
We build a supervised fine-tuning (SFT) dataset to achieve state-of-the-art coding capability results in models of various sizes.
Our models use only SFT to achieve 61.8% on LiveCodeBench and 24.6% on CodeContests, surpassing alternatives trained with reinforcement learning.
arXiv Detail & Related papers (2025-04-02T17:50:31Z)
- Relative Difficulty Distillation for Semantic Segmentation [54.76143187709987]
We propose a pixel-level KD paradigm for semantic segmentation named Relative Difficulty Distillation (RDD).
RDD allows the teacher network to provide effective guidance on learning focus without additional optimization goals.
Our research showcases that RDD can integrate with existing KD methods to improve their upper performance bound.
arXiv Detail & Related papers (2024-07-04T08:08:25Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles feature representation learning and classifier training in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient Semantic Segmentation [16.957139277317005]
Augmentation-free Dense Contrastive Knowledge Distillation (Af-DCD) is a new contrastive distillation learning paradigm.
Af-DCD trains compact and accurate deep neural networks for semantic segmentation applications.
arXiv Detail & Related papers (2023-12-07T09:37:28Z)
- Feature-domain Adaptive Contrastive Distillation for Efficient Single Image Super-Resolution [3.2453621806729234]
CNN-based SISR requires numerous parameters and high computational cost to achieve better performance.
Knowledge Distillation (KD) transfers the teacher's useful knowledge to the student.
We propose a feature-domain adaptive contrastive distillation (FACD) method for efficiently training lightweight student SISR networks.
arXiv Detail & Related papers (2022-11-29T06:24:14Z)
- SmoothNets: Optimizing CNN architecture design for differentially private deep learning [69.10072367807095]
DP-SGD requires clipping and noising of per-sample gradients.
This introduces a reduction in model utility compared to non-private training.
We distilled a new model architecture termed SmoothNet, which is characterised by increased robustness to the challenges of DP-SGD training.
arXiv Detail & Related papers (2022-05-09T07:51:54Z)
- Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
- Improved Aggregating and Accelerating Training Methods for Spatial Graph Neural Networks on Fraud Detection [0.0]
This work proposes an improved deep architecture that extends CAmouflage-REsistant GNN (CARE-GNN) to deep models, named Residual Layered CARE-GNN (RLC-GNN).
Three issues of RLC-GNN are the limited use of neighboring information, the difficulty of training, and the lack of comprehensive consideration of node features and external patterns.
Experiments are conducted on Yelp and Amazon datasets.
arXiv Detail & Related papers (2022-02-14T09:51:35Z)
- Spirit Distillation: Precise Real-time Prediction with Insufficient Data [4.6247655021017655]
We propose a new training framework named Spirit Distillation (SD).
It extends the ideas of fine-tuning-based transfer learning (FTT) and feature-based knowledge distillation.
Results demonstrate boosts in segmentation performance (mIOU) and high-precision accuracy of 1.4% and 8.2%, respectively.
arXiv Detail & Related papers (2021-03-25T10:23:30Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Kernel Based Progressive Distillation for Adder Neural Networks [71.731127378807]
Adder Neural Networks (ANNs), which only contain additions, offer a new way of developing deep neural networks with low energy consumption.
There is an accuracy drop when replacing all convolution filters with adder filters.
We present a novel method for further improving the performance of ANNs without increasing the trainable parameters.
arXiv Detail & Related papers (2020-09-28T03:29:19Z)
- A Unified Plug-and-Play Framework for Effective Data Denoising and Robust Abstention [4.200576272300216]
We propose a unified filtering framework leveraging underlying data density.
Our framework can effectively denoise training data and avoid predicting on uncertain test data points.
arXiv Detail & Related papers (2020-09-25T04:18:08Z)