Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification
- URL: http://arxiv.org/abs/2411.00209v1
- Date: Thu, 31 Oct 2024 21:13:40 GMT
- Title: Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification
- Authors: Thanh-Dung Le, Vu Nguyen Ha, Ti Ti Nguyen, Geoffrey Eappen, Prabhu Thiruvasagam, Hong-fu Chou, Duc-Dung Tran, Luis M. Garces-Socarras, Jorge L. Gonzalez-Rios, Juan Carlos Merlano-Duncan, Symeon Chatzinotas
- Abstract summary: This study presents an innovative dynamic weighting knowledge distillation (KD) framework tailored for efficient Earth observation (EO) image classification (IC) in resource-constrained settings.
Our framework enables lightweight student models to surpass 90% in accuracy, precision, and recall, adhering to the stringent confidence thresholds necessary for reliable classification tasks.
Remarkably, ResNet8 delivers substantial efficiency gains, achieving a 97.5% reduction in parameters, a 96.7% decrease in FLOPs, an 86.2% cut in power consumption, and a 63.5% increase in inference speed over MobileViT.
- Score: 28.08042498882207
- Abstract: This study presents an innovative dynamic weighting knowledge distillation (KD) framework tailored for efficient Earth observation (EO) image classification (IC) in resource-constrained settings. Utilizing EfficientViT and MobileViT as teacher models, this framework enables lightweight student models, particularly ResNet8 and ResNet16, to surpass 90% in accuracy, precision, and recall, adhering to the stringent confidence thresholds necessary for reliable classification tasks. Unlike conventional KD methods that rely on static weight distribution, our adaptive weighting mechanism responds to each teacher model's confidence, allowing student models to prioritize more credible sources of knowledge dynamically. Remarkably, ResNet8 delivers substantial efficiency gains, achieving a 97.5% reduction in parameters, a 96.7% decrease in FLOPs, an 86.2% cut in power consumption, and a 63.5% increase in inference speed over MobileViT. This significant optimization of complexity and resource demands establishes ResNet8 as an optimal candidate for EO tasks, combining robust performance with feasibility in deployment. The confidence-based, adaptable KD approach underscores the potential of dynamic distillation strategies to yield high-performing, resource-efficient models tailored for satellite-based EO applications. The reproducible code is accessible on our GitHub repository.
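The core of the framework is the confidence-based adaptive weighting of the two teachers. As a rough illustration only, the sketch below weights each teacher's KL term by its mean max-softmax confidence; the weighting rule, temperature, and loss balance are assumptions here rather than the authors' published implementation (see their GitHub repository for the actual code).

```python
# Minimal sketch of confidence-weighted distillation from two teachers.
# The weighting rule, temperature, and alpha are illustrative assumptions.
import torch
import torch.nn.functional as F

def dynamic_kd_loss(student_logits, teacher_logits_a, teacher_logits_b,
                    labels, temperature=4.0, alpha=0.5):
    # Confidence of each teacher: mean of its max softmax probability.
    conf_a = F.softmax(teacher_logits_a, dim=1).max(dim=1).values.mean()
    conf_b = F.softmax(teacher_logits_b, dim=1).max(dim=1).values.mean()
    # Dynamic weights favour the more confident teacher for this batch.
    w = F.softmax(torch.stack([conf_a, conf_b]), dim=0)

    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    kd_a = F.kl_div(log_p_student,
                    F.softmax(teacher_logits_a / temperature, dim=1),
                    reduction="batchmean")
    kd_b = F.kl_div(log_p_student,
                    F.softmax(teacher_logits_b / temperature, dim=1),
                    reduction="batchmean")
    kd = (w[0] * kd_a + w[1] * kd_b) * temperature ** 2

    ce = F.cross_entropy(student_logits, labels)  # hard-label term
    return alpha * ce + (1.0 - alpha) * kd
```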
Related papers
- INTACT: Inducing Noise Tolerance through Adversarial Curriculum Training for LiDAR-based Safety-Critical Perception and Autonomy [0.4124847249415279]
We present a novel framework designed to enhance the robustness of deep neural networks (DNNs) against noisy LiDAR data.
INTACT combines meta-learning with adversarial curriculum training (ACT) to address challenges posed by data corruption and sparsity in 3D point clouds.
INTACT's effectiveness is demonstrated through comprehensive evaluations on object detection, tracking, and classification benchmarks.
arXiv Detail & Related papers (2025-02-04T00:02:16Z)
- Efficient Multi-Task Inferencing with a Shared Backbone and Lightweight Task-Specific Adapters for Automatic Scoring [1.2556373621040728]
This paper proposes a shared backbone model architecture enhanced with lightweight LoRA adapters for task-specific fine-tuning.
It targets the automated scoring of student responses across 27 mutually exclusive tasks.
arXiv Detail & Related papers (2024-12-30T16:34:11Z)
- Efficient Gravitational Wave Parameter Estimation via Knowledge Distillation: A ResNet1D-IAF Approach [2.4184866684341473]
This study presents a novel approach using knowledge distillation techniques to enhance computational efficiency in gravitational wave analysis.
We develop a framework combining ResNet1D and Inverse Autoregressive Flow (IAF) architectures, where knowledge from a complex teacher model is transferred to a lighter student model.
Our experimental results show that the student model achieves a validation loss of 3.70 with optimal configuration (40,100,0.75), compared to the teacher model's 4.09, while reducing the number of parameters by 43%.
arXiv Detail & Related papers (2024-12-11T03:56:46Z)
- GKT: A Novel Guidance-Based Knowledge Transfer Framework For Efficient Cloud-edge Collaboration LLM Deployment [74.40196814292426]
We introduce a novel and intuitive Guidance-based Knowledge Transfer (GKT) framework.
GKT uses a larger large language model as a "teacher" to create guidance prompts, paired with a smaller "student" model that finalizes the responses.
It achieves a maximum accuracy improvement of 14.18% along with a 10.72x speed-up on GSM8K, and an accuracy improvement of 14.00% along with a 7.73x speed-up on CSQA.
arXiv Detail & Related papers (2024-05-30T02:37:35Z)
- Learning Lightweight Object Detectors via Multi-Teacher Progressive Distillation [56.053397775016755]
We propose a sequential approach to knowledge distillation that progressively transfers the knowledge of a set of teacher detectors to a given lightweight student.
To the best of our knowledge, we are the first to successfully distill knowledge from Transformer-based teacher detectors to convolution-based students.
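A minimal sketch of this progressive, one-teacher-at-a-time transfer, using a generic logit-level KL loss as a stand-in for the paper's detector-specific distillation terms (the schedule and loss below are illustrative assumptions):

```python
# Illustrative sketch: distil the student against each teacher in sequence.
# The logit-level KL loss stands in for the paper's detector-specific losses.
import torch
import torch.nn.functional as F

def progressive_distillation(student, teachers, loader, device="cpu",
                             epochs_per_teacher=1, temperature=2.0, lr=1e-3):
    optimizer = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)
    for teacher in teachers:              # one teacher at a time
        teacher.eval()
        for _ in range(epochs_per_teacher):
            for images, _ in loader:
                images = images.to(device)
                with torch.no_grad():     # teachers are frozen
                    t_logits = teacher(images)
                s_logits = student(images)
                loss = F.kl_div(
                    F.log_softmax(s_logits / temperature, dim=1),
                    F.softmax(t_logits / temperature, dim=1),
                    reduction="batchmean") * temperature ** 2
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
    return student
```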
arXiv Detail & Related papers (2023-08-17T17:17:08Z)
- Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval [49.01637233471453]
Previous knowledge-distillation-based efficient image retrieval methods employ a lightweight network as the student model for fast inference.
We propose a Capacity Dynamic Distillation framework, which constructs a student model with editable representation capacity.
Our method achieves superior inference speed and accuracy, e.g., on the VeRi-776 dataset with ResNet101 as the teacher.
arXiv Detail & Related papers (2023-03-16T11:09:22Z)
- Estimating and Maximizing Mutual Information for Knowledge Distillation [24.254198219979667]
We propose Mutual Information Maximization Knowledge Distillation (MIMKD).
Our method uses a contrastive objective to simultaneously estimate and maximize a lower bound on the mutual information of local and global feature representations between a teacher and a student network.
This can be used to improve the performance of low capacity models by transferring knowledge from more performant but computationally expensive models.
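A common way to estimate and maximize such a lower bound is an InfoNCE-style contrastive objective over paired teacher and student features; the sketch below is illustrative and omits the projection heads, local-feature terms, and the specific critic used in MIMKD:

```python
# Minimal InfoNCE-style lower bound on the mutual information between
# paired teacher and student features (projection heads omitted).
import torch
import torch.nn.functional as F

def infonce_mi_loss(student_feats, teacher_feats, temperature=0.1):
    """student_feats, teacher_feats: (batch, dim) global representations.
    Minimising this loss maximises a lower bound on their mutual information."""
    s = F.normalize(student_feats, dim=1)
    t = F.normalize(teacher_feats, dim=1)
    logits = s @ t.t() / temperature                  # pairwise similarities
    targets = torch.arange(s.size(0), device=s.device)
    # The matching (i, i) pair is the positive; all other pairs act as negatives.
    return F.cross_entropy(logits, targets)
```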
arXiv Detail & Related papers (2021-10-29T17:49:56Z)
- How and When Adversarial Robustness Transfers in Knowledge Distillation? [137.11016173468457]
This paper studies how and when adversarial robustness can be transferred from a teacher model to a student model in knowledge distillation (KD).
We show that standard KD training fails to preserve adversarial robustness, and we propose KD with input gradient alignment (KDIGA) as a remedy.
Under certain assumptions, we prove that the student model using our proposed KDIGA can achieve at least the same certified robustness as the teacher model.
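As a rough illustration of the input-gradient-alignment idea (not the paper's exact formulation), the student can be trained with a standard KD objective plus a penalty that pulls its input gradient toward the teacher's:

```python
# Illustrative KD loss with an input-gradient alignment penalty; the exact
# losses and scaling used in KDIGA may differ from this sketch.
import torch
import torch.nn.functional as F

def kd_with_gradient_alignment(student, teacher, images, labels,
                               temperature=4.0, alpha=0.5, beta=1.0):
    images = images.clone().requires_grad_(True)
    s_logits = student(images)
    t_logits = teacher(images)

    # Gradients of each model's cross-entropy loss w.r.t. the input images.
    s_grad = torch.autograd.grad(F.cross_entropy(s_logits, labels), images,
                                 create_graph=True)[0]
    t_grad = torch.autograd.grad(F.cross_entropy(t_logits, labels), images)[0]

    kd = F.kl_div(F.log_softmax(s_logits / temperature, dim=1),
                  F.softmax(t_logits.detach() / temperature, dim=1),
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(s_logits, labels)
    align = (s_grad - t_grad.detach()).pow(2).mean()  # input-gradient alignment
    return alpha * ce + (1.0 - alpha) * kd + beta * align
```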
arXiv Detail & Related papers (2021-10-22T21:30:53Z)
- Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
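A simple illustration of confidence-gated pseudo-labelling, shown here for classification-style outputs rather than the paper's detection setting, with an assumed confidence threshold:

```python
# Illustrative confidence-gated split of an unlabeled pool: very confident
# predictions become pseudo-labels, the rest are queried for annotation.
import torch
import torch.nn.functional as F

def split_unlabeled_pool(model, unlabeled_loader, threshold=0.99, device="cpu"):
    """unlabeled_loader is assumed to yield (sample_index, image) batches."""
    pseudo_labeled, to_annotate = [], []
    model.eval()
    with torch.no_grad():
        for indices, images in unlabeled_loader:
            probs = F.softmax(model(images.to(device)), dim=1)
            conf, pred = probs.max(dim=1)
            for i, c, p in zip(indices.tolist(), conf.tolist(), pred.tolist()):
                if c >= threshold:
                    pseudo_labeled.append((i, p))   # trust the prediction
                else:
                    to_annotate.append(i)           # uncertain: ask the oracle
    return pseudo_labeled, to_annotate
```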
arXiv Detail & Related papers (2021-06-22T16:53:09Z)
- Towards Practical Lipreading with Distilled and Efficient Models [57.41253104365274]
Lipreading has witnessed a lot of progress due to the resurgence of neural networks.
Recent works have placed emphasis on aspects such as improving performance by finding the optimal architecture or improving generalization.
There is still a significant gap between the current methodologies and the requirements for an effective deployment of lipreading in practical scenarios.
We propose a series of innovations that significantly bridge that gap: first, using self-distillation, we raise the state-of-the-art performance by a wide margin on LRW and LRW-1000 to 88.5% and 46.6%, respectively.
arXiv Detail & Related papers (2020-07-13T16:56:27Z)