Related papers: Towards Complementary Knowledge Distillation for Efficient Dense Image Prediction

Towards Complementary Knowledge Distillation for Efficient Dense Image Prediction

URL: http://arxiv.org/abs/2401.13174v3
Date: Thu, 27 Mar 2025 01:07:52 GMT
Title: Towards Complementary Knowledge Distillation for Efficient Dense Image Prediction
Authors: Dong Zhang, Pingcheng Dong, Long Chen, Kwang-Ting Cheng,
Abstract summary: It has been revealed that small efficient dense image prediction (EDIP) models, trained using the knowledge distillation (KD) framework, encounter two key challenges.<n>We propose a complementary boundary and context distillation (BCD) method within the KD framework for EDIPs.<n>Our method can outperform existing methods without requiring extra supervisions or incurring increased inference costs.
Score: 30.975580866705783
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: It has been revealed that small efficient dense image prediction (EDIP) models, trained using the knowledge distillation (KD) framework, encounter two key challenges, including maintaining boundary region completeness and preserving target region connectivity, despite their favorable capacity to recognize main object regions. In this work, we propose a complementary boundary and context distillation (BCD) method within the KD framework for EDIPs, which facilitates the targeted knowledge transfer from large accurate teacher models to compact efficient student models. Specifically, the boundary distillation component focuses on extracting explicit object-level semantic boundaries from the hierarchical feature maps of the backbone network to enhance the student model's mask quality in boundary regions. Concurrently, the context distillation component leverages self-relations as a bridge to transfer implicit pixel-level contexts from the teacher model to the student model, ensuring strong connectivity in target regions. Our proposed BCD method is specifically designed for EDIP tasks and is characterized by its simplicity and efficiency. Extensive experimental results across semantic segmentation, object detection, and instance segmentation on various representative datasets demonstrate that our method can outperform existing methods without requiring extra supervisions or incurring increased inference costs, resulting in well-defined object boundaries and smooth connecting regions.

Related papers

Discrete Markov Bridge [93.64996843697278]
We propose a novel framework specifically designed for discrete representation learning, called Discrete Markov Bridge.<n>Our approach is built upon two key components: Matrix Learning and Score Learning.
arXiv Detail & Related papers (2025-05-26T09:32:12Z)
BoundMatch: Boundary detection applied to semi-supervised segmentation for urban-driving scenes [6.236890292833387]
Semi-supervised semantic segmentation (SS-SS) aims to mitigate the heavy annotation burden of dense pixel labeling. We propose BoundMatch, a novel multi-task SS-SS framework that integrates semantic boundary detection into the consistency regularization pipeline. Our core mechanism, Boundary Consistency Regularized Multi-Task Learning, enforces prediction agreement between teacher and student models.
arXiv Detail & Related papers (2025-03-30T17:02:26Z)
A Deep Learning Framework for Boundary-Aware Semantic Segmentation [9.680285420002516]
This study proposes a Mask2Former-based semantic segmentation algorithm incorporating a boundary enhancement feature bridging module (BEFBM) The proposed approach achieves significant improvements in metrics such as mIOU, mDICE, and mRecall. Visual analysis confirms the model's advantages in fine-grained regions.
arXiv Detail & Related papers (2025-03-28T00:00:08Z)
Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching. An anchor branch is first trained to provide insights into the data properties. A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z)
Attention-guided Feature Distillation for Semantic Segmentation [8.344263189293578]
This paper showcases the efficacy of a simple yet powerful method for utilizing refined feature maps to transfer attention. The proposed Attention-guided Feature Distillation (AttnFD) method, employs the Convolutional Block Attention Module (CBAM) It achieves state-of-the-art results in terms of improving the mean Intersection over Union (mIoU) of the student network on the PascalVoc 2012, Cityscapes, COCO, and CamVid datasets.
arXiv Detail & Related papers (2024-03-08T16:57:47Z)
Optimization Efficient Open-World Visual Region Recognition [55.76437190434433]
RegionSpot integrates position-aware localization knowledge from a localization foundation model with semantic information from a ViL model. Experiments in open-world object recognition show that our RegionSpot achieves significant performance gain over prior alternatives.
arXiv Detail & Related papers (2023-11-02T16:31:49Z)
Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation [84.62067728093358]
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels. New paradigm has emerged by generating a foreground prediction map to achieve pixel-level localization. This paper presents two astonishing experimental observations on the object localization learning process.
arXiv Detail & Related papers (2023-09-22T15:44:10Z)
X-PDNet: Accurate Joint Plane Instance Segmentation and Monocular Depth Estimation with Cross-Task Distillation and Boundary Correction [9.215384107659665]
X-PDNet is a framework for the multitask learning of plane instance segmentation and depth estimation. We highlight the current limitations of using the ground truth boundary to develop boundary regression loss. We propose a novel method that exploits depth information to support precise boundary region segmentation.
arXiv Detail & Related papers (2023-09-15T14:27:54Z)
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner. We design a semantic-guided self-supervised learning model to extract high-level semantic features from images. We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
BPKD: Boundary Privileged Knowledge Distillation For Semantic Segmentation [20.450568708073767]
This paper proposes boundary-privileged knowledge distillation (BPKD) for semantic segmentation. BPKD distills the knowledge of the teacher model's body and edges separately to the compact student model. Our experiments demonstrate that the proposed BPKD method provides extensive refinements and aggregation for edge and body regions.
arXiv Detail & Related papers (2023-06-13T18:45:32Z)
Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition [64.29650787243443]
We propose and analyse the use of a 2D frequency transform of the activation maps before transferring them. This strategy enhances knowledge transferability in tasks such as scene recognition. We publicly release the training and evaluation framework used along this paper at http://www.vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
arXiv Detail & Related papers (2022-05-04T11:05:18Z)
Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection. Our approach performs contrastive learning by directly sampling individual point pairs from different regions. Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features. The target model aims to mimic the local structure of the source representation space using a contrastive loss. CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
arXiv Detail & Related papers (2022-01-06T04:58:31Z)
Weakly Supervised Semantic Segmentation via Alternative Self-Dual Teaching [82.71578668091914]
This paper establishes a compact learning framework that embeds the classification and mask-refinement components into a unified deep model. We propose a novel alternative self-dual teaching (ASDT) mechanism to encourage high-quality knowledge interaction.
arXiv Detail & Related papers (2021-12-17T11:56:56Z)
Boundary Guided Context Aggregation for Semantic Segmentation [23.709865471981313]
We exploit boundary as a significant guidance for context aggregation to promote the overall semantic understanding of an image. We conduct extensive experiments on the Cityscapes and ADE20K databases, and comparable results are achieved with the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-27T17:04:38Z)
Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation [51.59290734837372]
We propose a conceptually simple yet effective post-processing refinement framework to improve the boundary quality. The proposed BPR framework yields significant improvements over the Mask R-CNN baseline on Cityscapes benchmark. By applying the BPR framework to the PolyTransform + SegFix baseline, we reached 1st place on the Cityscapes leaderboard.
arXiv Detail & Related papers (2021-04-12T07:10:48Z)
Think about boundary: Fusing multi-level boundary information for landmark heatmap regression [51.48533538153833]
We study a two-stage but end-to-end approach for exploring the relationship between the facial boundary and landmarks. We get boundary-aware landmark predictions, which consists of two modules: the self-calibrated boundary estimation (SCBE) module and the boundary-aware landmark transform (BALT) module. Our approach outperforms state-of-the-art methods in the literature.
arXiv Detail & Related papers (2020-08-25T10:14:13Z)
Deep Complementary Joint Model for Complex Scene Registration and Few-shot Segmentation on Medical Images [15.958078577731815]
We propose a novel Deep Complementary Joint Model (DeepRS) for complex scene registration and few-shot segmentation. We embed a perturbation factor in the registration to increase the activity of deformation thus maintaining the augmentation data diversity. The outputs from segmentation model are utilized to implement deep-based region constraints thus relieving the label requirements and bringing fine registration.
arXiv Detail & Related papers (2020-08-03T08:25:59Z)
Inter-Region Affinity Distillation for Road Marking Segmentation [81.3619453527367]
We study the problem of distilling knowledge from a large deep teacher network to a much smaller student network. Our method is known as Inter-Region Affinity KD (IntRA-KD)
arXiv Detail & Related papers (2020-04-11T04:26:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.