LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual
Semantic Segmentation for Autonomous Driving
- URL: http://arxiv.org/abs/2403.08215v1
- Date: Wed, 13 Mar 2024 03:24:36 GMT
- Title: LIX: Implicitly Infusing Spatial Geometric Prior Knowledge into Visual
Semantic Segmentation for Autonomous Driving
- Authors: Sicen Guo, Zhiyuan Wu, Qijun Chen, Ioannis Pitas and Rui Fan
- Abstract summary: We introduce the Learning to Infuse "X" (LIX) framework, with novel contributions in both logit distillation and feature distillation.
We develop an adaptively-recalibrated feature distillation algorithm comprising two technical novelties.
- Score: 26.319489913682574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the impressive performance achieved by data-fusion networks with
duplex encoders for visual semantic segmentation, they become ineffective when
spatial geometric data are not available. Implicitly infusing the spatial
geometric prior knowledge acquired by a duplex-encoder teacher model into a
single-encoder student model is a practical, albeit less explored research
avenue. This paper delves into this topic and resorts to knowledge distillation
approaches to address this problem. We introduce the Learning to Infuse "X"
(LIX) framework, with novel contributions in both logit distillation and
feature distillation aspects. We present a mathematical proof that underscores
the limitation of using a single fixed weight in decoupled knowledge
distillation and introduce a logit-wise dynamic weight controller as a solution
to this issue. Furthermore, we develop an adaptively-recalibrated feature
distillation algorithm, including two technical novelties: feature
recalibration via kernel regression and in-depth feature consistency
quantification via centered kernel alignment. Extensive experiments conducted
with intermediate-fusion and late-fusion networks across various public
datasets provide both quantitative and qualitative evaluations, demonstrating
the superior performance of our LIX framework when compared to other
state-of-the-art approaches.
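The abstract names concrete ingredients: a decoupled logit-distillation loss with a logit-wise dynamic weight, and feature-consistency quantification via centered kernel alignment (CKA). The sketch below is a minimal PyTorch illustration of these two pieces, not the authors' implementation: decoupled KD and linear CKA follow their standard published forms, while the per-sample weight `beta` is a hypothetical stand-in for the dynamic weight controller derived in the paper.

```python
# Minimal sketch (not the authors' code): decoupled logit distillation with a
# per-sample (logit-wise) weight instead of one fixed beta, plus linear CKA
# as a feature-consistency measure.
import torch
import torch.nn.functional as F

def decoupled_kd(z_s, z_t, y, alpha=1.0, T=4.0):
    """Decoupled KD: target-class term (TCKD) + non-target term (NCKD)."""
    c = z_s.size(1)
    mask = F.one_hot(y, c).bool()
    p_s, p_t = F.softmax(z_s / T, dim=1), F.softmax(z_t / T, dim=1)

    # Binary (target vs. rest) distributions for the TCKD term.
    b_s = torch.stack([p_s[mask], 1.0 - p_s[mask]], dim=1)
    b_t = torch.stack([p_t[mask], 1.0 - p_t[mask]], dim=1)
    tckd = (b_t * (b_t.log() - b_s.log())).sum(dim=1)

    # Distributions over the non-target classes for the NCKD term.
    log_ns = F.log_softmax(z_s / T - 1e9 * mask, dim=1)
    nt = F.softmax(z_t / T - 1e9 * mask, dim=1)
    nckd = F.kl_div(log_ns, nt, reduction="none").sum(dim=1)

    # Hypothetical logit-wise dynamic weight: the teacher's non-target
    # probability mass per sample, replacing DKD's single fixed beta.
    beta = (1.0 - p_t[mask]).detach()
    return (alpha * tckd + beta * nckd).mean() * T * T

def linear_cka(x, y):
    """Linear CKA between two feature matrices of shape (n, d)."""
    x = x - x.mean(dim=0, keepdim=True)
    y = y - y.mean(dim=0, keepdim=True)
    num = (y.T @ x).norm(p="fro") ** 2
    return num / ((x.T @ x).norm(p="fro") * (y.T @ y).norm(p="fro"))

# A feature-distillation term can then be: 1.0 - linear_cka(f_student, f_teacher)
```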
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
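As a rough illustration of the idea in the paper above, the following hypothetical snippet fixes point-cloud attention weights as localized Gaussians of pairwise Euclidean distance instead of computing them from learned queries and keys; the bandwidth `sigma` is an assumed hyperparameter.

```python
# Illustrative sketch: fixed Gaussian attention weights over point distances.
import torch

def gaussian_attention(coords, values, sigma=0.5):
    """coords: (n, 3) point positions; values: (n, d) token features."""
    d2 = torch.cdist(coords, coords).pow(2)          # (n, n) squared distances
    attn = torch.softmax(-d2 / (2.0 * sigma**2), dim=-1)
    return attn @ values                             # (n, d) attended features
```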
- Fully Differentiable Correlation-driven 2D/3D Registration for X-ray to CT Image Fusion [3.868072865207522]
Image-based rigid 2D/3D registration is a critical technique for fluoroscopy-guided surgical interventions.
We propose a novel fully differentiable correlation-driven network using a dual-branch CNN-transformer encoder.
A correlation-driven loss is proposed that decomposes the embedded information into low-frequency and high-frequency features.
arXiv Detail & Related papers (2024-02-04T14:12:51Z)
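A hedged sketch of what a correlation-driven loss over a low/high-frequency decomposition could look like, under the assumption that the low-frequency band comes from Gaussian blurring and the high-frequency band is the residual; the kernel size, bandwidth, and band weights are illustrative, not the paper's settings.

```python
# Illustrative correlation-driven loss: normalized cross-correlation (NCC)
# scored separately on low- and high-frequency bands of two images.
import torch
import torch.nn.functional as F

def _blur(x, k=9, sigma=2.0):
    ax = torch.arange(k, dtype=x.dtype, device=x.device) - (k - 1) / 2
    g = torch.exp(-ax.pow(2) / (2 * sigma**2))
    g = (g / g.sum()).view(1, 1, 1, k)
    x = F.conv2d(x, g, padding=(0, k // 2))                     # horizontal pass
    return F.conv2d(x, g.transpose(2, 3), padding=(k // 2, 0))  # vertical pass

def ncc(a, b, eps=1e-6):
    a = (a - a.mean()) / (a.std() + eps)
    b = (b - b.mean()) / (b.std() + eps)
    return (a * b).mean()

def correlation_loss(drr, xray, w_low=1.0, w_high=1.0):
    """drr, xray: (1, 1, H, W) rendered and target images."""
    low_d, low_x = _blur(drr), _blur(xray)
    high_d, high_x = drr - low_d, xray - low_x
    return -(w_low * ncc(low_d, low_x) + w_high * ncc(high_d, high_x))
```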
- Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy while requiring no exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z)
- Generative Model-based Feature Knowledge Distillation for Action Recognition [11.31068233536815]
Our paper introduces an innovative knowledge distillation framework that uses a generative model to train a lightweight student model.
The efficacy of our approach is demonstrated through comprehensive experiments on diverse popular datasets.
arXiv Detail & Related papers (2023-12-14T03:55:29Z)
- DUCK: Distance-based Unlearning via Centroid Kinematics [40.2428948628001]
This work introduces a novel unlearning algorithm, denoted as Distance-based Unlearning via Centroid Kinematics (DUCK).
The algorithm's performance is evaluated across various benchmark datasets.
We also introduce a novel metric, called Adaptive Unlearning Score (AUS), encompassing not only the efficacy of the unlearning process in forgetting target data but also quantifying the performance loss relative to the original model.
arXiv Detail & Related papers (2023-12-04T17:10:25Z)
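The snippet below is a loose reconstruction from the method's name alone (the paper's actual losses may well differ): class centroids are computed in embedding space, and each forget-set sample is driven toward the nearest centroid of a different class, severing its association with its true class.

```python
# Hypothetical centroid-kinematics-style unlearning term.
import torch

def duck_style_loss(emb, labels, centroids):
    """emb: (n, d) forget-set embeddings; centroids: (c, d) class means."""
    dist = torch.cdist(emb, centroids)                    # (n, c) distances
    dist.scatter_(1, labels.unsqueeze(1), float("inf"))   # exclude own class
    target = centroids[dist.argmin(dim=1)]                # nearest wrong centroid
    return (emb - target.detach()).pow(2).sum(dim=1).mean()
```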
- Can Decentralized Stochastic Minimax Optimization Algorithms Converge Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning tasks.
This paper develops two novel decentralized minimax optimization algorithms for finite-sum nonconvex-nonconcave problems.
arXiv Detail & Related papers (2023-04-24T02:19:39Z)
- EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z)
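A minimal sketch of geometry-matching distillation in the spirit of the entry above: instead of matching scores alone, the student's query-document similarity structure is aligned with the teacher's. All names here are illustrative, not the paper's API.

```python
# Illustrative relative-geometry distillation for dual-encoder retrieval.
import torch
import torch.nn.functional as F

def geometry_distill_loss(q_s, d_s, q_t, d_t):
    """q_*: (n, e) query embeddings; d_*: (m, e) document embeddings."""
    sim_s = F.normalize(q_s, dim=1) @ F.normalize(d_s, dim=1).T  # (n, m)
    sim_t = F.normalize(q_t, dim=1) @ F.normalize(d_t, dim=1).T
    return F.mse_loss(sim_s, sim_t.detach())
```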
- Point-to-Voxel Knowledge Distillation for LiDAR Semantic Segmentation [74.67594286008317]
This article addresses the problem of distilling knowledge from a large teacher model to a slim student network for LiDAR semantic segmentation.
We propose the Point-to-Voxel Knowledge Distillation (PVD), which transfers the hidden knowledge from both point level and voxel level.
arXiv Detail & Related papers (2022-06-05T05:28:32Z)
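A hedged sketch of two-level distillation as described above: a point-level term on per-point features plus a voxel-level term on features mean-pooled into occupied voxels. The voxel hashing below is deliberately simplified, and the loss weights are assumptions.

```python
# Illustrative point-level plus voxel-level feature distillation.
import torch
import torch.nn.functional as F

def voxel_pool(feats, coords, voxel=0.2):
    """Mean-pool point features (n, d) into occupied voxels."""
    key = (coords / voxel).floor().long()                 # (n, 3) voxel indices
    _, inv = torch.unique(key, dim=0, return_inverse=True)
    n_vox = int(inv.max()) + 1
    v = feats.new_zeros(n_vox, feats.size(1))
    cnt = feats.new_zeros(n_vox, 1)
    v.index_add_(0, inv, feats)
    cnt.index_add_(0, inv, torch.ones_like(feats[:, :1]))
    return v / cnt.clamp(min=1)

def pvd_style_loss(f_s, f_t, coords, w_point=1.0, w_voxel=1.0):
    point_term = F.mse_loss(f_s, f_t.detach())
    voxel_term = F.mse_loss(voxel_pool(f_s, coords),
                            voxel_pool(f_t, coords).detach())
    return w_point * point_term + w_voxel * voxel_term
```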
- Bayesian Low-rank Matrix Completion with Dual-graph Embedding: Prior Analysis and Tuning-free Inference [16.82986562533071]
We propose a novel Bayesian learning algorithm that automatically learns the hyperparameters associated with dual-graph regularization.
A novel prior is devised to promote the low-rankness of the matrix and encode the dual-graph information simultaneously.
Experiments using synthetic and real-world datasets demonstrate the state-of-the-art performance of the proposed learning algorithm.
arXiv Detail & Related papers (2022-03-18T16:38:30Z)
- Extendable and invertible manifold learning with geometry regularized autoencoders [9.742277703732187]
A fundamental task in data exploration is to extract simplified low dimensional representations that capture intrinsic geometry in data.
Common approaches to this task use kernel methods for manifold learning.
We present a new method for integrating both approaches by incorporating a geometric regularization term in the bottleneck of the autoencoder.
arXiv Detail & Related papers (2020-07-14T15:59:10Z)
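A minimal sketch of the bottleneck regularization described above, assuming the manifold embedding (e.g., from a kernel method) is precomputed for the training set: the encoder then extends that embedding to new points and the decoder approximately inverts it. `manifold_emb` and `lam` are illustrative names, not the paper's.

```python
# Illustrative geometry-regularized autoencoder objective.
import torch
import torch.nn.functional as F

def grae_style_loss(x, x_hat, z, manifold_emb, lam=0.1):
    """x_hat = decoder(z), z = encoder(x); manifold_emb: (n, k) targets."""
    recon = F.mse_loss(x_hat, x)
    geometry = F.mse_loss(z, manifold_emb)  # pull bottleneck toward embedding
    return recon + lam * geometry
```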
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first-stage Matching-FCOS network and a second-stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.