Hilbert Distillation for Cross-Dimensionality Networks
- URL: http://arxiv.org/abs/2211.04031v1
- Date: Tue, 8 Nov 2022 06:25:06 GMT
- Title: Hilbert Distillation for Cross-Dimensionality Networks
- Authors: Dian Qin, Haishuai Wang, Zhe Liu, Hongjia Xu, Sheng Zhou, Jiajun Bu
- Abstract summary: 3D convolutional neural networks have demonstrated superior performance in processing volumetric data such as video and medical imaging.
However, this competitive performance comes at huge computational costs.
We propose a novel Hilbert curve-based cross-dimensionality distillation approach to improve the performance of 2D networks.
- Score: 23.700464344728424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D convolutional neural networks have demonstrated superior performance in processing volumetric data such as video and medical imaging. However, this competitive performance comes at huge computational costs, far beyond those of 2D networks. In this paper, we propose a novel Hilbert curve-based cross-dimensionality distillation approach that transfers the knowledge of 3D networks to improve the performance of 2D networks. The proposed Hilbert Distillation (HD) method preserves structural information via the Hilbert curve, which maps high-dimensional (>=2) representations onto one-dimensional continuous space-filling curves. Since the distilled 2D networks are supervised by curves converted from dimensionally heterogeneous 3D features, they are given an informative view of the structural information embedded in well-trained high-dimensional representations. We further propose a Variable-length Hilbert Distillation (VHD) method that dynamically shortens the walking stride of the Hilbert curve in activation feature areas and lengthens it in context feature areas, forcing the 2D networks to pay more attention to learning from activation features. The proposed algorithm outperforms current state-of-the-art distillation techniques adapted to cross-dimensionality distillation on two classification tasks. Moreover, the 2D networks distilled by the proposed method achieve performance competitive with the original 3D networks, indicating that lightweight distilled 2D networks could potentially substitute for cumbersome 3D networks in real-world scenarios.
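To make the distillation idea concrete, below is a minimal PyTorch sketch (not the authors' released code). It assumes square (2^p x 2^p) feature maps, builds the Hilbert-curve visiting order with the classic iterative index-to-coordinate conversion, and, purely to keep the sketch two-dimensional, reduces the teacher's 3D features by averaging over depth before flattening them along the same curve; the paper itself maps the higher-dimensional representation onto a space-filling curve directly. The VHD variant is approximated here by reweighting curve positions by teacher activation magnitude rather than by actually varying the walking stride. All function names (`hilbert_indices`, `hd_loss`, `vhd_loss`) are illustrative.

```python
import torch
import torch.nn.functional as F


def d2xy(order: int, d: int):
    """Map a 1D Hilbert index d to (x, y) on a (2**order x 2**order) grid,
    using the classic iterative construction."""
    x, y, t = 0, 0, d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate/reflect the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_indices(order: int) -> torch.Tensor:
    """Row-major flat indices that visit a (2**order x 2**order) feature map
    in Hilbert-curve order."""
    n = 1 << order
    idx = torch.empty(n * n, dtype=torch.long)
    for d in range(n * n):
        x, y = d2xy(order, d)
        idx[d] = y * n + x
    return idx


def hd_loss(student_feat, teacher_feat, order):
    """Plain Hilbert Distillation sketch: L2 distance between the
    Hilbert-flattened student (B, C, H, W) and teacher (B, C, D, H, W)
    features. The teacher's depth axis is averaged out only to keep this
    example 2D; the paper traverses the 3D features directly."""
    idx = hilbert_indices(order).to(student_feat.device)
    s = student_feat.flatten(2)[..., idx]              # (B, C, N) in curve order
    t = teacher_feat.mean(dim=2).flatten(2)[..., idx]  # depth-averaged teacher
    return F.mse_loss(s, t.detach())


def vhd_loss(student_feat, teacher_feat, order):
    """Crude stand-in for Variable-length Hilbert Distillation: emphasize
    high-activation curve positions via normalized weights, instead of
    actually shortening/lengthening the walking stride as in the paper."""
    idx = hilbert_indices(order).to(student_feat.device)
    s = student_feat.flatten(2)[..., idx]
    t = teacher_feat.mean(dim=2).flatten(2)[..., idx].detach()
    w = t.abs().mean(dim=1, keepdim=True)              # (B, 1, N) activation weight
    w = w / (w.sum(dim=-1, keepdim=True) + 1e-8)       # normalize along the curve
    return ((s - t).pow(2) * w).sum(dim=-1).mean()


# Example: distill 2D student features from 3D teacher features on a 32x32 map.
student = torch.randn(4, 64, 32, 32)       # (B, C, H, W), H = W = 2**5
teacher = torch.randn(4, 64, 8, 32, 32)    # (B, C, D, H, W)
loss = hd_loss(student, teacher, order=5) + vhd_loss(student, teacher, order=5)
```

The weighting in `vhd_loss` only captures the intent of VHD (concentrating the loss on activation-heavy regions); reproducing the paper's results would require its actual variable-stride curve construction.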
Related papers
- DeCoTR: Enhancing Depth Completion with 2D and 3D Attentions [41.55908366474901]
We introduce a novel approach that harnesses both 2D and 3D attentions to enable highly accurate depth completion.
We evaluate our method, DeCoTR, on established depth completion benchmarks.
arXiv Detail & Related papers (2024-03-18T19:22:55Z)
- Dynamic 3D Point Cloud Sequences as 2D Videos [81.46246338686478]
3D point cloud sequences serve as one of the most common and practical representation modalities of real-world environments.
We propose a novel generic representation called Structured Point Cloud Videos (SPCVs).
SPCVs re-organize a point cloud sequence as a 2D video with spatial smoothness and temporal consistency, where the pixel values correspond to the 3D coordinates of points.
arXiv Detail & Related papers (2024-03-02T08:18:57Z)
- Simultaneous Alignment and Surface Regression Using Hybrid 2D-3D Networks for 3D Coherent Layer Segmentation of Retinal OCT Images with Full and Sparse Annotations [32.69359482975795]
This work presents a novel framework based on hybrid 2D-3D convolutional neural networks (CNNs) to obtain continuous 3D retinal layer surfaces from OCT volumes.
Experiments on a synthetic dataset and three public clinical datasets show that our framework can effectively align the B-scans for potential motion correction.
arXiv Detail & Related papers (2023-12-04T08:32:31Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- Multi-View Representation is What You Need for Point-Cloud Pre-Training [22.55455166875263]
This paper proposes a novel approach to point-cloud pre-training that learns 3D representations by leveraging pre-trained 2D networks.
We train the 3D feature extraction network with the help of the novel 2D knowledge transfer loss.
Experimental results demonstrate that our pre-trained model can be successfully transferred to various downstream tasks.
arXiv Detail & Related papers (2023-06-05T03:14:54Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [122.53774221136193]
State-of-the-art methods for driving-scene LiDAR-based perception often project the point clouds to 2D space and then process them via 2D convolution.
A natural remedy is to utilize 3D voxelization and 3D convolution networks.
We propose a new framework for outdoor LiDAR segmentation, in which cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
arXiv Detail & Related papers (2021-09-12T06:25:11Z)
- Feature Disentanglement in generating three-dimensional structure from two-dimensional slice with sliceGAN [35.3148116010546]
sliceGAN proposed a new way of using the generative adversarial network (GAN) to capture the micro-structural characteristics of a two-dimensional (2D) slice.
We combine sliceGAN with AdaIN to endow the model with the ability to disentangle the features and control the synthesis.
arXiv Detail & Related papers (2021-05-01T08:29:33Z)
- 3D-to-2D Distillation for Indoor Scene Parsing [78.36781565047656]
We present a new approach that enables us to leverage 3D features extracted from a large-scale 3D data repository to enhance 2D features extracted from RGB images.
First, we distill 3D knowledge from a pretrained 3D network to supervise a 2D network to learn simulated 3D features from 2D features during the training.
Second, we design a two-stage dimension normalization scheme to calibrate the 2D and 3D features for better integration.
Third, we design a semantic-aware adversarial training model to extend our framework for training with unpaired 3D data.
arXiv Detail & Related papers (2021-04-06T02:22:24Z)
- Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [81.02742110604161]
State-of-the-art methods for large-scale driving-scene LiDAR segmentation often project the point clouds to 2D space and then process them via 2D convolution.
We propose a new framework for outdoor LiDAR segmentation, where cylindrical partition and asymmetrical 3D convolution networks are designed to explore the 3D geometric pattern.
Our method achieves 1st place on the SemanticKITTI leaderboard and outperforms existing methods on nuScenes by a noticeable margin of about 4%.
arXiv Detail & Related papers (2020-11-19T18:53:11Z)
- Cylinder3D: An Effective 3D Framework for Driving-scene LiDAR Semantic Segmentation [87.54570024320354]
State-of-the-art methods for large-scale driving-scene LiDAR semantic segmentation often project and process the point clouds in the 2D space.
A straightforward solution to tackle the issue of 3D-to-2D projection is to keep the 3D representation and process the points in the 3D space.
We develop a 3D cylinder partition and a 3D cylinder convolution based framework, termed as Cylinder3D, which exploits the 3D topology relations and structures of driving-scene point clouds.
arXiv Detail & Related papers (2020-08-04T13:56:19Z)