CCDepth: A Lightweight Self-supervised Depth Estimation Network with Enhanced Interpretability
- URL: http://arxiv.org/abs/2409.19933v1
- Date: Mon, 30 Sep 2024 04:19:40 GMT
- Title: CCDepth: A Lightweight Self-supervised Depth Estimation Network with Enhanced Interpretability
- Authors: Xi Zhang, Yaru Xue, Shaocheng Jia, Xin Pei,
- Abstract summary: This study proposes a novel hybrid self-supervised depth estimation network, CCDepth, comprising convolutional neural networks (CNNs) and the white-box CRATE network.
This novel network uses CNNs and the CRATE modules to extract local and global information in images, respectively, thereby boosting learning efficiency and reducing model size.
- Score: 11.076431337488973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised depth estimation, which solely requires monocular image sequence as input, has become increasingly popular and promising in recent years. Current research primarily focuses on enhancing the prediction accuracy of the models. However, the excessive number of parameters impedes the universal deployment of the model on edge devices. Moreover, the emerging neural networks, being black-box models, are difficult to analyze, leading to challenges in understanding the rationales for performance improvements. To mitigate these issues, this study proposes a novel hybrid self-supervised depth estimation network, CCDepth, comprising convolutional neural networks (CNNs) and the white-box CRATE (Coding RAte reduction TransformEr) network. This novel network uses CNNs and the CRATE modules to extract local and global information in images, respectively, thereby boosting learning efficiency and reducing model size. Furthermore, incorporating the CRATE modules into the network enables a mathematically interpretable process in capturing global features. Extensive experiments on the KITTI dataset indicate that the proposed CCDepth network can achieve performance comparable with those state-of-the-art methods, while the model size has been significantly reduced. In addition, a series of quantitative and qualitative analyses on the inner features in the CCDepth network further confirm the effectiveness of the proposed method.
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Node Centrality Approximation For Large Networks Based On Inductive
Graph Neural Networks [2.4012886591705738]
Closeness Centrality (CC) and Betweenness Centrality (BC) are crucial metrics in network analysis.
Their practical implementation on extensive networks remains computationally demanding due to their high time complexity.
We propose the CNCA-IGE model, which is an inductive graph encoder-decoder model designed to rank nodes based on specified CC or BC metrics.
arXiv Detail & Related papers (2024-03-08T01:23:12Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective: to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural Network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - Enhancing Rock Image Segmentation in Digital Rock Physics: A Fusion of
Generative AI and State-of-the-Art Neural Networks [5.089732183029123]
In digital rock physics, analysing microstructures from CT and SEM scans is crucial for estimating properties like porosity and pore connectivity.
Traditional segmentation methods like thresholding and CNNs often fall short in accurately detailing rock microstructures and are prone to noise.
U-Net improved segmentation accuracy but required many expert-annotated samples, a laborious and error-prone process due to complex pore shapes.
Our study employed an advanced generative AI model, the diffusion model, to overcome these limitations.
TransU-Net sets a new standard in digital rock physics, paving the way for future geoscience and engineering breakthroughs.
arXiv Detail & Related papers (2023-11-10T14:24:50Z) - A New PHO-rmula for Improved Performance of Semi-Structured Networks [0.0]
We show that techniques to properly identify the contributions of the different model components in SSNs lead to suboptimal network estimation.
We propose a non-invasive post-hocization (PHO) that guarantees identifiability of model components and provides better estimation and prediction quality.
Our theoretical findings are supported by numerical experiments, a benchmark comparison as well as a real-world application to COVID-19 infections.
arXiv Detail & Related papers (2023-06-01T10:23:28Z) - Generalization and Estimation Error Bounds for Model-based Neural
Networks [78.88759757988761]
We show that the generalization abilities of model-based networks for sparse recovery outperform those of regular ReLU networks.
We derive practical design rules that allow to construct model-based networks with guaranteed high generalization.
arXiv Detail & Related papers (2023-04-19T16:39:44Z) - A Generic Shared Attention Mechanism for Various Backbone Neural Networks [53.36677373145012]
Self-attention modules (SAMs) produce strongly correlated attention maps across different layers.
Dense-and-Implicit Attention (DIA) shares SAMs across layers and employs a long short-term memory module.
Our simple yet effective DIA can consistently enhance various network backbones.
arXiv Detail & Related papers (2022-10-27T13:24:08Z) - Ensembles of Spiking Neural Networks [0.3007949058551534]
This paper demonstrates how to construct ensembles of spiking neural networks producing state-of-the-art results.
We achieve classification accuracies of 98.71%, 100.0%, and 99.09%, on the MNIST, NMNIST and DVS Gesture datasets respectively.
We formalize spiking neural networks as GLM predictors, identifying a suitable representation for their target domain.
arXiv Detail & Related papers (2020-10-15T17:45:18Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over- parameterized deep neural networks (DNNs)
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNN)
This paper provides a new insight on conventional SISR algorithm, and proposes a substantially different approach relying on the iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.