Revisiting Sparse Convolutional Model for Visual Recognition
- URL: http://arxiv.org/abs/2210.12945v1
- Date: Mon, 24 Oct 2022 04:29:21 GMT
- Title: Revisiting Sparse Convolutional Model for Visual Recognition
- Authors: Xili Dai, Mingyang Li, Pengyuan Zhai, Shengbang Tong, Xingjian Gao,
Shao-Lun Huang, Zhihui Zhu, Chong You, Yi Ma
- Abstract summary: This paper revisits sparse convolutional modeling for image classification.
We show that such models have equally strong empirical performance on CIFAR-10, CIFAR-100, and ImageNet datasets.
- Score: 40.726494290922204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite strong empirical performance for image classification, deep neural
networks are often regarded as "black boxes" that are difficult to
interpret. On the other hand, sparse convolutional models, which assume that a
signal can be expressed by a linear combination of a few elements from a
convolutional dictionary, are powerful tools for analyzing natural images with
good theoretical interpretability and biological plausibility. However, such
principled models have not demonstrated competitive performance when compared
with empirically designed deep networks. This paper revisits sparse
convolutional modeling for image classification and bridges the gap between
good empirical performance (of deep learning) and good interpretability (of
sparse convolutional models). Our method uses differentiable optimization
layers that are defined from convolutional sparse coding as drop-in
replacements of standard convolutional layers in conventional deep neural
networks. We show that such models have equally strong empirical performance on
the CIFAR-10, CIFAR-100, and ImageNet datasets when compared to conventional neural
networks. By leveraging the stable recovery property of sparse modeling, we further
show that such models can be much more robust to input corruptions as well as
adversarial perturbations at test time through a simple, proper trade-off between
sparse regularization and data reconstruction terms. Source code can be found
at https://github.com/Delay-Xili/SDNet.
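The abstract's key technical idea is a differentiable optimization layer derived from convolutional sparse coding (CSC) that replaces a standard convolutional layer. Below is a minimal PyTorch sketch of what such a layer could look like, using a fixed number of unrolled ISTA iterations to approximately solve the objective 0.5 * ||x - D*z||^2 + lam * ||z||_1. It is not the authors' SDNet implementation (see the repository linked above); the class name CSCLayer, the plain-ISTA solver, and all hyperparameter values are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


def soft_threshold(u, thresh):
    # Proximal operator of thresh * ||.||_1 (element-wise shrinkage).
    return torch.sign(u) * torch.relu(torch.abs(u) - thresh)


class CSCLayer(nn.Module):
    # Hypothetical drop-in replacement for nn.Conv2d: the forward pass returns
    # the sparse code z that approximately minimizes
    #   0.5 * ||x - conv_transpose2d(z, D)||^2 + lam * ||z||_1.
    def __init__(self, in_channels, code_channels, kernel_size,
                 lam=0.1, num_steps=5, step_size=0.1):
        super().__init__()
        # Convolutional dictionary D, shared by the decoder (conv_transpose2d)
        # and its adjoint (conv2d); shape (code_channels, in_channels, k, k).
        self.weight = nn.Parameter(
            0.02 * torch.randn(code_channels, in_channels, kernel_size, kernel_size))
        self.padding = kernel_size // 2
        self.lam = lam
        self.num_steps = num_steps
        self.step_size = step_size  # fixed for illustration; a real solver would bound it by 1/L

    def forward(self, x):
        # Initialize the code from the adjoint (analysis) operator.
        z = self.step_size * F.conv2d(x, self.weight, padding=self.padding)
        for _ in range(self.num_steps):
            # Gradient of the reconstruction term 0.5 * ||D z - x||^2 w.r.t. z.
            recon = F.conv_transpose2d(z, self.weight, padding=self.padding)
            grad = F.conv2d(recon - x, self.weight, padding=self.padding)
            # ISTA step: gradient descent followed by soft thresholding.
            z = soft_threshold(z - self.step_size * grad, self.step_size * self.lam)
        return z


# Usage sketch on a CIFAR-sized batch: the layer maps (N, 3, 32, 32) inputs to a
# sparse (N, 64, 32, 32) feature map, much like a 3x3 convolution with 64 filters.
layer = CSCLayer(in_channels=3, code_channels=64, kernel_size=3)
x = torch.randn(2, 3, 32, 32)
z = layer(x)
print(z.shape, (z == 0).float().mean())  # output shape and fraction of exact zeros

In this sketch, increasing lam shifts the balance toward the sparse regularization term and away from exact reconstruction, which is the trade-off the abstract links to robustness against input corruptions and adversarial perturbations.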
Related papers
- Transformer-based Clipped Contrastive Quantization Learning for
Unsupervised Image Retrieval [15.982022297570108]
Unsupervised image retrieval aims to learn the important visual characteristics without any given labels, so that images similar to a given query image can be retrieved.
In this paper, we propose a TransClippedCLR model that encodes the global context of an image using a Transformer and the local context through patch-based processing.
Results using the proposed clipped contrastive learning are greatly improved on all datasets compared to the same backbone network with vanilla contrastive learning.
arXiv Detail & Related papers (2024-01-27T09:39:11Z)
- Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive method for combining the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers.
arXiv Detail & Related papers (2023-07-13T09:39:10Z)
- Traditional Classification Neural Networks are Good Generators: They are Competitive with DDPMs and GANs [104.72108627191041]
We show that conventional neural network classifiers can generate high-quality images comparable to state-of-the-art generative models.
We propose a mask-based reconstruction module that makes the gradients semantics-aware in order to synthesize plausible images.
We show that our method is also applicable to text-to-image generation by leveraging image-text foundation models.
arXiv Detail & Related papers (2022-11-27T11:25:35Z)
- Towards Practical Control of Singular Values of Convolutional Layers [65.25070864775793]
Convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control.
Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties.
We offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity.
arXiv Detail & Related papers (2022-11-24T19:09:44Z)
- How Well Do Sparse Imagenet Models Transfer? [75.98123173154605]
Transfer learning is a classic paradigm by which models pretrained on large "upstream" datasets are adapted to yield good results on "downstream" datasets.
In this work, we perform an in-depth investigation of this phenomenon in the context of convolutional neural networks (CNNs) trained on the ImageNet dataset.
We show that sparse models can match or even outperform the transfer performance of dense models, even at high sparsities.
arXiv Detail & Related papers (2021-11-26T11:58:51Z)
- Self-interpretable Convolutional Neural Networks for Text Classification [5.55878488884108]
This paper develops an approach for interpreting convolutional neural networks for text classification problems by exploiting the local-linear models inherent in ReLU-DNNs.
We show that our proposed technique produces parsimonious models that are self-interpretable and achieve performance comparable to a more complex CNN model.
arXiv Detail & Related papers (2021-05-18T15:19:59Z)
- Tensor-Train Networks for Learning Predictive Modeling of Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks with the aim of obtaining a powerful, compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT-format with a reduction in computational cost.
arXiv Detail & Related papers (2021-01-22T16:14:38Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, truncated max-product belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
- Text-to-Image Generation with Attention Based Recurrent Neural Networks [1.2599533416395765]
We develop a tractable and stable caption-based image generation model.
Experiments are performed on Microsoft datasets.
Results show that the proposed model performs better than contemporary approaches.
arXiv Detail & Related papers (2020-01-18T12:19:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.