CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters
- URL: http://arxiv.org/abs/2203.15331v1
- Date: Tue, 29 Mar 2022 08:25:42 GMT
- Title: CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters
- Authors: Paul Gavrikov and Janis Keuper
- Abstract summary: We show that model pre-training can succeed on arbitrary datasets if they meet size and variance conditions.
We show that many pre-trained models contain degenerated filters that make them less robust and less suitable for fine-tuning on target applications.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Currently, many theoretical as well as practically relevant questions about the transferability and robustness of Convolutional Neural Networks (CNNs) remain unsolved. While ongoing research efforts address these problems from various angles, in most computer-vision-related cases the approaches can be generalized to investigations of the effects of distribution shifts in image data. In this context, we propose to study the shifts in the learned weights of trained CNN models. Here we focus on the properties of the distributions of the dominantly used 3x3 convolution filter kernels. We collected and publicly provide a dataset of over 1.4 billion filters from hundreds of trained CNNs, covering a wide range of datasets, architectures, and vision tasks. As a first use case of the proposed dataset, we show properties of many publicly available pre-trained models that are highly relevant for practical applications: I) We analyze distribution shifts (or the lack thereof) between trained filters along different axes of meta-parameters, such as the visual category of the dataset, the task, the architecture, or the layer depth. Based on these results, we conclude that model pre-training can succeed on arbitrary datasets if they meet size and variance conditions. II) We show that many pre-trained models contain degenerated filters that make them less robust and less suitable for fine-tuning on target applications.
Data & Project website: https://github.com/paulgavrikov/cnn-filter-db
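The dataset and analysis code live in the repository above; purely as an illustration of what such filter extraction looks like, here is a minimal sketch assuming PyTorch and torchvision. This is not the authors' pipeline, and the variance threshold for flagging degenerated filters is an illustrative assumption:

```python
import torch
import torchvision.models as models

# Sketch only: collect every 3x3 convolution kernel from one pretrained
# model, as the CNN Filter DB does at scale across hundreds of models.
model = models.resnet50(weights="IMAGENET1K_V1")

filters = []
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d) and module.kernel_size == (3, 3):
        # weight shape: (out_channels, in_channels, 3, 3)
        filters.append(module.weight.detach().reshape(-1, 9))
filters = torch.cat(filters)  # one row per 3x3 kernel

# Illustrative check for degenerated filters: kernels whose coefficients
# barely vary add little beyond a scaled identity/blur. The paper's actual
# criterion may differ; the threshold here is an assumption.
variances = filters.var(dim=1)
degenerate_ratio = (variances < 1e-6).float().mean()
print(f"{filters.shape[0]} filters, {degenerate_ratio:.2%} near-constant")
```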
Related papers
- Data Filtering Networks
We study the problem of learning a data filtering network (DFN) for the second stage of a data curation pipeline: filtering a large uncurated dataset.
Our key finding is that the quality of a network for filtering is distinct from its performance on downstream tasks.
Based on our insights, we construct new data filtering networks that induce state-of-the-art image-text datasets; a minimal thresholding sketch follows this entry.
arXiv Detail & Related papers (2023-09-29T17:37:29Z)
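At inference time, a DFN is just a scorer over candidate samples. In this hedged sketch, `scoring_model` is a hypothetical stand-in for a trained filtering network and the threshold is arbitrary:

```python
import torch

def filter_dataset(pairs, scoring_model, threshold=0.3):
    """Keep image-text pairs a (hypothetical) filtering network scores highly."""
    kept = []
    for image, text in pairs:
        with torch.no_grad():
            score = scoring_model(image, text)  # scalar relevance score
        if score.item() > threshold:
            kept.append((image, text))
    return kept
```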
- A Continuous Convolutional Trainable Filter for Modelling Unstructured Data
We propose a continuous version of a trainable convolutional filter that can also operate on unstructured data.
Our experiments show that the continuous filter can achieve accuracy comparable to that of state-of-the-art discrete filters; a conceptual sketch follows this entry.
arXiv Detail & Related papers (2022-10-24T17:34:10Z)
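A conceptual sketch of the idea (the paper's parameterization differs in detail): represent the kernel as a small network over continuous offsets, so it can be sampled at grid points or at arbitrary positions in unstructured data.

```python
import torch
import torch.nn as nn

class ContinuousFilter(nn.Module):
    """A kernel parameterized by an MLP over 2D offsets (illustrative)."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, offsets):  # offsets: (N, 2) sample positions
        return self.net(offsets).squeeze(-1)

f = ContinuousFilter()
# Sampling on a regular grid recovers an ordinary discrete 3x3 kernel ...
grid = torch.stack(torch.meshgrid(torch.linspace(-1, 1, 3),
                                  torch.linspace(-1, 1, 3),
                                  indexing="ij"), dim=-1).reshape(-1, 2)
kernel = f(grid).reshape(3, 3)
# ... while unstructured data can query the same filter at arbitrary offsets.
values = f(torch.rand(100, 2) * 2 - 1)
```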
- CHALLENGER: Training with Attribution Maps
We show that utilizing attribution maps for training neural networks can improve the regularization of models and thus increase performance.
In particular, we show that our generic, domain-independent approach yields state-of-the-art results on vision, natural language processing, and time-series tasks; a hedged training-step sketch follows this entry.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
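The summary does not specify the exact objective; as a hedged stand-in, the sketch below regularizes training with input-gradient attribution maps, one common way to bring attributions into the loss. It should not be read as the paper's method.

```python
import torch
import torch.nn.functional as F

def training_step(model, x, y, lam=0.01):
    """Cross-entropy plus an attribution-map penalty (illustrative only)."""
    x = x.requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # Attribution map: gradient of the target-class score w.r.t. the input.
    target_score = logits.gather(1, y.unsqueeze(1)).sum()
    attribution = torch.autograd.grad(target_score, x, create_graph=True)[0]
    # Penalizing attribution magnitude regularizes the model's sensitivity.
    return ce + lam * attribution.abs().mean()
```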
- Focal Sparse Convolutional Networks for 3D Object Detection
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant, focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters
We present the first empirical results from our ongoing investigation of distribution shifts in image data used for various computer vision tasks.
Instead of analyzing the original training and test data, we propose to study shifts in the learned weights of trained models; a weight-space comparison sketch follows this entry.
arXiv Detail & Related papers (2022-01-20T21:48:12Z)
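A minimal weight-space comparison in the spirit of this line of work, assuming torchvision models and using the 1D Wasserstein distance over flattened 3x3 coefficients (one simple choice; the paper's metrics may differ):

```python
import numpy as np
import torch
import torchvision.models as models
from scipy.stats import wasserstein_distance

def flat_filters(model):
    """All 3x3 convolution coefficients of a model as one flat array."""
    return np.concatenate([
        m.weight.detach().reshape(-1).numpy()
        for m in model.modules()
        if isinstance(m, torch.nn.Conv2d) and m.kernel_size == (3, 3)
    ])

a = flat_filters(models.resnet18(weights="IMAGENET1K_V1"))
b = flat_filters(models.vgg16(weights="IMAGENET1K_V1"))
print("model-to-model shift:", wasserstein_distance(a, b))
```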
- Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data
We propose an efficient framework based on deep convolutional neural networks (CNNs) for the joint classification of multi-source remote sensing data.
The proposed method can, in principle, adapt any modern CNN model to any multi-source remote sensing dataset.
Experimental results demonstrate the effectiveness of the proposed single-stream CNNs; one simple channel-stacking variant is sketched after this entry.
arXiv Detail & Related papers (2021-09-13T16:10:41Z)
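The paper's learnable-architecture mechanism is not described in the summary; as one hedged illustration of single-stream multi-source input, the sketch below stacks all sources channel-wise and widens a pretrained stem convolution accordingly:

```python
import torch
import torch.nn as nn
import torchvision.models as models

def adapt_first_conv(model, in_channels):
    """Widen a ResNet-style stem to accept stacked multi-source channels."""
    old = model.conv1  # other architectures name their stem differently
    new = nn.Conv2d(in_channels, old.out_channels, old.kernel_size,
                    old.stride, old.padding, bias=old.bias is not None)
    with torch.no_grad():
        # Initialize extra channels by tiling the pretrained RGB weights.
        reps = -(-in_channels // old.in_channels)  # ceiling division
        new.weight.copy_(old.weight.repeat(1, reps, 1, 1)[:, :in_channels])
    model.conv1 = new
    return model

# e.g. 3 optical bands + 1 LiDAR elevation band = 4 stacked input channels
model = adapt_first_conv(models.resnet18(weights="IMAGENET1K_V1"), 4)
```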
- Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning
We propose an extensive benchmark of recent state-of-the-art (SOTA) 3D CNNs, also evaluating the benefits of data augmentation and deep ensemble learning.
Experiments were conducted on a large multi-site 3D brain anatomical MRI dataset comprising N=10k scans, on 3 challenging tasks: age prediction, sex classification, and schizophrenia diagnosis.
We found that all models provide significantly better predictions with VBM images than with quasi-raw data.
DenseNet and tiny-DenseNet, a lighter version that we propose, provide a good compromise in terms of performance in all data regimes.
arXiv Detail & Related papers (2021-06-02T13:00:35Z)
- Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters
Convolutional neural networks (CNNs) have been successfully used in a range of tasks.
However, CNNs are often viewed as black boxes and lack interpretability.
We propose a novel strategy to train interpretable CNNs by encouraging class-specific filters; a gating-style sketch follows this entry.
arXiv Detail & Related papers (2020-07-16T09:12:26Z)
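Illustrative only (the paper's training strategy differs in detail): one way to encourage class-specific filters is to assign each group of filters to a class and gate the feature maps by the sample's label during training.

```python
import torch
import torch.nn as nn

class ClassGatedConv(nn.Module):
    """Each class owns a group of filters; training gates the other groups."""
    def __init__(self, in_ch, filters_per_class, num_classes):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, filters_per_class * num_classes,
                              kernel_size=3, padding=1)
        self.fpc, self.nc = filters_per_class, num_classes

    def forward(self, x, y=None):
        out = self.conv(x)
        if y is not None:  # training: keep only the target class's group
            mask = torch.zeros(x.size(0), self.fpc * self.nc, 1, 1,
                               device=x.device)
            for i, cls in enumerate(y.tolist()):
                mask[i, cls * self.fpc:(cls + 1) * self.fpc] = 1.0
            out = out * mask
        return out
```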
- Diversity inducing Information Bottleneck in Model Ensembles
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in their predictions.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution; a simplified diversity-penalty sketch follows this entry.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
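The paper's adversarial, information-bottleneck-based loss is more involved; as a simplified stand-in, this sketch penalizes pairwise agreement between ensemble members' predictive distributions:

```python
import torch
import torch.nn.functional as F

def diversity_penalty(logits_list):
    """Mean pairwise agreement of ensemble predictions (lower = more diverse)."""
    probs = [F.softmax(l, dim=1) for l in logits_list]
    penalty, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            # Inner product of distributions is high when members agree.
            penalty = penalty + (probs[i] * probs[j]).sum(dim=1).mean()
            pairs += 1
    return penalty / max(pairs, 1)

# Usage idea: loss = mean member cross-entropy + lambda * diversity_penalty(...)
```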
- Filter Grafting for Deep Neural Networks
Filter grafting aims to improve the representation capability of Deep Neural Networks (DNNs).
We develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks; both ideas are sketched after this entry.
For example, the grafted MobileNetV2 outperforms the non-grafted MobileNetV2 by about 7 percent on the CIFAR-100 dataset.
arXiv Detail & Related papers (2020-01-15T03:18:57Z)
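A sketch of the two ingredients named above, under simplifying assumptions (histogram entropy estimate and a single convex blend; the paper's adaptive weighting is more elaborate):

```python
import torch

def weight_entropy(w, bins=32):
    """Histogram estimate of a filter tensor's weight entropy."""
    hist = torch.histc(w.detach().flatten(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * p.log()).sum()

def graft(layer_a, layer_b):
    """Blend layer_b's filters into layer_a, weighted by relative information."""
    ha, hb = weight_entropy(layer_a.weight), weight_entropy(layer_b.weight)
    alpha = ha / (ha + hb)  # the lower-entropy (less informative) side gets less say
    with torch.no_grad():
        layer_a.weight.mul_(alpha).add_((1 - alpha) * layer_b.weight)
```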
This list is automatically generated from the titles and abstracts of the papers on this site.