CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters
- URL: http://arxiv.org/abs/2203.15331v1
- Date: Tue, 29 Mar 2022 08:25:42 GMT
- Title: CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters
- Authors: Paul Gavrikov and Janis Keuper
- Abstract summary: We show that model pre-training can succeed on arbitrary datasets if they meet size and variance conditions.
We show that many pre-trained models contain degenerated filters that make them less robust and less suitable for fine-tuning on target applications.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Currently, many theoretical as well as practically relevant questions about the transferability and robustness of Convolutional Neural Networks (CNNs) remain unsolved. While ongoing research efforts address these problems from various angles, in most computer-vision-related cases the approaches can be generalized to investigations of the effects of distribution shifts in image data. In this context, we propose to study the shifts in the learned weights of trained CNN models. Here we focus on the properties of the distributions of the dominantly used 3x3 convolution filter kernels. We collected and publicly provide a dataset of over 1.4 billion filters from hundreds of trained CNNs, covering a wide range of datasets, architectures, and vision tasks. As a first use case of the proposed dataset, we show properties of many publicly available pre-trained models that are highly relevant for practical applications: I) We analyze distribution shifts (or the lack thereof) between trained filters along different axes of meta-parameters, such as the visual category of the dataset, the task, the architecture, or the layer depth. Based on these results, we conclude that model pre-training can succeed on arbitrary datasets if they meet size and variance conditions. II) We show that many pre-trained models contain degenerated filters that make them less robust and less suitable for fine-tuning on target applications.
Data & Project website: https://github.com/paulgavrikov/cnn-filter-db
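The dataset and analysis code live in the repository above; purely as an illustration of what such filter extraction looks like, here is a minimal sketch assuming PyTorch and torchvision. This is not the authors' pipeline, and the variance threshold for flagging degenerated filters is an illustrative assumption:

```python
import torch
import torchvision.models as models

# Sketch only: collect every 3x3 convolution kernel from one pretrained
# model, as the CNN Filter DB does at scale across hundreds of models.
model = models.resnet50(weights="IMAGENET1K_V1")

filters = []
for module in model.modules():
    if isinstance(module, torch.nn.Conv2d) and module.kernel_size == (3, 3):
        # weight shape: (out_channels, in_channels, 3, 3)
        filters.append(module.weight.detach().reshape(-1, 9))
filters = torch.cat(filters)  # one row per 3x3 kernel

# Illustrative check for degenerated filters: kernels whose coefficients
# barely vary add little beyond a scaled identity/blur. The paper's actual
# criterion may differ; the threshold here is an assumption.
variances = filters.var(dim=1)
degenerate_ratio = (variances < 1e-6).float().mean()
print(f"{filters.shape[0]} filters, {degenerate_ratio:.2%} near-constant")
```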
Related papers
- Data Filtering Networks
We study the problem of learning a data filtering network (DFN) for the second stage of a data curation pipeline: filtering a large uncurated dataset.
Our key finding is that the quality of a network for filtering is distinct from its performance on downstream tasks.
Based on our insights, we construct new data filtering networks that induce state-of-the-art image-text datasets; a minimal thresholding sketch follows this entry.
arXiv Detail & Related papers (2023-09-29T17:37:29Z)
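At inference time, a DFN is just a scorer over candidate samples. In this hedged sketch, `scoring_model` is a hypothetical stand-in for a trained filtering network and the threshold is arbitrary:

```python
import torch

def filter_dataset(pairs, scoring_model, threshold=0.3):
    """Keep image-text pairs a (hypothetical) filtering network scores highly."""
    kept = []
    for image, text in pairs:
        with torch.no_grad():
            score = scoring_model(image, text)  # scalar relevance score
        if score.item() > threshold:
            kept.append((image, text))
    return kept
```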
- A Continuous Convolutional Trainable Filter for Modelling Unstructured Data
We propose a continuous version of a trainable convolutional filter that can also operate on unstructured data.
Our experiments show that the continuous filter can achieve accuracy comparable to that of state-of-the-art discrete filters; a conceptual sketch follows this entry.
arXiv Detail & Related papers (2022-10-24T17:34:10Z)
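A conceptual sketch of the idea (the paper's parameterization differs in detail): represent the kernel as a small network over continuous offsets, so it can be sampled at grid points or at arbitrary positions in unstructured data.

```python
import torch
import torch.nn as nn

class ContinuousFilter(nn.Module):
    """A kernel parameterized by an MLP over 2D offsets (illustrative)."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def forward(self, offsets):  # offsets: (N, 2) sample positions
        return self.net(offsets).squeeze(-1)

f = ContinuousFilter()
# Sampling on a regular grid recovers an ordinary discrete 3x3 kernel ...
grid = torch.stack(torch.meshgrid(torch.linspace(-1, 1, 3),
                                  torch.linspace(-1, 1, 3),
                                  indexing="ij"), dim=-1).reshape(-1, 2)
kernel = f(grid).reshape(3, 3)
# ... while unstructured data can query the same filter at arbitrary offsets.
values = f(torch.rand(100, 2) * 2 - 1)
```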
- CHALLENGER: Training with Attribution Maps
We show that utilizing attribution maps for training neural networks can improve the regularization of models and thus increase performance.
In particular, we show that our generic, domain-independent approach yields state-of-the-art results on vision, natural language processing, and time-series tasks; a hedged training-step sketch follows this entry.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
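The summary does not specify the exact objective; as a hedged stand-in, the sketch below regularizes training with input-gradient attribution maps, one common way to bring attributions into the loss. It should not be read as the paper's method.

```python
import torch
import torch.nn.functional as F

def training_step(model, x, y, lam=0.01):
    """Cross-entropy plus an attribution-map penalty (illustrative only)."""
    x = x.requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)
    # Attribution map: gradient of the target-class score w.r.t. the input.
    target_score = logits.gather(1, y.unsqueeze(1)).sum()
    attribution = torch.autograd.grad(target_score, x, create_graph=True)[0]
    # Penalizing attribution magnitude regularizes the model's sensitivity.
    return ce + lam * attribution.abs().mean()
```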
- Focal Sparse Convolutional Networks for 3D Object Detection
We introduce two new modules to enhance the capability of Sparse CNNs.
They are focal sparse convolution (Focals Conv) and its multi-modal variant, focal sparse convolution with fusion.
For the first time, we show that spatially learnable sparsity in sparse convolution is essential for sophisticated 3D object detection.
arXiv Detail & Related papers (2022-04-26T17:34:10Z)
- An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters
We present the first empirical results from our ongoing investigation of distribution shifts in image data used for various computer vision tasks.
Instead of analyzing the original training and test data, we propose to study shifts in the learned weights of trained models; a weight-space comparison sketch follows this entry.
arXiv Detail & Related papers (2022-01-20T21:48:12Z)
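A minimal weight-space comparison in the spirit of this line of work, assuming torchvision models and using the 1D Wasserstein distance over flattened 3x3 coefficients (one simple choice; the paper's metrics may differ):

```python
import numpy as np
import torch
import torchvision.models as models
from scipy.stats import wasserstein_distance

def flat_filters(model):
    """All 3x3 convolution coefficients of a model as one flat array."""
    return np.concatenate([
        m.weight.detach().reshape(-1).numpy()
        for m in model.modules()
        if isinstance(m, torch.nn.Conv2d) and m.kernel_size == (3, 3)
    ])

a = flat_filters(models.resnet18(weights="IMAGENET1K_V1"))
b = flat_filters(models.vgg16(weights="IMAGENET1K_V1"))
print("model-to-model shift:", wasserstein_distance(a, b))
```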
- Single-stream CNN with Learnable Architecture for Multi-source Remote Sensing Data
We propose an efficient framework based on deep convolutional neural networks (CNNs) for the joint classification of multi-source remote sensing data.
The proposed method can, in principle, adapt any modern CNN model to any multi-source remote sensing dataset.
Experimental results demonstrate the effectiveness of the proposed single-stream CNNs; one simple channel-stacking variant is sketched after this entry.
arXiv Detail & Related papers (2021-09-13T16:10:41Z)
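The paper's learnable-architecture mechanism is not described in the summary; as one hedged illustration of single-stream multi-source input, the sketch below stacks all sources channel-wise and widens a pretrained stem convolution accordingly:

```python
import torch
import torch.nn as nn
import torchvision.models as models

def adapt_first_conv(model, in_channels):
    """Widen a ResNet-style stem to accept stacked multi-source channels."""
    old = model.conv1  # other architectures name their stem differently
    new = nn.Conv2d(in_channels, old.out_channels, old.kernel_size,
                    old.stride, old.padding, bias=old.bias is not None)
    with torch.no_grad():
        # Initialize extra channels by tiling the pretrained RGB weights.
        reps = -(-in_channels // old.in_channels)  # ceiling division
        new.weight.copy_(old.weight.repeat(1, reps, 1, 1)[:, :in_channels])
    model.conv1 = new
    return model

# e.g. 3 optical bands + 1 LiDAR elevation band = 4 stacked input channels
model = adapt_first_conv(models.resnet18(weights="IMAGENET1K_V1"), 4)
```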
- Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning
We propose an extensive benchmark of recent state-of-the-art (SOTA) 3D CNNs, also evaluating the benefits of data augmentation and deep ensemble learning.
Experiments were conducted on a large multi-site 3D brain anatomical MRI dataset comprising N=10k scans, on 3 challenging tasks: age prediction, sex classification, and schizophrenia diagnosis.
We found that all models provide significantly better predictions with VBM images than with quasi-raw data.
DenseNet and tiny-DenseNet, a lighter version that we propose, provide a good compromise in terms of performance in all data regimes.
arXiv Detail & Related papers (2021-06-02T13:00:35Z)
- Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters
Convolutional neural networks (CNNs) have been successfully used in a range of tasks.
However, CNNs are often viewed as black boxes and lack interpretability.
We propose a novel strategy to train interpretable CNNs by encouraging class-specific filters; a gating-style sketch follows this entry.
arXiv Detail & Related papers (2020-07-16T09:12:26Z)
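Illustrative only (the paper's training strategy differs in detail): one way to encourage class-specific filters is to assign each group of filters to a class and gate the feature maps by the sample's label during training.

```python
import torch
import torch.nn as nn

class ClassGatedConv(nn.Module):
    """Each class owns a group of filters; training gates the other groups."""
    def __init__(self, in_ch, filters_per_class, num_classes):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, filters_per_class * num_classes,
                              kernel_size=3, padding=1)
        self.fpc, self.nc = filters_per_class, num_classes

    def forward(self, x, y=None):
        out = self.conv(x)
        if y is not None:  # training: keep only the target class's group
            mask = torch.zeros(x.size(0), self.fpc * self.nc, 1, 1,
                               device=x.device)
            for i, cls in enumerate(y.tolist()):
                mask[i, cls * self.fpc:(cls + 1) * self.fpc] = 1.0
            out = out * mask
        return out
```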
- Diversity inducing Information Bottleneck in Model Ensembles
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in their predictions.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution; a simplified diversity-penalty sketch follows this entry.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
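The paper's adversarial, information-bottleneck-based loss is more involved; as a simplified stand-in, this sketch penalizes pairwise agreement between ensemble members' predictive distributions:

```python
import torch
import torch.nn.functional as F

def diversity_penalty(logits_list):
    """Mean pairwise agreement of ensemble predictions (lower = more diverse)."""
    probs = [F.softmax(l, dim=1) for l in logits_list]
    penalty, pairs = 0.0, 0
    for i in range(len(probs)):
        for j in range(i + 1, len(probs)):
            # Inner product of distributions is high when members agree.
            penalty = penalty + (probs[i] * probs[j]).sum(dim=1).mean()
            pairs += 1
    return penalty / max(pairs, 1)

# Usage idea: loss = mean member cross-entropy + lambda * diversity_penalty(...)
```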
- Filter Grafting for Deep Neural Networks
Filter grafting aims to improve the representation capability of Deep Neural Networks (DNNs).
We develop an entropy-based criterion to measure the information of filters and an adaptive weighting strategy for balancing the grafted information among networks; both ideas are sketched after this entry.
For example, the grafted MobileNetV2 outperforms the non-grafted MobileNetV2 by about 7 percent on the CIFAR-100 dataset.
arXiv Detail & Related papers (2020-01-15T03:18:57Z)
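A sketch of the two ingredients named above, under simplifying assumptions (histogram entropy estimate and a single convex blend; the paper's adaptive weighting is more elaborate):

```python
import torch

def weight_entropy(w, bins=32):
    """Histogram estimate of a filter tensor's weight entropy."""
    hist = torch.histc(w.detach().flatten(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * p.log()).sum()

def graft(layer_a, layer_b):
    """Blend layer_b's filters into layer_a, weighted by relative information."""
    ha, hb = weight_entropy(layer_a.weight), weight_entropy(layer_b.weight)
    alpha = ha / (ha + hb)  # the lower-entropy (less informative) side gets less say
    with torch.no_grad():
        layer_a.weight.mul_(alpha).add_((1 - alpha) * layer_b.weight)
```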
This list is automatically generated from the titles and abstracts of the papers on this site.