An Empirical Investigation of Model-to-Model Distribution Shifts in
Trained Convolutional Filters
- URL: http://arxiv.org/abs/2201.08465v1
- Date: Thu, 20 Jan 2022 21:48:12 GMT
- Title: An Empirical Investigation of Model-to-Model Distribution Shifts in
Trained Convolutional Filters
- Authors: Paul Gavrikov, Janis Keuper
- Abstract summary: We present the first empirical results from our ongoing investigation of distribution shifts in image data used for various computer vision tasks.
Instead of analyzing the original training and test data, we propose to study shifts in the learned weights of trained models.
- Score: 2.0305676256390934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present the first empirical results from our ongoing investigation of
distribution shifts in image data used for various computer vision tasks.
Instead of analyzing the original training and test data, we propose to study
shifts in the learned weights of trained models. In this work, we focus on the
properties of the distributions of the predominantly used 3x3 convolution filter
kernels. We collected and publicly provide a data set with over half a billion
filters from hundreds of trained CNNs, using a wide range of data sets,
architectures, and vision tasks. Our analysis shows interesting distribution
shifts (or the lack thereof) between trained filters along different axes of
meta-parameters, such as data type, task, architecture, or layer depth. We argue
that the observed properties are a valuable source for further investigation
into the impact of shifts in the input data on the generalization abilities of
CNN models, and into novel methods for more robust transfer learning in this
domain. Data available at:
https://github.com/paulgavrikov/CNN-Filter-DB/.
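The core idea, comparing populations of trained 3x3 kernels along some meta-parameter axis rather than comparing the input data itself, can be illustrated with a minimal, self-contained sketch. The function names, the toy filter populations, and the per-position statistics below are hypothetical stand-ins for the actual CNN-Filter-DB pipeline; in practice the kernels would be read from the published dataset linked above.

```python
import statistics

def coefficient_streams(filters):
    """Split a population of 3x3 kernels (nested lists) into 9 per-position
    coefficient streams, so distributions can be compared position-wise."""
    streams = [[] for _ in range(9)]
    for kernel in filters:
        flat = [v for row in kernel for v in row]
        for i, v in enumerate(flat):
            streams[i].append(v)
    return streams

def coefficient_stats(filters):
    """Per-position mean and population std. dev. across the filter set."""
    return [(statistics.mean(s), statistics.pstdev(s))
            for s in coefficient_streams(filters)]

# Two toy kernels standing in for filters from one model population,
# e.g. all models trained on a given data type or task.
population_a = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
stats_a = coefficient_stats(population_a)
# A shift between two populations would show up as a difference in these
# per-position statistics (or in richer divergence measures between them).
print(stats_a[0])  # mean and std. dev. of the top-left coefficient
```

The per-position summary is only the simplest possible comparison; the paper's analysis operates on the full filter distributions, but the data layout is the same.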
Related papers
- The Master Key Filters Hypothesis: Deep Filters Are General [51.900488744931785]
Convolutional neural network (CNN) filters become increasingly specialized in deeper layers.
Recent observations of clusterable repeating patterns in depthwise separable CNNs (DS-CNNs) trained on ImageNet motivated this paper.
Our analysis of DS-CNNs reveals that deep filters maintain generality, contradicting the expected transition to class-specific filters.
arXiv Detail & Related papers (2024-12-21T20:04:23Z)
- Explaining Model Overfitting in CNNs via GMM Clustering [11.9346565927116]
Convolutional Neural Networks (CNNs) have demonstrated remarkable prowess in the field of computer vision.
However, their opaque decision-making processes pose significant challenges for practical applications.
We provide quantitative metrics for assessing CNN filters by clustering the feature maps corresponding to individual filters in the model.
arXiv Detail & Related papers (2024-12-12T08:13:18Z)
- Data Filtering Networks [67.827994353269]
We study the problem of learning a data filtering network (DFN) for this second step of filtering a large uncurated dataset.
Our key finding is that the quality of a network for filtering is distinct from its performance on downstream tasks.
Based on our insights, we construct new data filtering networks that induce state-of-the-art image-text datasets.
arXiv Detail & Related papers (2023-09-29T17:37:29Z)
- The Devil is in the Details: A Deep Dive into the Rabbit Hole of Data Filtering [23.68112988933411]
This paper describes our learning and solution when participating in the DataComp challenge.
Our filtering strategy includes three stages: single-modality filtering, cross-modality filtering, and data distribution alignment.
Our approach outperforms the best method from the DataComp paper by over 4% on the average performance of 38 tasks and by over 2% on ImageNet.
arXiv Detail & Related papers (2023-09-27T19:10:43Z)
- CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z)
- An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation [91.62129090006745]
This paper studies the distribution shift problem from the perspective of pre-training and data augmentation.
We provide the first comprehensive empirical study focusing on pre-training and data augmentation.
arXiv Detail & Related papers (2022-05-25T13:04:53Z)
- CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters [2.0305676256390934]
We show that model pre-training can succeed on arbitrary datasets if they meet size and variance conditions.
We show that many pre-trained models contain degenerated filters which make them less robust and less suitable for fine-tuning on target applications.
arXiv Detail & Related papers (2022-03-29T08:25:42Z)
- Benchmarking CNN on 3D Anatomical Brain MRI: Architectures, Data Augmentation and Deep Ensemble Learning [2.1446056201053185]
We propose an extensive benchmark of recent state-of-the-art (SOTA) 3D CNNs, also evaluating the benefits of data augmentation and deep ensemble learning.
Experiments were conducted on a large multi-site 3D brain anatomical MRI data-set comprising N=10k scans on 3 challenging tasks: age prediction, sex classification, and schizophrenia diagnosis.
We found that all models provide significantly better predictions with VBM images than quasi-raw data.
DenseNet and tiny-DenseNet, a lighter version that we proposed, provide a good compromise in terms of performance across all data regimes.
arXiv Detail & Related papers (2021-06-02T13:00:35Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
- Curriculum By Smoothing [52.08553521577014]
Convolutional Neural Networks (CNNs) have shown impressive performance in computer vision tasks such as image classification, detection, and segmentation.
We propose an elegant curriculum-based scheme that smooths the feature embeddings of a CNN using anti-aliasing or low-pass filters.
As the amount of information in the feature maps increases during training, the network is able to progressively learn better representations of the data.
arXiv Detail & Related papers (2020-03-03T07:27:44Z)
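The smoothing curriculum summarized above can be sketched with a plain Gaussian low-pass kernel whose standard deviation is annealed toward zero over training, so that feature maps are heavily blurred early on and pass through nearly unchanged later. The function names and the linear annealing schedule below are illustrative assumptions, not the paper's actual implementation.

```python
import math

def gaussian_kernel(sigma, radius=2):
    """Normalized 2D Gaussian low-pass kernel of size (2*radius+1)^2."""
    xs = range(-radius, radius + 1)
    k = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) for x in xs]
         for y in xs]
    total = sum(sum(row) for row in k)
    return [[v / total for v in row] for row in k]

def sigma_schedule(epoch, total_epochs, sigma_start=1.0, sigma_end=0.01):
    """Linearly anneal sigma so the smoothing weakens as training progresses."""
    t = epoch / max(total_epochs - 1, 1)
    return sigma_start + t * (sigma_end - sigma_start)

# Early in training the kernel spreads weight widely (strong blur); near the
# end it concentrates on its center, approximating an identity filter.
early = gaussian_kernel(sigma_schedule(0, 10))
late = gaussian_kernel(sigma_schedule(9, 10))
print(early[2][2], late[2][2])  # center weight grows as sigma shrinks
```

In a training loop, each convolutional block's output would be convolved with the current kernel before being passed on, which is the sense in which the curriculum gradually admits more information into the feature maps.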
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.