Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and
Treating CNN Classifiers
- URL: http://arxiv.org/abs/2112.04934v1
- Date: Thu, 9 Dec 2021 14:05:00 GMT
- Title: Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and
Treating CNN Classifiers
- Authors: Zunlei Feng, Jiacong Hu, Sai Wu, Xiaotian Yu, Jie Song, Mingli Song
- Abstract summary: Convolutional Neural Networks (CNNs) have achieved excellent performance on classification tasks.
CNNs are widely regarded as 'black boxes' whose prediction mechanism is hard to understand.
We propose the first completely automatic model diagnosing and treating tool, termed Model Doctor.
- Score: 33.82339346293966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional Neural Networks (CNNs) have recently achieved excellent
performance on classification tasks. However, CNNs are widely regarded as
'black boxes': their prediction mechanism is hard to understand, and wrong
predictions are hard to debug. Several model debugging and explanation methods
have been developed to address these drawbacks, but they focus on explaining
predictions and diagnosing their possible causes, leaving the subsequent model
optimization to be handled manually by researchers. In this paper, we propose
the first completely automatic model diagnosing and treating tool, termed Model
Doctor. Based on two discoveries, namely that 1) each category is correlated
with only a sparse, specific set of convolution kernels, and 2) adversarial
samples are isolated in the feature space while normal samples lie in
contiguous regions, a simple aggregate gradient constraint is devised for
effectively diagnosing and optimizing CNN classifiers. The aggregate gradient
strategy is a versatile module for mainstream CNN classifiers. Extensive
experiments demonstrate that the proposed Model Doctor applies to all existing
CNN classifiers, and improves the accuracy of $16$ mainstream CNN classifiers
by 1%-5%.
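The aggregate gradient idea can be sketched in a toy setting. The sketch below is a minimal illustration, not the authors' released implementation: it assumes a linear head on globally average-pooled features (so per-activation gradients of a class logit are available in closed form), and all function names, shapes, and the choice of "relevant" channels are ours.

```python
import numpy as np

def channel_gradients(w_k, feat_shape):
    """Gradient of the class-k logit w.r.t. each feature-map activation.

    With a linear head on globally average-pooled features, the gradient
    w.r.t. activation F[c, h, w] is w_k[c] / (H * W), independent of position.
    """
    C, H, W = feat_shape
    return np.repeat(w_k[:, None], H * W, axis=1).reshape(C, H, W) / (H * W)

def aggregate_gradient(grad):
    """Sum absolute gradients over spatial positions: one score per channel."""
    return np.abs(grad).sum(axis=(1, 2))

def irrelevant_gradient_penalty(agg, relevant):
    """Sketch of a Model-Doctor-style constraint: penalize gradient mass
    falling on channels NOT correlated with this category."""
    mask = np.ones_like(agg, dtype=bool)
    mask[relevant] = False
    return float(agg[mask].sum())

# Toy example: 6 channels; category k is (by assumption) tied to channels 0 and 3.
w_k = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.1])
g = channel_gradients(w_k, (6, 4, 4))
agg = aggregate_gradient(g)        # per-channel aggregated gradient magnitude
penalty = irrelevant_gradient_penalty(agg, relevant=[0, 3])
print(np.round(agg, 2))   # channels 0 and 3 dominate, matching discovery 1)
print(round(penalty, 2))
```

In the paper's setting the penalty would be added to the training loss so that gradients concentrate on the sparse, category-specific kernels; here the sparsity is simply visible in the per-channel scores.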
Related papers
- Reusing Convolutional Neural Network Models through Modularization and
Composition [22.823870645316397]
We propose two modularization approaches named CNNSplitter and GradSplitter.
CNNSplitter decomposes a trained convolutional neural network (CNN) model into $N$ small reusable modules.
The resulting modules can be reused to patch existing CNN models or build new CNN models through composition.
arXiv Detail & Related papers (2023-11-08T03:18:49Z)
- Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data [24.86314525762012]
We show that a ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy.
Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
arXiv Detail & Related papers (2023-10-03T11:31:37Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Convolutional Neural Network-Based Automatic Classification of Colorectal and Prostate Tumor Biopsies Using Multispectral Imagery: System Development Study [7.566742780233967]
We propose a CNN model for classifying colorectal and prostate tumors from multispectral images of biopsy samples.
Our results showed excellent performance, with an average test accuracy of 99.8% and 99.5% for the prostate and colorectal data sets, respectively.
The proposed CNN architecture was globally the best-performing system for classifying colorectal and prostate tumor images.
arXiv Detail & Related papers (2023-01-30T18:28:25Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Lost Vibration Test Data Recovery Using Convolutional Neural Network: A Case Study [0.0]
This paper proposes a CNN algorithm, using the Alamosa Canyon Bridge as a real-structure case study.
Three different CNN models were considered to predict the readings of one or two malfunctioning sensors.
The accuracy of the model was increased by adding a convolutional layer.
arXiv Detail & Related papers (2022-04-11T23:24:03Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and fixed during training.
Our experimental results show that our method achieves comparable performance on image classification for balanced datasets.
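The fixed-ETF classifier mentioned above can be sketched as follows. This is a hedged illustration: the function name is ours, and the construction follows the standard simplex-ETF formula (unit-norm class vectors with pairwise cosine similarity $-1/(K-1)$) rather than this paper's exact code.

```python
import numpy as np

def simplex_etf(num_classes, dim, seed=0):
    """Construct a dim x K simplex equiangular tight frame (ETF).

    Columns have unit norm and pairwise cosine similarity -1/(K-1);
    such a matrix can serve as a fixed (non-learnable) classifier head.
    Requires dim >= num_classes.
    """
    K = num_classes
    rng = np.random.default_rng(seed)
    # Random dim x K matrix with orthonormal columns via QR.
    U, _ = np.linalg.qr(rng.standard_normal((dim, K)))
    # Standard simplex-ETF formula: sqrt(K/(K-1)) * U * (I - 11^T / K).
    return np.sqrt(K / (K - 1)) * (U @ (np.eye(K) - np.ones((K, K)) / K))

W = simplex_etf(num_classes=5, dim=16)
cos = W.T @ W  # Gram matrix: 1 on the diagonal, -1/(K-1) off-diagonal
print(np.round(cos, 3))
```

During training, `W` would simply be excluded from the optimizer's parameters, so only the feature extractor learns to align features with these fixed, maximally separated class directions.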
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Convolutional neural networks for classification and regression analysis of one-dimensional spectral data [0.0]
Convolutional neural networks (CNNs) are widely used for image recognition and text analysis.
The performance of a CNN was investigated for classification and regression analysis of spectral data.
arXiv Detail & Related papers (2020-05-15T13:20:05Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.