Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and
Treating CNN Classifiers
- URL: http://arxiv.org/abs/2112.04934v1
- Date: Thu, 9 Dec 2021 14:05:00 GMT
- Title: Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and
Treating CNN Classifiers
- Authors: Zunlei Feng, Jiacong Hu, Sai Wu, Xiaotian Yu, Jie Song, Mingli Song
- Abstract summary: Convolutional Neural Networks (CNNs) have achieved excellent performance on classification tasks.
CNNs are widely regarded as 'black boxes' whose prediction mechanism is hard to understand.
We propose the first completely automatic model diagnosing and treating tool, termed Model Doctor.
- Score: 33.82339346293966
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional Neural Networks (CNNs) have recently achieved excellent
performance on classification tasks. However, CNNs are widely regarded as
'black boxes': their prediction mechanism is hard to understand, and wrong
predictions are hard to debug. Several model debugging and explanation methods
have been developed to address these drawbacks, but they focus on explaining
predictions and diagnosing their possible causes, leaving the subsequent model
optimization to be handled manually by researchers. In this paper, we propose
the first completely automatic model diagnosing and treating tool, termed Model
Doctor. Based on two discoveries, namely that 1) each category is correlated
with only a sparse, specific set of convolution kernels, and 2) adversarial
samples are isolated in the feature space while normal samples lie in
contiguous regions, a simple aggregate gradient constraint is devised for
effectively diagnosing and optimizing CNN classifiers. The aggregate gradient
strategy is a versatile module for mainstream CNN classifiers. Extensive
experiments demonstrate that the proposed Model Doctor applies to all existing
CNN classifiers, and improves the accuracy of $16$ mainstream CNN classifiers
by 1%-5%.
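The aggregate gradient idea can be sketched in a toy setting. The sketch below is a minimal illustration, not the authors' released implementation: it assumes a linear head on globally average-pooled features (so per-activation gradients of a class logit are available in closed form), and all function names, shapes, and the choice of "relevant" channels are ours.

```python
import numpy as np

def channel_gradients(w_k, feat_shape):
    """Gradient of the class-k logit w.r.t. each feature-map activation.

    With a linear head on globally average-pooled features, the gradient
    w.r.t. activation F[c, h, w] is w_k[c] / (H * W), independent of position.
    """
    C, H, W = feat_shape
    return np.repeat(w_k[:, None], H * W, axis=1).reshape(C, H, W) / (H * W)

def aggregate_gradient(grad):
    """Sum absolute gradients over spatial positions: one score per channel."""
    return np.abs(grad).sum(axis=(1, 2))

def irrelevant_gradient_penalty(agg, relevant):
    """Sketch of a Model-Doctor-style constraint: penalize gradient mass
    falling on channels NOT correlated with this category."""
    mask = np.ones_like(agg, dtype=bool)
    mask[relevant] = False
    return float(agg[mask].sum())

# Toy example: 6 channels; category k is (by assumption) tied to channels 0 and 3.
w_k = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.1])
g = channel_gradients(w_k, (6, 4, 4))
agg = aggregate_gradient(g)        # per-channel aggregated gradient magnitude
penalty = irrelevant_gradient_penalty(agg, relevant=[0, 3])
print(np.round(agg, 2))   # channels 0 and 3 dominate, matching discovery 1)
print(round(penalty, 2))
```

In the paper's setting the penalty would be added to the training loss so that gradients concentrate on the sparse, category-specific kernels; here the sparsity is simply visible in the per-channel scores.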
Related papers
- Reusing Convolutional Neural Network Models through Modularization and
Composition [22.823870645316397]
We propose two modularization approaches named CNNSplitter and GradSplitter.
CNNSplitter decomposes a trained convolutional neural network (CNN) model into $N$ small reusable modules.
The resulting modules can be reused to patch existing CNN models or build new CNN models through composition.
arXiv Detail & Related papers (2023-11-08T03:18:49Z)
- Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data [24.86314525762012]
We show that a ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy.
Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
arXiv Detail & Related papers (2023-10-03T11:31:37Z)
- Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z)
- Convolutional Neural Network-Based Automatic Classification of Colorectal and Prostate Tumor Biopsies Using Multispectral Imagery: System Development Study [7.566742780233967]
We propose a CNN model for classifying colorectal and prostate tumors from multispectral images of biopsy samples.
Our results showed excellent performance, with an average test accuracy of 99.8% and 99.5% for the prostate and colorectal data sets, respectively.
The proposed CNN architecture was globally the best-performing system for classifying colorectal and prostate tumor images.
arXiv Detail & Related papers (2023-01-30T18:28:25Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero and few-shot adaptation in low data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Lost Vibration Test Data Recovery Using Convolutional Neural Network: A Case Study [0.0]
This paper proposes a CNN algorithm, using the Alamosa Canyon Bridge as a real-structure case study.
Three different CNN models were considered to predict the readings of one or two malfunctioning sensors.
The accuracy of the model was increased by adding a convolutional layer.
arXiv Detail & Related papers (2022-04-11T23:24:03Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and fixed during training.
Our experimental results show that our method achieves comparable performance on image classification for balanced datasets.
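The fixed-ETF classifier mentioned above can be sketched as follows. This is a hedged illustration: the function name is ours, and the construction follows the standard simplex-ETF formula (unit-norm class vectors with pairwise cosine similarity $-1/(K-1)$) rather than this paper's exact code.

```python
import numpy as np

def simplex_etf(num_classes, dim, seed=0):
    """Construct a dim x K simplex equiangular tight frame (ETF).

    Columns have unit norm and pairwise cosine similarity -1/(K-1);
    such a matrix can serve as a fixed (non-learnable) classifier head.
    Requires dim >= num_classes.
    """
    K = num_classes
    rng = np.random.default_rng(seed)
    # Random dim x K matrix with orthonormal columns via QR.
    U, _ = np.linalg.qr(rng.standard_normal((dim, K)))
    # Standard simplex-ETF formula: sqrt(K/(K-1)) * U * (I - 11^T / K).
    return np.sqrt(K / (K - 1)) * (U @ (np.eye(K) - np.ones((K, K)) / K))

W = simplex_etf(num_classes=5, dim=16)
cos = W.T @ W  # Gram matrix: 1 on the diagonal, -1/(K-1) off-diagonal
print(np.round(cos, 3))
```

During training, `W` would simply be excluded from the optimizer's parameters, so only the feature extractor learns to align features with these fixed, maximally separated class directions.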
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computation.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
- Convolutional neural networks for classification and regression analysis of one-dimensional spectral data [0.0]
Convolutional neural networks (CNNs) are widely used for image recognition and text analysis.
The performance of a CNN was investigated for classification and regression analysis of spectral data.
arXiv Detail & Related papers (2020-05-15T13:20:05Z)
- Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.