CNN LEGO: Disassembling and Assembling Convolutional Neural Network
- URL: http://arxiv.org/abs/2203.13453v1
- Date: Fri, 25 Mar 2022 05:27:28 GMT
- Title: CNN LEGO: Disassembling and Assembling Convolutional Neural Network
- Authors: Jiacong Hu (1), Jing Gao (1), Zunlei Feng (1), Lechao Cheng (2), Jie
Lei (3), Hujun Bao (1), Mingli Song (1) ((1) Zhejiang University, (2)
Zhejiang Lab, (3) Zhejiang University Of Technology)
- Abstract summary: Convolutional Neural Networks (CNNs), which mimic the human visual perception mechanism, have been successfully used in many computer vision areas.
Inspired by this visual perception mechanism, we investigate a new task, termed Model Disassembling and Assembling (MDA-Task).
MDA-Task disassembles deep models into independent parts and assembles those parts into a new deep model without performance cost, like playing with LEGO toys.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional Neural Network (CNN), which mimics the human visual
perception mechanism, has been successfully used in many computer vision areas. Some
psychophysical studies show that the visual perception mechanism synchronously
processes the form, color, movement, depth, etc., in the initial stage [7,20]
and then integrates all the information for final recognition [38]. Moreover,
the human visual system [20] contains different subdivisions for different
tasks. Inspired by the above visual perception mechanism, we investigate a new
task, termed Model Disassembling and Assembling (MDA-Task), which
disassembles deep models into independent parts and assembles those parts
into a new deep model without performance cost, like playing with LEGO toys. To this
end, we propose a feature route attribution technique (FRAT) for disassembling
CNN classifiers in this paper. In FRAT, the positive derivatives of predicted
class probability w.r.t. the feature maps are adopted to locate the critical
features in each layer. Then, relevance analysis between the critical features
and preceding/subsequent parameter layers is adopted to bridge the route
between two adjacent parameter layers. In the assembling phase, class-wise
components of each layer are assembled into a new deep model for a specific
task. Extensive experiments demonstrate that the assembled CNN classifier can
achieve accuracy close to that of the original classifier without any fine-tuning,
and exceed the original performance with one epoch of fine-tuning. Furthermore, we
conduct extensive experiments to verify the broad applicability of MDA-Task to
model decision-route visualization, model compression, knowledge distillation,
transfer learning, incremental learning, and more.
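The core FRAT step described above, keeping only the positive derivatives of the predicted-class probability w.r.t. each feature map to locate critical features, can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the `critical_channels` helper, its per-channel scoring rule, and the top-k cut-off are assumptions for the sake of the example.

```python
import numpy as np

def critical_channels(feature_maps, gradients, top_k=2):
    """Rank channels by the sum of positive derivatives of the
    predicted-class probability w.r.t. each feature map, and keep
    the top-k as the 'critical' features for that layer.

    feature_maps, gradients: arrays of shape (C, H, W).
    (Hypothetical helper illustrating the FRAT idea; the paper's
    exact scoring and routing rules may differ.)
    """
    pos_grad = np.maximum(gradients, 0.0)  # keep positive derivatives only
    scores = pos_grad.reshape(pos_grad.shape[0], -1).sum(axis=1)
    return np.argsort(scores)[::-1][:top_k]

# Toy example: 4 channels with 2x2 maps; channels 1 and 3 have
# positive influence on the class score, the rest have none.
rng = np.random.default_rng(0)
fmaps = rng.normal(size=(4, 2, 2))
grads = np.zeros((4, 2, 2))
grads[1] = 1.0
grads[3] = 0.5
print(critical_channels(fmaps, grads, top_k=2))  # → [1 3]
```

In the full MDA-Task pipeline these per-layer critical channels would then be linked across adjacent parameter layers via relevance analysis to form a disassemblable route.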
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable Neural Networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- Neural Clustering based Visual Representation Learning [61.72646814537163]
Clustering is one of the most classic approaches in machine learning and data analysis.
We propose feature extraction with clustering (FEC), which views feature extraction as a process of selecting representatives from data.
FEC alternates between grouping pixels into individual clusters to abstract representatives and updating the deep features of pixels with current representatives.
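The alternation that this summary describes for FEC resembles a k-means-style loop over pixel features. The sketch below is illustrative only, assuming Euclidean assignment and mean representatives; the `fec_step` helper is hypothetical and the paper's actual clustering rule may differ.

```python
import numpy as np

def fec_step(pixel_feats, centers):
    """One alternation of the FEC idea (illustrative sketch, not the
    paper's implementation): group pixels by their nearest
    representative, then update each pixel's feature with its
    cluster representative.

    pixel_feats: (N, D) pixel features; centers: (K, D) representatives.
    """
    # squared distance from every pixel to every representative
    d2 = ((pixel_feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)  # grouping step
    new_centers = np.stack([pixel_feats[assign == k].mean(axis=0)
                            if (assign == k).any() else centers[k]
                            for k in range(len(centers))])
    updated = new_centers[assign]  # update pixel features with representatives
    return updated, new_centers, assign

# Toy 1-D features: two well-separated groups of pixels.
feats = np.array([[0.0], [0.1], [5.0], [5.1]])
centers = np.array([[0.0], [5.0]])
updated, centers, assign = fec_step(feats, centers)
print(assign)  # → [0 0 1 1]
```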
arXiv Detail & Related papers (2024-03-26T06:04:50Z)
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
Transferring the pretrained models to downstream tasks may encounter task discrepancy, since pretraining is formulated as an image classification or object discrimination task.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Leveraging Systematic Knowledge of 2D Transformations [6.668181653599057]
Humans have a remarkable ability to interpret images, even if the scenes in the images are rare.
This work focuses on 1) the acquisition of systematic knowledge of 2D transformations, and 2) architectural components that can leverage the learned knowledge in image classification tasks.
arXiv Detail & Related papers (2022-06-02T06:46:12Z)
- Do Vision Transformers See Like Convolutional Neural Networks? [45.69780772718875]
Recent work has shown that (Vision) Transformer models (ViT) can achieve comparable or even superior performance on image classification tasks.
Are they acting like convolutional networks, or learning entirely different visual representations?
We find striking differences between the two architectures, such as ViT having more uniform representations across all layers.
arXiv Detail & Related papers (2021-08-19T17:27:03Z)
- Deep Features for training Support Vector Machine [16.795405355504077]
This paper develops a generic computer vision system based on features extracted from trained CNNs.
Multiple learned features are combined into a single structure to work on different image classification tasks.
arXiv Detail & Related papers (2021-04-08T03:13:09Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
- Eigen-CAM: Class Activation Map using Principal Components [1.2691047660244335]
This paper builds on previous ideas to cope with the increasing demand for interpretable, robust, and transparent models.
The proposed Eigen-CAM computes and visualizes the principal components of the learned features/representations from the convolutional layers.
arXiv Detail & Related papers (2020-08-01T17:14:13Z)
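The Eigen-CAM projection described above can be illustrated with a small NumPy sketch: flatten the conv feature maps into a (positions x channels) matrix, take its first singular vector, and project the activations onto it. This is a minimal assumption-laden sketch, not the authors' code; the `eigen_cam` helper and the sign-flip heuristic are illustrative choices.

```python
import numpy as np

def eigen_cam(feature_maps):
    """Project conv feature maps onto their first principal component
    (Eigen-CAM-style sketch). feature_maps: (C, H, W) array from a
    conv layer; returns an (H, W) activation map."""
    C, H, W = feature_maps.shape
    A = feature_maps.reshape(C, H * W).T  # (H*W, C): positions x channels
    # first right-singular vector spans the dominant channel pattern
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    cam = (A @ vt[0]).reshape(H, W)
    # the sign of a singular vector is arbitrary; flip so the map is positive
    if cam.sum() < 0:
        cam = -cam
    return cam

fmaps = np.ones((3, 2, 2))  # trivial feature maps for shape-checking
print(eigen_cam(fmaps).shape)  # → (2, 2)
```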
This list is automatically generated from the titles and abstracts of the papers in this site.