Alpha-Net: Architecture, Models, and Applications
- URL: http://arxiv.org/abs/2007.07221v1
- Date: Sat, 27 Jun 2020 05:05:01 GMT
- Title: Alpha-Net: Architecture, Models, and Applications
- Authors: Jishan Shaikh, Adya Sharma, Ankit Chouhan, Avinash Mahawar
- Abstract summary: We present a novel network architecture for custom training and weight evaluations.
We implement Alpha-Net with four different layer configurations to characterize the architecture's behavior.
Alpha-Net v3 improves accuracy by approximately 3% over ResNet-50 on the ImageNet benchmark.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning network training is usually computationally expensive and
intuitively complex. We present a novel network architecture for custom training and
weight evaluation. We reformulate the layers as ResNet-like blocks (called Alpha blocks),
each with its own inputs and outputs; the connection configuration of these blocks forms a
network of its own, which, combined with our novel loss function and normalization
function, constitutes the complete Alpha-Net architecture. We provide an empirical
mathematical formulation of the network loss function to support accuracy estimation and
further optimization. We implement Alpha-Net with four different layer configurations to
characterize the architecture's behavior comprehensively. On a custom dataset based on the
ImageNet benchmark, we evaluate Alpha-Net v1, v2, v3, and v4 for image recognition,
obtaining accuracies of 78.2%, 79.1%, 79.5%, and 78.3%, respectively. Alpha-Net v3
improves accuracy by approximately 3% over the previous state-of-the-art network,
ResNet-50, on the ImageNet benchmark. We also present an analysis on our dataset with 256,
512, and 1024 layers and different versions of the loss function. Input representation is
also crucial for training, since initial preprocessing retains only a handful of features
and keeps training no more complex than it needs to be. We also compare network behavior
across different layer structures, loss functions, and normalization functions for better
quantitative modeling of Alpha-Net.
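The abstract does not give the block equations, but a minimal PyTorch sketch of a ResNet-like block composed under an explicit connection configuration illustrates the idea. The names AlphaBlock and AlphaNetSketch, the summation-based wiring, and the stem/head layers are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: ResNet-like "Alpha" blocks whose connection
# configuration defines the network. Names and wiring are assumptions.
import torch
import torch.nn as nn


class AlphaBlock(nn.Module):
    """A residual block with its own input/output, in the spirit of ResNet."""

    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x + self.body(x))  # skip connection around the block


class AlphaNetSketch(nn.Module):
    """Blocks composed according to a connection configuration.

    connections[i] lists the indices of earlier outputs (0 = stem) that are
    summed to form the input of block i.
    """

    def __init__(self, channels: int, connections: list[list[int]], num_classes: int = 1000):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, kernel_size=7, stride=2, padding=3)
        self.blocks = nn.ModuleList(AlphaBlock(channels) for _ in connections)
        self.connections = connections
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        outputs = [self.stem(x)]  # index 0 = stem output
        for block, sources in zip(self.blocks, self.connections):
            block_in = sum(outputs[s] for s in sources)
            outputs.append(block(block_in))
        pooled = outputs[-1].mean(dim=(2, 3))  # global average pool
        return self.head(pooled)


# Example: 3 blocks, each fed by the previous output plus a skip from the stem.
model = AlphaNetSketch(channels=64, connections=[[0], [0, 1], [1, 2]])
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 1000])
```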
Related papers
- Connection Reduction Is All You Need [0.10878040851637998]
Empirical research shows that simply stacking convolutional layers does not make the network train better.
We propose two new algorithms to connect layers.
ShortNet1 has a 5% lower test error rate and 25% faster inference time than the baseline.
arXiv Detail & Related papers (2022-08-02T13:00:35Z)
- FlowNAS: Neural Architecture Search for Optical Flow Estimation [65.44079917247369]
We propose a neural architecture search method named FlowNAS to automatically find a better encoder architecture for the flow estimation task.
Experimental results show that the discovered architecture with the weights inherited from the super-network achieves 4.67% F1-all error on KITTI.
arXiv Detail & Related papers (2022-07-04T09:05:25Z)
- SmoothNets: Optimizing CNN architecture design for differentially private deep learning [69.10072367807095]
DP-SGD requires clipping and noising of per-sample gradients, as sketched below.
This reduces model utility compared to non-private training.
We distilled a new model architecture termed SmoothNet, which is characterised by increased robustness to the challenges of DP-SGD training.
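For context, here is a generic sketch of the per-sample gradient clipping and noising that DP-SGD performs, i.e. the mechanism SmoothNets is designed to tolerate; it is not the SmoothNet architecture, and the model and hyperparameter values are placeholders.

```python
# Generic DP-SGD step sketch: clip each sample's gradient, sum, add noise, update.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                         # placeholder model
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_multiplier, lr = 1.0, 1.1, 0.1  # placeholder hyperparameters

x = torch.randn(8, 10)                           # a batch of 8 samples
y = torch.randint(0, 2, (8,))

# Accumulate clipped per-sample gradients.
summed = [torch.zeros_like(p) for p in model.parameters()]
for xi, yi in zip(x, y):
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
    for acc, g in zip(summed, grads):
        acc += g * scale

# Add Gaussian noise and take an averaged SGD step.
with torch.no_grad():
    for p, acc in zip(model.parameters(), summed):
        noisy = acc + torch.randn_like(acc) * noise_multiplier * clip_norm
        p -= lr * noisy / x.shape[0]
```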
arXiv Detail & Related papers (2022-05-09T07:51:54Z)
- Dynamic Resolution Network [40.64164953983429]
The redundancy on the input resolution of modern CNNs has not been fully investigated.
We propose a novel dynamic-resolution network (DRNet) in which the resolution is determined dynamically based on each input sample.
DRNet achieves similar performance with roughly a 34% reduction in computation, and a 1.4% accuracy gain with a 10% reduction, compared to the original ResNet-50 on ImageNet.
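A minimal sketch of the per-sample dynamic-resolution idea follows: a tiny predictor scores candidate resolutions from the raw input, and each image is resized accordingly before the backbone. The resolver design, backbone, and candidate resolutions are assumptions, not DRNet's exact implementation.

```python
# Sketch of per-sample dynamic input resolution; components are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

CANDIDATE_RES = [128, 160, 224]

resolver = nn.Sequential(                # tiny network that scores resolutions
    nn.AdaptiveAvgPool2d(8), nn.Flatten(),
    nn.Linear(3 * 8 * 8, len(CANDIDATE_RES)),
)
backbone = nn.Sequential(                # stand-in classifier, resolution-agnostic
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1000),
)

def forward_dynamic(x: torch.Tensor) -> torch.Tensor:
    choices = resolver(x).argmax(dim=1)  # hard resolution choice per sample
    outputs = []
    for xi, c in zip(x, choices):
        res = CANDIDATE_RES[int(c)]
        xi = F.interpolate(xi.unsqueeze(0), size=(res, res),
                           mode="bilinear", align_corners=False)
        outputs.append(backbone(xi))
    return torch.cat(outputs, dim=0)

out = forward_dynamic(torch.randn(4, 3, 224, 224))
print(out.shape)  # torch.Size([4, 1000])
```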
arXiv Detail & Related papers (2021-06-05T13:48:33Z)
- Wise-SrNet: A Novel Architecture for Enhancing Image Classification by Learning Spatial Resolution of Feature Maps [0.5892638927736115]
One of the main challenges since the advancement of convolutional neural networks is how to connect the extracted feature map to the final classification layer.
In this paper, we aim to tackle this problem by replacing the GAP layer with a new architecture called Wise-SrNet.
It is inspired by the depthwise convolutional idea and is designed for processing spatial resolution while not increasing computational cost.
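The exact Wise-SrNet layers are not given here, but the general idea of replacing global average pooling with a depthwise convolution spanning the whole feature map can be sketched as follows; shapes and layer choices are assumptions.

```python
# Replace GAP with a full-kernel depthwise convolution so spatial layout is
# learned per channel rather than averaged away. Illustrative shapes only.
import torch
import torch.nn as nn

channels, spatial = 512, 7      # e.g. a 7x7x512 map before the classifier

# Baseline: GAP collapses each channel to one value, discarding spatial layout.
gap_head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1000)
)

# Depthwise alternative: one 7x7 filter per channel (groups=channels) produces
# a learned per-channel summary at modest extra cost.
dw_head = nn.Sequential(
    nn.Conv2d(channels, channels, kernel_size=spatial, groups=channels, bias=False),
    nn.Flatten(),               # (B, channels, 1, 1) -> (B, channels)
    nn.Linear(channels, 1000),
)

feat = torch.randn(2, channels, spatial, spatial)
print(gap_head(feat).shape, dw_head(feat).shape)  # both torch.Size([2, 1000])
```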
arXiv Detail & Related papers (2021-04-26T00:37:11Z)
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored to each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same fixed path through the network, DG-Net aggregates features dynamically at each node, giving the network greater representational capacity.
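A toy sketch of instance-aware aggregation at one node of such a DAG: per-sample gates computed from the incoming features weight each connection path before the node's convolutional block. The gate design and block layout are assumptions, not DG-Net's exact formulation.

```python
# Toy instance-aware aggregation node: gates depend on the inputs themselves.
import torch
import torch.nn as nn


class DynamicAggregationNode(nn.Module):
    def __init__(self, channels: int, num_inputs: int):
        super().__init__()
        self.gate = nn.Sequential(       # predicts one weight per incoming edge
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels * num_inputs, num_inputs), nn.Sigmoid(),
        )
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(),
        )

    def forward(self, inputs: list[torch.Tensor]) -> torch.Tensor:
        stacked = torch.stack(inputs, dim=1)             # (B, E, C, H, W)
        weights = self.gate(torch.cat(inputs, dim=1))    # (B, E), per sample
        mixed = (weights[:, :, None, None, None] * stacked).sum(dim=1)
        return self.block(mixed)


node = DynamicAggregationNode(channels=64, num_inputs=3)
feats = [torch.randn(2, 64, 32, 32) for _ in range(3)]
print(node(feats).shape)  # torch.Size([2, 64, 32, 32])
```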
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks [57.69809561405253]
We introduce a framework that is able to boost the vanilla ResNet-50 to 80%+ Top-1 accuracy on ImageNet without tricks.
Our method obtains 80.67% top-1 accuracy on ImageNet using a single crop-size of 224x224 with vanilla ResNet-50.
Our framework consistently improves from 69.76% to 73.19% on smaller ResNet-18.
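MEAL V2 boosts the student by distilling an ensemble of teachers with soft labels (plus a discriminator in the paper); the snippet below shows only a generic soft-label distillation step as an illustration of that idea, with stand-in models rather than the actual MEAL V2 recipe.

```python
# Generic soft-label (KL) distillation step from an averaged teacher ensemble.
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Linear(128, 1000)                        # stand-in for ResNet-50
teachers = [nn.Linear(128, 1000) for _ in range(2)]   # stand-in teacher ensemble

x = torch.randn(16, 128)                              # stand-in batch of features

with torch.no_grad():                                 # average the teachers' soft labels
    soft_labels = torch.stack([F.softmax(t(x), dim=1) for t in teachers]).mean(dim=0)

log_probs = F.log_softmax(student(x), dim=1)
loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")  # distillation loss
loss.backward()
```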
arXiv Detail & Related papers (2020-09-17T17:59:33Z)
- Improved Residual Networks for Image and Video Recognition [98.10703825716142]
Residual networks (ResNets) represent a powerful type of convolutional neural network (CNN) architecture.
We show consistent improvements in accuracy and learning convergence over the baseline.
Our proposed approach allows us to train extremely deep networks, while the baseline shows severe optimization issues.
arXiv Detail & Related papers (2020-04-10T11:09:50Z)
- Impact of ImageNet Model Selection on Domain Adaptation [26.016647703500883]
We investigate how different ImageNet models affect transfer accuracy on domain adaptation problems.
A higher-accuracy ImageNet model produces better features and leads to higher accuracy on domain adaptation problems.
We also examine the architecture of each neural network to find the best layer for feature extraction.
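A common way to probe a candidate layer as a feature extractor is to register a forward hook on it and read out its activations; the model and chosen layer below are illustrative stand-ins, not the models evaluated in that paper.

```python
# Probe an intermediate layer's output with a forward hook.
import torch
import torch.nn as nn

model = nn.Sequential(                   # stand-in for a pretrained ImageNet model
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1000),
)

features = {}
def hook(module, inputs, output):
    features["candidate_layer"] = output.detach()

handle = model[3].register_forward_hook(hook)  # probe the second ReLU's output
with torch.no_grad():
    model(torch.randn(4, 3, 224, 224))
handle.remove()

print(features["candidate_layer"].shape)  # torch.Size([4, 32, 224, 224])
```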
arXiv Detail & Related papers (2020-02-06T23:58:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.