ResNet: Enabling Deep Convolutional Neural Networks through Residual Learning
- URL: http://arxiv.org/abs/2510.24036v1
- Date: Tue, 28 Oct 2025 03:36:15 GMT
- Title: ResNet: Enabling Deep Convolutional Neural Networks through Residual Learning
- Authors: Xingyu Liu, Kun Ming Goh
- Abstract summary: ResNet enables the training of networks with hundreds of layers by allowing gradients to flow directly through shortcut connections. In our implementation on the CIFAR-10 dataset, ResNet-18 achieves 89.9% accuracy compared to 84.1% for a traditional deep CNN of similar depth.
- Score: 4.949171031381768
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional Neural Networks (CNNs) have revolutionized computer vision, but training very deep networks has been challenging due to the vanishing gradient problem. This paper explores Residual Networks (ResNet), introduced by He et al. (2015), which overcome this limitation by using skip connections. ResNet enables the training of networks with hundreds of layers by allowing gradients to flow directly through shortcut connections that bypass intermediate layers. In our implementation on the CIFAR-10 dataset, ResNet-18 achieves 89.9% accuracy compared to 84.1% for a traditional deep CNN of similar depth, while also converging faster and training more stably.
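The residual mechanism described in the abstract is easy to see in code. The following is a minimal PyTorch sketch of a ResNet basic block, written for illustration rather than taken from the paper (the layer sizes and the 1x1 projection shortcut are common conventions, not the authors' exact CIFAR-10 configuration): the identity shortcut adds the input directly to the block's output, giving gradients a path that bypasses the convolutions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Residual block: out = ReLU(F(x) + shortcut(x)).

    The identity shortcut lets gradients bypass the two conv layers,
    mitigating the vanishing gradient problem in deep stacks.
    """
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        # 1x1 projection when the shape changes; identity otherwise.
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)  # the skip connection
        return F.relu(out)
```

Stacking such blocks behind a small convolutional stem and ahead of a linear classifier yields ResNet-18-style models of the kind the abstract evaluates.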
Related papers
- Hierarchical Training of Deep Neural Networks Using Early Exiting [42.186536611404165]
Deep neural networks provide state-of-the-art accuracy for vision tasks, but they require significant resources for training.
Deep neural networks are trained on cloud servers far from the edge devices that acquire the data.
In this study, a novel hierarchical training method for deep neural networks is proposed that uses early exits in a divided architecture between edge and cloud workers.
arXiv Detail & Related papers (2023-03-04T11:30:16Z)
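As a hedged illustration of the early-exit idea in the entry above (a generic sketch, not the paper's hierarchical edge/cloud training method; all module names here are hypothetical), the toy network below attaches an auxiliary classifier partway through the backbone and, at inference time, stops early when that head's confidence clears a threshold:

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Toy network with one intermediate exit (hypothetical example)."""
    def __init__(self, num_classes=10, threshold=0.9):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(8))
        self.exit1 = nn.Linear(32 * 8 * 8, num_classes)  # early exit head
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(4))
        self.exit2 = nn.Linear(64 * 4 * 4, num_classes)  # final exit
        self.threshold = threshold

    def forward(self, x):
        h1 = self.stage1(x)
        logits1 = self.exit1(h1.flatten(1))
        if not self.training:
            # Inference: stop early when the first head is confident enough.
            conf = logits1.softmax(dim=1).max(dim=1).values
            if bool((conf >= self.threshold).all()):
                return logits1
        h2 = self.stage2(h1)
        logits2 = self.exit2(h2.flatten(1))
        if self.training:
            return logits1, logits2  # train both heads jointly
        return logits2
```

During training, both heads are typically supervised jointly, e.g. by summing their cross-entropy losses.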
- An Exact Mapping From ReLU Networks to Spiking Neural Networks [3.1701886344065255]
We propose an exact mapping from a network with Rectified Linear Units (ReLUs) to an SNN that fires exactly one spike per neuron.
More generally our work shows that an arbitrary deep ReLU network can be replaced by an energy-efficient single-spike neural network without any loss of performance.
arXiv Detail & Related papers (2022-12-23T18:31:09Z)
- Training Spiking Neural Networks with Local Tandem Learning [96.32026780517097]
Spiking neural networks (SNNs) are shown to be more biologically plausible and energy efficient than their artificial neural network (ANN) predecessors.
In this paper, we put forward a generalized learning rule, termed Local Tandem Learning (LTL).
We demonstrate rapid network convergence within five training epochs on the CIFAR-10 dataset while having low computational complexity.
arXiv Detail & Related papers (2022-10-10T10:05:00Z)
- Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers [83.74380713308605]
We develop a new type of transformation that is fully compatible with a variant of ReLUs -- Leaky ReLUs.
We show in experiments that our method, which introduces negligible extra computational cost, achieves validation accuracies with deep vanilla networks that are competitive with ResNets.
arXiv Detail & Related papers (2022-03-15T17:49:08Z)
- Non-deep Networks [122.77755088736865]
We show that it is possible to build high-performing "non-deep" neural networks.
By utilizing parallel substructures, we show that a network with a depth of just 12 can achieve top-1 accuracy over 80%.
We provide a proof of concept for how non-deep networks could be used to build low-latency recognition systems.
arXiv Detail & Related papers (2021-10-14T18:03:56Z)
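A minimal sketch of the parallel-substructure idea from the Non-deep Networks entry above, assuming a simple sum fusion of shallow branches (this toy block is illustrative and is not the paper's ParNet architecture): depth stays constant because the branches run side by side rather than stacked.

```python
import torch
import torch.nn as nn

class ParallelBlock(nn.Module):
    """Three shallow branches in parallel, fused by summation (toy example)."""
    def __init__(self, ch):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(ch, ch, k, padding=k // 2),
                          nn.BatchNorm2d(ch), nn.ReLU())
            for k in (1, 3, 5)  # branches differ only in kernel size here
        ])

    def forward(self, x):
        # Width (parallel branches) substitutes for depth.
        return sum(branch(x) for branch in self.branches)

x = torch.randn(2, 16, 32, 32)
print(ParallelBlock(16)(x).shape)  # torch.Size([2, 16, 32, 32])
```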
- Layer Folding: Neural Network Depth Reduction using Activation Linearization [0.0]
Modern devices exhibit a high level of parallelism, but real-time latency is still highly dependent on networks' depth.
We propose a method that learns whether non-linear activations can be removed, allowing consecutive linear layers to be folded into one.
We apply our method to networks pre-trained on CIFAR-10 and CIFAR-100 and find that they can all be transformed into shallower forms that share a similar depth.
arXiv Detail & Related papers (2021-06-17T08:22:46Z)
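Layer folding as described in the entry above has a closed form for fully connected layers: with the activation between them removed, W2(W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2), so two linear layers collapse into one. A minimal sketch under that assumption (illustrative, not the authors' code):

```python
import torch
import torch.nn as nn

def fold_linear(l1: nn.Linear, l2: nn.Linear) -> nn.Linear:
    """Fold two consecutive linear layers (no activation in between) into one."""
    folded = nn.Linear(l1.in_features, l2.out_features)
    with torch.no_grad():
        folded.weight.copy_(l2.weight @ l1.weight)        # W = W2 @ W1
        folded.bias.copy_(l2.weight @ l1.bias + l2.bias)  # b = W2 @ b1 + b2
    return folded

l1, l2 = nn.Linear(8, 16), nn.Linear(16, 4)
x = torch.randn(5, 8)
assert torch.allclose(l2(l1(x)), fold_linear(l1, l2)(x), atol=1e-5)
```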
- Deep Residual Learning in Spiking Neural Networks [36.16846259899793]
Spiking Neural Networks (SNNs) present optimization difficulties for gradient-based approaches.
Considering the huge success of ResNet in deep learning, it would be natural to train deep SNNs with residual learning.
We propose spike-element-wise (SEW) ResNet to realize residual learning in deep SNNs.
arXiv Detail & Related papers (2021-02-08T12:22:33Z)
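Connecting the entry above back to the main paper: a spike-element-wise block applies the residual connection to binary spike maps rather than real-valued activations, combining them with an element-wise function such as ADD. The sketch below uses a toy hard-threshold neuron and shows only the forward pass (real SNN training needs surrogate gradients, omitted here; the module is illustrative, not the paper's SEW ResNet):

```python
import torch
import torch.nn as nn

def if_spike(x, threshold=1.0):
    """Toy integrate-and-fire: emit a binary spike where input crosses threshold."""
    return (x >= threshold).float()

class SEWBlock(nn.Module):
    """Spike-element-wise residual block with ADD fusion (illustrative sketch)."""
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        self.conv2 = nn.Conv2d(ch, ch, 3, padding=1, bias=False)

    def forward(self, s_in):
        # s_in is a binary spike map from the previous layer.
        s = if_spike(self.conv1(s_in))
        s = if_spike(self.conv2(s))
        return s + s_in  # element-wise ADD of spikes, not of raw activations
```

With ADD fusion the output can exceed 1 (a spike count); other element-wise choices keep the result binary.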
- Channel Planting for Deep Neural Networks using Knowledge Distillation [3.0165431987188245]
We present a novel incremental training algorithm for deep neural networks called planting.
Our planting method can search for the optimal network architecture with a smaller number of parameters, improving network performance.
We evaluate the effectiveness of the proposed method on different datasets such as CIFAR-10/100 and STL-10.
arXiv Detail & Related papers (2020-11-04T16:29:59Z)
- Go Wide, Then Narrow: Efficient Training of Deep Thin Networks [62.26044348366186]
We propose an efficient method to train a deep thin network with a theoretical guarantee.
By training with our method, ResNet50 can outperform ResNet101, and BERT Base can be comparable with BERT Large.
arXiv Detail & Related papers (2020-07-01T23:34:35Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU and find that the performance decline is due to activation quantization.
Our integer networks achieve performance equivalent to the corresponding full-precision networks (FPNs), but have only 1/4 the memory cost and run 2x faster on modern GPUs.
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
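A hedged sketch of the bounded-activation idea in the entry above: clipping ReLU at a fixed bound makes the activation range known in advance, so values can be mapped onto an 8-bit integer grid (the bound of 6 and the uint8 mapping are illustrative conventions, not necessarily the paper's exact scheme):

```python
import torch

def bounded_relu(x, bound=6.0):
    """ReLU clipped to [0, bound]; the fixed range enables integer mapping."""
    return x.clamp(0.0, bound)

def quantize_uint8(x, bound=6.0):
    """Map [0, bound] activations onto a 255-level integer grid."""
    scale = bound / 255.0
    q = torch.round(bounded_relu(x, bound) / scale).to(torch.uint8)
    return q, scale  # integer tensor plus the scale needed to dequantize

x = torch.randn(4) * 4
q, scale = quantize_uint8(x)
print(q, q.float() * scale)  # dequantized approximation of bounded_relu(x)
```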
- Towards Unified INT8 Training for Convolutional Neural Network [83.15673050981624]
We build a unified 8-bit (INT8) training framework for common convolutional neural networks.
First, we empirically find four distinctive characteristics of gradients, which provide us with insightful clues for gradient quantization.
We propose two universal techniques, including Direction Sensitive Gradient Clipping, which reduces the direction deviation of gradients.
arXiv Detail & Related papers (2019-12-29T08:37:53Z)
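As a generic stand-in for the INT8 gradient handling in the entry above (the exact Direction Sensitive Gradient Clipping rule is not given in the summary, so this sketch substitutes a plain norm clip followed by symmetric fake-quantization):

```python
import torch

def clip_and_quantize_grad(grad, clip_norm=1.0):
    """Clip a gradient tensor by norm, then fake-quantize it to INT8 (generic sketch)."""
    # Norm clipping limits outliers before quantization (a stand-in for the
    # paper's direction-sensitive clipping, whose exact rule is not shown here).
    norm = grad.norm()
    if norm > clip_norm:
        grad = grad * (clip_norm / norm)
    # Symmetric INT8 quantize-dequantize.
    scale = grad.abs().max().clamp(min=1e-8) / 127.0
    q = torch.round(grad / scale).clamp(-127, 127)
    return q * scale  # dequantized INT8 approximation of the gradient

g = torch.randn(3, 3)
print(clip_and_quantize_grad(g))
```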
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.