HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion
- URL: http://arxiv.org/abs/2407.05638v2
- Date: Tue, 9 Jul 2024 03:38:46 GMT
- Title: HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion
- Authors: Junhao Su, Chenghao He, Feiyu Zhu, Xiaojie Xu, Dongzhi Guan, Chenyang Si
- Abstract summary: We propose a novel model that performs hierarchical locally supervised learning and patch-level feature computation on auxiliary networks.
We conduct experiments on CIFAR-10, STL-10, SVHN, and ImageNet datasets, and the results demonstrate that our proposed HPFF significantly outperforms previous approaches.
- Score: 7.9514535887836795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional deep learning relies on end-to-end backpropagation for training, which suffers from drawbacks such as high memory consumption and poor alignment with biological neural networks. Recent advancements have introduced locally supervised learning, which divides a network into modules with isolated gradients and trains them locally. However, this approach can lag in performance due to limited interaction between modules, and the auxiliary networks it requires occupy additional GPU memory. To overcome these limitations, we propose a novel model called HPFF that performs hierarchical locally supervised learning and patch-level feature computation on the auxiliary networks. Hierarchical Locally Supervised Learning (HiLo) enables the network to learn features at different granularity levels along their respective local paths. Specifically, the network is divided into two-level local modules: independent local modules and cascade local modules. The cascade local modules combine two adjacent independent local modules, incorporating both updates within the modules themselves and information exchange between adjacent modules. Patch Feature Fusion (PFF) reduces GPU memory usage by splitting the input features of the auxiliary networks into patches for computation. By averaging these patch-level features, it strengthens the network's focus on patterns that recur across multiple patches. Furthermore, our method generalizes well and can be seamlessly integrated with existing techniques. We conduct experiments on the CIFAR-10, STL-10, SVHN, and ImageNet datasets; the results demonstrate that HPFF significantly outperforms previous approaches, consistently achieving state-of-the-art performance across different datasets. Our code is available at: https://github.com/Zeudfish/HPFF.
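The PFF mechanism lends itself to a short illustration. Below is a minimal sketch in PyTorch of the idea as the abstract describes it (split an auxiliary head's input feature map into patches, run the head per patch, and average the patch-level outputs); it is not the authors' implementation, and the names `PatchFeatureFusion`, `aux_head`, and `patch_size` are hypothetical. The official code is in the repository linked above.

```python
import torch
import torch.nn as nn


class PatchFeatureFusion(nn.Module):
    """Split an auxiliary head's input feature map into non-overlapping
    patches, run the head on each patch, and average the patch outputs."""

    def __init__(self, aux_head: nn.Module, patch_size: int):
        super().__init__()
        self.aux_head = aux_head      # small local auxiliary network (assumed)
        self.patch_size = patch_size  # spatial side length of each patch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W), with H and W assumed divisible by patch_size.
        b, c, _, _ = x.shape
        p = self.patch_size
        # Cut the feature map into (H/p) * (W/p) patches of shape (C, p, p).
        patches = x.unfold(2, p, p).unfold(3, p, p)  # (B, C, H/p, W/p, p, p)
        patches = patches.contiguous().view(b, c, -1, p, p).permute(0, 2, 1, 3, 4)
        # Feed the head one patch at a time; each forward pass sees only a
        # small activation map, which is where the GPU-memory saving comes from.
        feats = [self.aux_head(patches[:, i]) for i in range(patches.shape[1])]
        # Averaging emphasizes patterns that recur across multiple patches.
        return torch.stack(feats, dim=1).mean(dim=1)


# Toy usage: a 10-class auxiliary classifier over 8x8 patches of a feature map.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10))
pff = PatchFeatureFusion(head, patch_size=8)
logits = pff(torch.randn(2, 64, 32, 32))  # -> shape (2, 10)
```

In this sketch the auxiliary head never sees the full feature map at once, so its peak activation footprint shrinks with the patch size, matching the memory motivation given in the abstract.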
Related papers
- Momentum Auxiliary Network for Supervised Local Learning [7.5717621206854275]
Supervised local learning segments the network into multiple local blocks updated by independent auxiliary networks.
We propose a Momentum Auxiliary Network (MAN) that establishes a dynamic interaction mechanism.
Our method can reduce GPU memory usage by more than 45% on the ImageNet dataset compared to end-to-end training.
arXiv Detail & Related papers (2024-07-08T05:31:51Z)
- MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network [4.396837128416218]
We propose the Multilaminar Leap Augmented Auxiliary Network (MLAAN).
MLAAN captures both local and global features through independent and cascaded auxiliary networks.
We further design LAM, an enhanced auxiliary network that uses an Exponential Moving Average (EMA) update to facilitate information exchange between local modules (see the generic sketch below).
Our experiments on the CIFAR-10, STL-10, SVHN, and ImageNet datasets show that MLAAN can be seamlessly integrated into existing local learning frameworks.
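Since LAM's information exchange rests on an EMA of parameters, here is a generic sketch of that update, assuming PyTorch modules with identical architectures; the momentum value 0.999 is an assumption, not a figure from the MLAAN paper.

```python
import torch


@torch.no_grad()
def ema_update(ema_module: torch.nn.Module, online_module: torch.nn.Module,
               momentum: float = 0.999) -> None:
    # Standard exponential-moving-average parameter update:
    #   theta_ema <- m * theta_ema + (1 - m) * theta_online
    for p_ema, p in zip(ema_module.parameters(), online_module.parameters()):
        p_ema.mul_(momentum).add_(p, alpha=1.0 - momentum)
```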
arXiv Detail & Related papers (2024-06-24T13:30:55Z)
- Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation [70.43845294145714]
Relieving the reliance of neural network training on global back-propagation (BP) has emerged as a notable research topic.
We propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules.
Our method can be integrated into both local-BP and BP-free settings.
arXiv Detail & Related papers (2024-06-07T19:10:31Z)
- Go beyond End-to-End Training: Boosting Greedy Local Learning with Context Supply [0.12187048691454236]
Greedy local learning partitions a network into gradient-isolated modules and trains each module under supervision from its local preliminary loss.
As the number of gradient-isolated modules increases, the performance of the local learning scheme degrades substantially.
We propose a ContSup scheme, which incorporates context supply between isolated modules to compensate for information loss.
arXiv Detail & Related papers (2023-12-12T10:25:31Z)
- Local Learning with Neuron Groups [15.578925277062657]
Local learning is an approach to model parallelism that removes the standard end-to-end learning setup.
We study how local learning can be applied at the level of splitting layers or modules into sub-components.
arXiv Detail & Related papers (2023-01-18T16:25:10Z)
- Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- Semi-Supervised Semantic Segmentation with Cross Teacher Training [14.015346488847902]
This work proposes a cross-teacher training framework with three modules that significantly improves traditional semi-supervised learning approaches.
The core is a cross-teacher module, which can simultaneously reduce the coupling among peer networks and the error accumulation between teacher and student networks.
The high-level module can transfer high-quality knowledge from labeled data to unlabeled data and promote separation between classes in feature space.
The low-level module encourages low-quality features to learn from the high-quality features of peer networks.
arXiv Detail & Related papers (2022-09-03T05:02:03Z)
- PRA-Net: Point Relation-Aware Network for 3D Point Cloud Analysis [56.91758845045371]
We propose a novel framework named Point Relation-Aware Network (PRA-Net).
It is composed of an Intra-region Structure Learning (ISL) module and an Inter-region Relation Learning (IRL) module.
Experiments on several 3D benchmarks covering shape classification, keypoint estimation, and part segmentation verify the effectiveness of PRA-Net.
arXiv Detail & Related papers (2021-12-09T13:24:43Z)
- Global Aggregation then Local Distribution for Scene Parsing [99.1095068574454]
We show that our approach can be modularized as an end-to-end trainable block and easily plugged into existing semantic segmentation networks.
Our approach allows us to set a new state of the art on major semantic segmentation benchmarks including Cityscapes, ADE20K, PASCAL Context, CamVid, and COCO-Stuff.
arXiv Detail & Related papers (2021-07-28T03:46:57Z)
- Conformer: Local Features Coupling Global Representations for Visual Recognition [72.9550481476101]
We propose a hybrid network structure, termed Conformer, to take advantage of convolutional operations and self-attention mechanisms for enhanced representation learning.
Experiments show that Conformer, under comparable parameter complexity, outperforms the visual transformer (DeiT-B) by 2.3% on ImageNet.
arXiv Detail & Related papers (2021-05-09T10:00:03Z)
- Neural Function Modules with Sparse Arguments: A Dynamic Approach to Integrating Information across Layers [84.57980167400513]
Most existing work on feed-forward networks that combine top-down and bottom-up feedback is limited to classification problems.
Neural Function Modules (NFM) aim to introduce this structural capability into deep learning more broadly.
The key contribution of this work is to combine attention, sparsity, and top-down and bottom-up feedback in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z)