Are Sparse Neural Networks Better Hard Sample Learners?
- URL: http://arxiv.org/abs/2409.09196v1
- Date: Fri, 13 Sep 2024 21:12:18 GMT
- Title: Are Sparse Neural Networks Better Hard Sample Learners?
- Authors: Qiao Xiao, Boqian Wu, Lu Yin, Christopher Neil Gadzinski, Tianjin Huang, Mykola Pechenizkiy, Decebal Constantin Mocanu
- Abstract summary: Hard samples play a crucial role in the optimal performance of deep neural networks.
Most SNNs trained on challenging samples can often match or surpass dense models in accuracy at certain sparsity levels.
- Score: 24.2141078613549
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While deep learning has demonstrated impressive progress, it remains a daunting challenge to learn from hard samples as these samples are usually noisy and intricate. These hard samples play a crucial role in the optimal performance of deep neural networks. Most research on Sparse Neural Networks (SNNs) has focused on standard training data, leaving gaps in understanding their effectiveness on complex and challenging data. This paper's extensive investigation across scenarios reveals that most SNNs trained on challenging samples can often match or surpass dense models in accuracy at certain sparsity levels, especially with limited data. We observe that layer-wise density ratios tend to play an important role in SNN performance, particularly for methods that train from scratch without pre-trained initialization. These insights enhance our understanding of SNNs' behavior and potential for efficient learning approaches in data-centric AI. Our code is publicly available at: https://github.com/QiaoXiao7282/hard_sample_learners.
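As a rough, self-contained illustration of the setting studied here (this is not the authors' pipeline; their repository at the URL above implements the actual experiments), the Python sketch below prunes a small multi-layer perceptron to a target sparsity with global magnitude pruning and then reports the layer-wise density ratios that the abstract highlights as important. The architecture and the 90% sparsity level are arbitrary choices for the example.

```python
# Hypothetical illustration: obtain a sparse network via global magnitude pruning
# and inspect the resulting layer-wise density ratios.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Prune 90% of all weights globally by magnitude (one of many ways to build an SNN).
params_to_prune = [(m, "weight") for m in model if isinstance(m, nn.Linear)]
prune.global_unstructured(params_to_prune,
                          pruning_method=prune.L1Unstructured,
                          amount=0.9)

# Layer-wise density ratio: fraction of surviving weights per layer.
for name, module in model.named_modules():
    if isinstance(module, nn.Linear):
        density = module.weight_mask.mean().item()
        print(f"layer {name}: density = {density:.3f}")
```

Because global pruning removes the smallest weights across the whole model, the per-layer densities it produces are generally unequal, which is the kind of layer-wise allocation the paper examines.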
Related papers
- Curriculum Learning for Graph Neural Networks: Which Edges Should We Learn First [13.37867275976255]
We propose a novel strategy to incorporate more edges into training according to their difficulty from easy to hard.
We demonstrate the strength of our proposed method in improving the generalization ability and robustness of learned representations.
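A minimal sketch of the easy-to-hard edge curriculum idea is given below. It is a generic pacing scheme, not the paper's specific strategy: `difficulty` is an assumed per-edge score (how to compute it is the paper's contribution), and the linear pacing function is a placeholder.

```python
# Hypothetical sketch: reveal graph edges gradually, easiest first.
import torch

def edge_curriculum(edge_index: torch.Tensor, difficulty: torch.Tensor,
                    epoch: int, total_epochs: int) -> torch.Tensor:
    """Return the subset of edges used at `epoch`, ordered by increasing difficulty.

    edge_index: (2, E) graph connectivity; difficulty: (E,) per-edge score
    (assumed to be provided by some scoring function).
    """
    frac = min(1.0, 0.3 + 0.7 * epoch / max(1, total_epochs - 1))  # linear pacing
    k = max(1, int(frac * edge_index.size(1)))
    easiest = torch.argsort(difficulty)[:k]
    return edge_index[:, easiest]
```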
arXiv Detail & Related papers (2023-10-28T15:35:34Z)
- Learning Spiking Neural Network from Easy to Hard task [1.9559989943764062]
Spiking Neural Networks (SNNs) aim to mimic the way humans process information.
Current SNNs models treat all samples equally, which does not align with the principles of human learning.
We propose a CL-SNN model that introduces Curriculum Learning into SNNs, making SNNs learn more like humans.
arXiv Detail & Related papers (2023-09-09T09:46:32Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising test accuracy.
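For intuition, here is a small sketch of the neuron-pruning half of this joint scheme, assuming a standard PyTorch linear layer; the node-sampling half and the paper's theoretical conditions are not reproduced.

```python
# Hypothetical sketch: zero out hidden units whose outgoing weight rows have the
# smallest L2 norm (i.e., prune the lowest-magnitude neurons).
import torch
import torch.nn as nn

def prune_neurons_by_magnitude(layer: nn.Linear, keep_ratio: float = 0.5) -> None:
    with torch.no_grad():
        norms = layer.weight.norm(dim=1)               # one norm per output neuron
        k = max(1, int(keep_ratio * norms.numel()))
        keep = torch.topk(norms, k).indices
        mask = torch.zeros_like(norms, dtype=torch.bool)
        mask[keep] = True
        layer.weight[~mask] = 0.0                      # drop low-magnitude neurons
        if layer.bias is not None:
            layer.bias[~mask] = 0.0
```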
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Navigating Local Minima in Quantized Spiking Neural Networks [3.1351527202068445]
Spiking and Quantized Neural Networks (NNs) are becoming exceedingly important for hyper-efficient implementations of Deep Learning (DL) algorithms.
These networks face challenges when trained using error backpropagation, due to the absence of gradient signals when applying hard thresholds.
This paper presents a systematic evaluation of a cosine-annealed LR schedule coupled with weight-independent adaptive moment estimation.
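A minimal sketch of such a training setup is shown below, using plain Adam as a stand-in for the adaptive-moment optimizer and PyTorch's built-in cosine annealing schedule; the spiking/quantized model and surrogate-gradient loss are omitted.

```python
# Hypothetical sketch: cosine-annealed learning rate on top of an adaptive-moment optimizer.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # placeholder for a quantized/spiking network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100, eta_min=1e-5)

for epoch in range(100):
    # ... a real loop would compute a (surrogate-gradient) loss and call backward() here ...
    optimizer.step()   # placeholder update
    scheduler.step()   # anneal the learning rate along a cosine curve each epoch
```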
arXiv Detail & Related papers (2022-02-15T06:42:25Z)
- Iterative Pseudo-Labeling with Deep Feature Annotation and Confidence-Based Sampling [127.46527972920383]
Training deep neural networks is challenging when large and annotated datasets are unavailable.
We improve a recent iterative pseudo-labeling technique, Deep Feature Annotation (DeepFA), by selecting the most confident unsupervised samples to iteratively train a deep neural network.
We first ascertain the best configuration for the baseline -- a self-trained deep neural network -- and then evaluate our confidence DeepFA for different confidence thresholds.
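A small sketch of the confidence-based selection step is given below; the DeepFA-specific parts (deep feature projection and label propagation) are omitted, and the 0.9 threshold is an arbitrary example value.

```python
# Hypothetical sketch: keep only pseudo-labeled samples the model is confident about.
import torch

def select_confident(logits: torch.Tensor, threshold: float = 0.9):
    """Return indices and pseudo-labels of unlabeled samples whose top class
    probability exceeds `threshold`."""
    probs = torch.softmax(logits, dim=1)
    confidence, pseudo_labels = probs.max(dim=1)
    keep = confidence >= threshold
    return keep.nonzero(as_tuple=True)[0], pseudo_labels[keep]

# The kept indices and pseudo-labels would then be added to the training set
# for the next self-training iteration.
```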
arXiv Detail & Related papers (2021-09-06T20:02:13Z)
- SpikeMS: Deep Spiking Neural Network for Motion Segmentation [7.491944503744111]
SpikeMS is the first deep encoder-decoder SNN architecture for the real-world large-scale problem of motion segmentation.
We show that SpikeMS is capable of incremental predictions, or predictions from smaller amounts of test data than it is trained on.
arXiv Detail & Related papers (2021-05-13T21:34:55Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm in which a real-valued network distills binary networks on the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.515% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
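For reference, a generic sketch of distillation over the final prediction distribution is shown below: a real-valued teacher's softened outputs supervise a binary student via a KL term. This is standard distillation, not S2-BNN's full guided distribution calibration; the temperature value is an assumed example.

```python
# Hypothetical sketch: match the student's prediction distribution to the teacher's.
import torch
import torch.nn.functional as F

def distribution_distillation_loss(student_logits, teacher_logits, temperature=4.0):
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=1)
    teacher_probs = F.softmax(teacher_logits / t, dim=1)
    # KL(teacher || student), rescaled by T^2 as is conventional in distillation.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * t * t
```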
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Self-Competitive Neural Networks [0.0]
Deep Neural Networks (DNNs) have improved the accuracy of classification problems in many applications.
One of the challenges in training a DNN is its need for an enriched dataset to increase accuracy and avoid overfitting.
Recently, researchers have worked extensively to propose methods for data augmentation.
In this paper, we generate adversarial samples to refine the Domains of Attraction (DoAs) of each class. In this approach, at each stage, we use the model learned from the primary and generated adversarial data (up to that stage) to manipulate the primary data in a way that looks complicated to the DNN.
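A rough sketch of the general idea of perturbing primary data so it looks harder to the current model is given below, using a simple FGSM-style loss-ascent step; the paper's actual DoA-refinement procedure may differ, and `epsilon` is an assumed hyperparameter.

```python
# Hypothetical sketch: perturb a sample in the direction that increases the current
# model's loss, producing a "harder" version of the same sample.
import torch
import torch.nn.functional as F

def harden_sample(model, x, y, epsilon=0.03):
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # One signed-gradient ascent step within an epsilon ball around the original sample.
    return (x + epsilon * x.grad.sign()).detach()
```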
arXiv Detail & Related papers (2020-08-22T12:28:35Z)
- Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets.
However, labeling large-scale data can be very costly and error-prone, so it is difficult to guarantee annotation quality.
We propose Temporal Calibrated Regularization (TCR), which utilizes the original labels together with the predictions from the previous epoch.
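A minimal sketch in the spirit of this idea appears below: the (possibly noisy) one-hot label is blended with the model's softmax prediction from the previous epoch to form a softened target. The mixing weight `alpha` is an assumed hyperparameter, and this is not the paper's exact TCR formulation.

```python
# Hypothetical sketch: cross-entropy against a target that mixes the given label
# with the previous epoch's prediction.
import torch
import torch.nn.functional as F

def temporally_calibrated_loss(logits, labels, prev_epoch_probs, alpha=0.7):
    num_classes = logits.size(1)
    one_hot = F.one_hot(labels, num_classes).float()
    target = alpha * one_hot + (1.0 - alpha) * prev_epoch_probs  # mixed soft target
    return -(target * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```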
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
- Self-supervised Learning on Graphs: Deep Insights and New Direction [66.78374374440467]
Self-supervised learning (SSL) aims to create domain-specific pretext tasks on unlabeled data.
There is increasing interest in generalizing deep learning to the graph domain in the form of graph neural networks (GNNs).
arXiv Detail & Related papers (2020-06-17T20:30:04Z)