Related papers: SA-CNN: Application to text categorization issues using simulated annealing-based convolutional neural network optimization

SA-CNN: Application to text categorization issues using simulated annealing-based convolutional neural network optimization

URL: http://arxiv.org/abs/2303.07153v1
Date: Mon, 13 Mar 2023 14:27:34 GMT
Title: SA-CNN: Application to text categorization issues using simulated annealing-based convolutional neural network optimization
Authors: Zihao Guo and Yueying Cao
Abstract summary: Convolutional neural networks (CNNs) are a representative class of deep learning algorithms. We introduce SA-CNN neural networks for text classification tasks based on Text-CNN neural networks.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Convolutional neural networks (CNNs) are a representative class of deep learning algorithms including convolutional computation that perform translation-invariant classification of input data based on their hierarchical architecture. However, classical convolutional neural network learning methods use the steepest descent algorithm for training, and the learning performance is greatly influenced by the initial weight settings of the convolutional and fully connected layers, requiring re-tuning to achieve better performance under different model structures and data. Combining the strengths of the simulated annealing algorithm in global search, we propose applying it to the hyperparameter search process in order to increase the effectiveness of convolutional neural networks (CNNs). In this paper, we introduce SA-CNN neural networks for text classification tasks based on Text-CNN neural networks and implement the simulated annealing algorithm for hyperparameter search. Experiments demonstrate that we can achieve greater classification accuracy than earlier models with manual tuning, and the improvement in time and space for exploration relative to human tuning is substantial.

Related papers

Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval. A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed. The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
A Hybrid Neural Coding Approach for Pattern Recognition with Spiking Neural Networks [53.31941519245432]
Brain-inspired spiking neural networks (SNNs) have demonstrated promising capabilities in solving pattern recognition tasks. These SNNs are grounded on homogeneous neurons that utilize a uniform neural coding for information representation. In this study, we argue that SNN architectures should be holistically designed to incorporate heterogeneous coding schemes.
arXiv Detail & Related papers (2023-05-26T02:52:12Z)
Variational Tensor Neural Networks for Deep Learning [0.0]
We propose an integration of tensor networks (TN) into deep neural networks (NNs) This in turn, results in a scalable tensor neural network (TNN) architecture capable of efficient training over a large parameter space. We validate the accuracy and efficiency of our method by designing TNN models and providing benchmark results for linear and non-linear regressions, data classification and image recognition on MNIST handwritten digits.
arXiv Detail & Related papers (2022-11-26T20:24:36Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience. We show that sparse coding can effectively maximize the entropy of the output signals. Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
Analytically Tractable Inference in Deep Neural Networks [0.0]
Tractable Approximate Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks. We are demonstrating how TAGI matches or exceeds the performance of backpropagation, for training classic deep neural network architectures.
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
Differentiable Neural Architecture Learning for Efficient Neural Network Design [31.23038136038325]
We introduce a novel emph architecture parameterisation based on scaled sigmoid function. We then propose a general emphiable Neural Architecture Learning (DNAL) method to optimize the neural architecture without the need to evaluate candidate neural networks.
arXiv Detail & Related papers (2021-03-03T02:03:08Z)
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata(WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks. We introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous vectors input.
arXiv Detail & Related papers (2020-10-19T15:28:00Z)
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency. We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
Sampled Training and Node Inheritance for Fast Evolutionary Neural Architecture Search [22.483917379706725]
evolutionary neural architecture search (ENAS) has received increasing attention due to the attractive global optimization capability of evolutionary algorithms. This paper proposes a new framework for fast ENAS based on directed acyclic graph, in which parents are randomly sampled and trained on each mini-batch of training data. We evaluate the proposed algorithm on the widely used datasets, in comparison with 26 state-of-the-art peer algorithms.
arXiv Detail & Related papers (2020-03-07T12:33:01Z)
MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. The use of gradient combined nonvolutionity renders learning susceptible to novel problems. We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.