Related papers: Dilated Convolution with Learnable Spacings

Dilated Convolution with Learnable Spacings

URL: http://arxiv.org/abs/2408.06383v1
Date: Sat, 10 Aug 2024 12:12:39 GMT
Title: Dilated Convolution with Learnable Spacings
Authors: Ismail Khalfaoui-Hassani,
Abstract summary: This thesis presents and evaluates the Dilated Convolution with Learnable Spacings (DCLS) method. Through various supervised learning experiments in the fields of computer vision, audio, and speech processing, the DCLS method proves to outperform both standard and advanced convolution techniques.
Score: 1.8130068086063336
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: This thesis presents and evaluates the Dilated Convolution with Learnable Spacings (DCLS) method. Through various supervised learning experiments in the fields of computer vision, audio, and speech processing, the DCLS method proves to outperform both standard and advanced convolution techniques. The research is organized into several steps, starting with an analysis of the literature and existing convolution techniques that preceded the development of the DCLS method. We were particularly interested in the methods that are closely related to our own and that remain essential to capture the nuances and uniqueness of our approach. The cornerstone of our study is the introduction and application of the DCLS method to convolutional neural networks (CNNs), as well as to hybrid architectures that rely on both convolutional and visual attention approaches. DCLS is shown to be particularly effective in tasks such as classification, semantic segmentation, and object detection. Initially using bilinear interpolation, the study also explores other interpolation methods, finding that Gaussian interpolation slightly improves performance. The DCLS method is further applied to spiking neural networks (SNNs) to enable synaptic delay learning within a neural network that could eventually be transferred to so-called neuromorphic chips. The results show that the DCLS method stands out as a new state-of-the-art technique in SNN audio classification for certain benchmark tasks in this field. These tasks involve datasets with a high temporal component. In addition, we show that DCLS can significantly improve the accuracy of artificial neural networks for the multi-label audio classification task. We conclude with a discussion of the chosen experimental setup, its limitations, the limitations of our method, and our results.

Related papers

A New Perspective on Time Series Anomaly Detection: Faster Patch-based Broad Learning System [59.38402187365612]
Time series anomaly detection (TSAD) has been a research hotspot in both academia and industry in recent years. Deep learning is not required for TSAD due to limitations such as slow deep learning speed. We propose Contrastive Patch-based Broad Learning System (CBLS)
arXiv Detail & Related papers (2024-12-07T01:58:18Z)
CRNNTL: convolutional recurrent neural network and transfer learning for QSAR modelling [4.090810719630087]
We propose the convolutional recurrent neural network and transfer learning (CRNNTL) for QSAR modelling. Our strategy takes advantages of both convolutional and recurrent neural networks for feature extraction, as well as the data augmentation method.
arXiv Detail & Related papers (2021-09-07T20:04:55Z)
A Study On the Effects of Pre-processing On Spatio-temporal Action Recognition Using Spiking Neural Networks Trained with STDP [0.0]
It is important to study the behavior of SNNs trained with unsupervised learning methods on video classification tasks. This paper presents methods of transposing temporal information into a static format, and then transforming the visual information into spikes using latency coding. We show the effect of the similarity in the shape and speed of certain actions on action recognition with spiking neural networks.
arXiv Detail & Related papers (2021-05-31T07:07:48Z)
Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment. We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes. Our design is based on a novel bilevel optimization formulation which ensures certain fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z)
Drowsiness Detection Based On Driver Temporal Behavior Using a New Developed Dataset [1.8811803364757564]
We apply YOLOv3 (You Look Only Once-version3) CNN for extracting facial features automatically. Then, LSTM neural network is employed to learn driver temporal behaviors including yawning and blinking time period. Results indicate the hybrid of CNN and LSTM ability in drowsiness detection and the effectiveness of the proposed method.
arXiv Detail & Related papers (2021-03-31T21:15:29Z)
Supervised training of spiking neural networks for robust deployment on mixed-signal neuromorphic processors [2.6949002029513167]
Mixed-signal analog/digital electronic circuits can emulate spiking neurons and synapses with extremely high energy efficiency. Mismatch is expressed as differences in effective parameters between identically-configured neurons and synapses. We present a supervised learning approach that addresses this challenge by maximizing robustness to mismatch and other common sources of noise.
arXiv Detail & Related papers (2021-02-12T09:20:49Z)
A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situationnal Learning (CSL) task. The results are of three kinds: performance comparison, internal dynamics analyses and visualization of latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems. The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
Continual Learning in Recurrent Neural Networks [67.05499844830231]
We evaluate the effectiveness of continual learning methods for processing sequential data with recurrent neural networks (RNNs) We shed light on the particularities that arise when applying weight-importance methods, such as elastic weight consolidation, to RNNs. We show that the performance of weight-importance methods is not directly affected by the length of the processed sequences, but rather by high working memory requirements.
arXiv Detail & Related papers (2020-06-22T10:05:12Z)
Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm based on the verification problem in an iterative manner and explore two partitioning strategies. We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) usetemporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation. This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective to design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
Multi-Scale Neural network for EEG Representation Learning in BCI [2.105172041656126]
We propose a novel deep multi-scale neural network that discovers feature representations in multiple frequency/time ranges. By representing EEG signals withspectral-temporal information, the proposed method can be utilized for diverse paradigms.
arXiv Detail & Related papers (2020-03-02T04:06:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.