Feature Aggregation in Joint Sound Classification and Localization Neural Networks
- URL: http://arxiv.org/abs/2310.19063v2
- Date: Sat, 27 Jan 2024 20:45:13 GMT
- Title: Feature Aggregation in Joint Sound Classification and Localization Neural Networks
- Authors: Brendan Healy, Patrick McNamee, and Zahra Nili Ahmadabadi
- Abstract summary: Current state-of-the-art sound source localization deep learning networks lack feature aggregation within their architecture.
We adapt feature aggregation techniques from computer vision neural networks to signal detection neural networks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study addresses the application of deep learning techniques in joint
sound signal classification and localization networks. Current state-of-the-art
sound source localization (SSL) deep learning networks lack feature aggregation
within their architecture. Feature aggregation enhances model performance by
enabling the consolidation of information from different feature scales,
thereby improving feature robustness and invariance. This is particularly
important in SSL networks, which must differentiate direct and indirect
acoustic signals. To address this gap, we adapt feature aggregation techniques
from computer vision neural networks to signal detection neural networks.
Additionally, we propose the Scale Encoding Network (SEN) for feature
aggregation to encode features from various scales, compressing the network for
more computationally efficient aggregation. To evaluate the efficacy of feature
aggregation in SSL networks, we integrated the following computer vision
feature aggregation sub-architectures into an SSL control architecture: Path
Aggregation Network (PANet), Weighted Bi-directional Feature Pyramid Network
(BiFPN), and SEN. These sub-architectures were evaluated using two metrics for
signal classification and two metrics for direction-of-arrival regression.
PANet and BiFPN are established aggregators in computer vision models, while
the proposed SEN is a more compact aggregator. The results suggest that models
incorporating feature aggregation outperformed the control model, the Sound
Event Localization and Detection network (SELDnet), in both sound signal
classification and localization. The feature aggregation techniques enhance the
performance of sound detection neural networks, particularly in
direction-of-arrival regression.
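
The idea of consolidating features from different scales can be illustrated with a small sketch. The code below is not the authors' implementation and is not taken from the paper. It assumes one-dimensional toy feature vectors and shows the weighted fusion step commonly associated with BiFPN-style aggregators: each input scale receives a non-negative weight, and the weights are normalized by their sum. The names `upsample_nearest`, `fast_normalized_fusion`, `fine`, and `coarse` are hypothetical and chosen only for illustration. In a trained network the weights would be learned rather than fixed.

```python
def upsample_nearest(features, factor):
    """Repeat each element `factor` times (nearest-neighbour upsampling)."""
    return [value for value in features for _ in range(factor)]


def fast_normalized_fusion(inputs, weights, eps=1e-4):
    """Fuse equal-length feature vectors with non-negative, normalized weights."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    length = len(inputs[0])
    if any(len(x) != length for x in inputs):
        raise ValueError("inputs must share one length; resample them first")
    # Clip negative weights to zero (ReLU) so every scale contributes >= 0.
    clipped = [max(w, 0.0) for w in weights]
    # eps avoids division by zero when all weights are clipped to zero.
    total = sum(clipped) + eps
    return [
        sum(w * x[i] for w, x in zip(clipped, inputs)) / total
        for i in range(length)
    ]


# Hypothetical toy features: a fine scale (4 frames) and a coarse scale (2 frames).
fine = [1.0, 2.0, 3.0, 4.0]
coarse = [10.0, 20.0]

# Bring the coarse scale to the fine resolution, then fuse with equal weights.
fused = fast_normalized_fusion([fine, upsample_nearest(coarse, 2)], [1.0, 1.0])
```

With equal weights, each fused frame is approximately the average of the fine value and the upsampled coarse value. Raising one weight shifts the fused output toward that scale.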
Related papers
- Hybrid Convolutional and Attention Network for Hyperspectral Image Denoising [54.110544509099526]
Hyperspectral image (HSI) denoising is critical for the effective analysis and interpretation of hyperspectral data.
We propose a hybrid convolution and attention network (HCANet) to enhance HSI denoising.
Experimental results on mainstream HSI datasets demonstrate the rationality and effectiveness of the proposed HCANet.
arXiv Detail & Related papers (2024-03-15T07:18:43Z)
- BLIS-Net: Classifying and Analyzing Signals on Graphs [20.345611294709244]
Graph neural networks (GNNs) have emerged as a powerful tool for tasks such as node classification and graph classification.
We introduce the BLIS-Net (Bi-Lipschitz Scattering Net), a novel GNN that builds on the previously introduced geometric scattering transform.
We show that BLIS-Net achieves superior performance on both synthetic and real-world data sets based on traffic flow and fMRI data.
arXiv Detail & Related papers (2023-10-26T17:03:14Z)
- An Efficient Speech Separation Network Based on Recurrent Fusion Dilated Convolution and Channel Attention [0.2538209532048866]
We present an efficient speech separation neural network, ARFDCN, which combines dilated convolutions, multi-scale fusion (MSF), and channel attention.
Experimental results indicate that the model achieves a decent balance between performance and computational efficiency.
arXiv Detail & Related papers (2023-06-09T13:30:27Z)
- An error-propagation spiking neural network compatible with neuromorphic processors [2.432141667343098]
We present a spike-based learning method that approximates back-propagation using local weight update mechanisms.
We introduce a network architecture that enables synaptic weight update mechanisms to back-propagate error signals.
This work represents a first step towards the design of ultra-low power mixed-signal neuromorphic processing systems.
arXiv Detail & Related papers (2021-04-12T07:21:08Z)
- PredRNN: A Recurrent Neural Network for Spatiotemporal Predictive Learning [109.84770951839289]
We present PredRNN, a new recurrent network for learning visual dynamics from historical context.
We show that our approach obtains highly competitive results on three standard datasets.
arXiv Detail & Related papers (2021-03-17T08:28:30Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs)
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Attentional Local Contrast Networks for Infrared Small Target Detection [15.882749652217653]
We propose a novel model-driven deep network for infrared small target detection.
We modularize a conventional local contrast measure method as a depth-wise parameterless nonlinear feature refinement layer in an end-to-end network.
We conduct detailed ablation studies with varying network depths to empirically verify the effectiveness and efficiency of each component in our network architecture.
arXiv Detail & Related papers (2020-12-15T19:33:09Z)
- The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures [179.66117325866585]
We investigate a design space that is usually overlooked, i.e. adjusting the channel configurations of predefined networks.
We find that this adjustment can be achieved by shrinking widened baseline networks and leads to superior performance.
Experiments are conducted on various networks and datasets for image classification, visual tracking and image restoration.
arXiv Detail & Related papers (2020-06-29T17:59:26Z)
- ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification [49.87503122462432]
We introduce a novel neural network termed Relation-and-Margin learning Network (ReMarNet)
Our method assembles two networks of different backbones so as to learn the features that can perform excellently in both of the aforementioned two classification mechanisms.
Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples.
arXiv Detail & Related papers (2020-06-27T13:50:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.