Interpreting intermediate convolutional layers in unsupervised acoustic word classification
- URL: http://arxiv.org/abs/2110.02375v1
- Date: Tue, 5 Oct 2021 21:53:32 GMT
- Title: Interpreting intermediate convolutional layers in unsupervised acoustic word classification
- Authors: Gašper Beguš, Alan Zhou
- Abstract summary: This paper proposes a technique to visualize and interpret intermediate layers of unsupervised deep convolutional neural networks.
A GAN-based architecture (ciwGAN arXiv:2006.02951) was trained on unlabeled sliced lexical items from TIMIT.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how deep convolutional neural networks classify data has been
subject to extensive research. This paper proposes a technique to visualize and
interpret intermediate layers of unsupervised deep convolutional neural
networks by averaging over individual feature maps in each convolutional layer
and inferring underlying distributions of words with non-linear regression
techniques. A GAN-based architecture (ciwGAN arXiv:2006.02951) that includes
three convolutional networks (a Generator, a Discriminator, and a classifier)
was trained on unlabeled sliced lexical items from TIMIT. The training results
in a deep convolutional network that learns to classify words into discrete
classes only from the requirement of the Generator to output informative data.
The classifier network has no access to the training data -- only to the
generated data -- which means lexical learning needs to emerge in a fully
unsupervised manner. We propose a technique to visualize individual
convolutional layers in the classifier that yields highly informative
time-series data for each convolutional layer and apply it to unobserved test
data. Using non-linear regression, we infer underlying distributions for each
word which allows us to analyze both absolute values and shapes of individual
words at different convolutional layers as well as perform hypothesis testing
on their acoustic properties. The technique also allows us to test individual
phone contrasts and how they are represented at each layer.
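As a concrete illustration of the visualization step, the sketch below averages over feature maps (the channel axis) after each convolutional layer of a small 1D classifier and then fits a parametric curve to the resulting per-layer time series. This is a minimal sketch: the architecture, layer sizes, and the Gaussian envelope used for the nonlinear fit are illustrative assumptions, not the ciwGAN classifier configuration or the paper's actual regression model.

```python
import torch
import torch.nn as nn
import numpy as np
from scipy.optimize import curve_fit

# Illustrative 1D convolutional classifier over raw waveforms; the layer
# sizes here are placeholders, not the ciwGAN classifier configuration.
class ConvClassifier(nn.Module):
    def __init__(self, num_classes: int = 8):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv1d(1, 64, kernel_size=25, stride=4, padding=12),
            nn.Conv1d(64, 128, kernel_size=25, stride=4, padding=12),
            nn.Conv1d(128, 256, kernel_size=25, stride=4, padding=12),
        ])
        self.head = nn.Linear(256, num_classes)

    def forward(self, x):
        for conv in self.convs:
            x = torch.relu(conv(x))
        return self.head(x.mean(dim=-1))

def layer_time_series(model, wav):
    """Average over feature maps (channels) after each convolutional
    layer, yielding one interpretable time series per layer."""
    series = []
    x = wav
    with torch.no_grad():
        for conv in model.convs:
            x = torch.relu(conv(x))
            # mean over the channel axis -> one (time,) series per layer
            series.append(x.mean(dim=1).squeeze(0).numpy())
    return series

# Hypothetical parametric shape for a word's activation envelope; the paper
# infers per-word distributions with nonlinear regression, and a Gaussian
# bump is just one plausible stand-in for that model.
def envelope(t, a, mu, sigma):
    return a * np.exp(-0.5 * ((t - mu) / sigma) ** 2)

model = ConvClassifier()
wav = torch.randn(1, 1, 16384)  # stand-in for a sliced lexical item from TIMIT
for i, s in enumerate(layer_time_series(model, wav), start=1):
    t = np.arange(len(s), dtype=float)
    params, _ = curve_fit(envelope, t, s,
                          p0=[s.max(), len(s) / 2, len(s) / 4], maxfev=5000)
    print(f"conv{i}: amplitude={params[0]:.3f}, peak position={params[1]:.1f}")
```

Fitted parameters such as amplitude and peak position can then be compared across layers and across words, which is what makes hypothesis testing on acoustic properties possible.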
Related papers
- Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers [0.0]
We investigate the impact of a training approach on deep network performance.
We propose a neural network architecture which induces an error function involving the outputs of all the network layers.
arXiv Detail & Related papers (2023-06-09T10:52:49Z)
- The SVD of Convolutional Weights: A CNN Interpretability Framework [3.5783190448496343]
We propose a framework against which interpretability methods might be applied using hypergraphs to model class separation.
Rather than looking to the activations to explain the network, we use the singular vectors with the greatest corresponding singular values for each linear layer to identify those features most important to the network.
arXiv Detail & Related papers (2022-08-14T18:23:02Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of training a neural network for classification with the classifier randomly initialized as a simplex equiangular tight frame (ETF) and kept fixed during training; a minimal construction sketch of such a fixed ETF head appears after this list.
Our experimental results show that the method achieves performance comparable to a learnable classifier on image classification with balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution.
Recent self-supervised learning methods have been shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences.
We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer while using fewer parameters, and transfer to new tasks in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Interpreting intermediate convolutional layers of CNNs trained on raw speech [0.0]
We show that averaging over feature maps after ReLU activation in each convolutional layer yields interpretable time-series data.
The proposed technique enables acoustic analysis of intermediate convolutional layers.
arXiv Detail & Related papers (2021-04-19T17:52:06Z)
- SparseGAN: Sparse Generative Adversarial Network for Text Generation [8.634962333084724]
We propose SparseGAN, which generates semantically interpretable but sparse sentence representations as inputs to the discriminator.
With such semantically rich representations, we not only reduce unnecessary noise for efficient adversarial training, but also make the entire training process fully differentiable.
arXiv Detail & Related papers (2021-03-22T04:44:43Z)
- Learning from Incomplete Features by Simultaneous Training of Neural Networks and Sparse Coding [24.3769047873156]
This paper addresses the problem of training a classifier on a dataset with incomplete features.
We assume that different subsets of features (random or structured) are available at each data instance.
A new supervised learning method is developed to train a general classifier, using only a subset of features per sample.
arXiv Detail & Related papers (2020-11-28T02:20:39Z)
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the transformation outcome is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- Ensemble Wrapper Subsampling for Deep Modulation Classification [70.91089216571035]
Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms.
We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems.
arXiv Detail & Related papers (2020-05-10T06:11:13Z)
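As referenced in the fixed-classifier entry above, the sketch below shows one way to construct a simplex equiangular tight frame (ETF) and freeze it as a classification head in PyTorch. This is a minimal sketch under stated assumptions: simplex_etf and FixedETFClassifier are illustrative names and the backbone is a placeholder, not code from that paper.

```python
import math
import torch
import torch.nn as nn

def simplex_etf(num_classes: int, feat_dim: int) -> torch.Tensor:
    """Return a (num_classes, feat_dim) simplex ETF: rows have unit norm and
    pairwise inner product -1/(num_classes - 1). Requires feat_dim >= num_classes."""
    K = num_classes
    # Random matrix with orthonormal columns via reduced QR decomposition.
    U, _ = torch.linalg.qr(torch.randn(feat_dim, K))
    center = torch.eye(K) - torch.ones(K, K) / K  # remove the common mean
    return math.sqrt(K / (K - 1)) * center @ U.T

class FixedETFClassifier(nn.Module):
    """Backbone followed by a frozen ETF head: only the backbone trains."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
        super().__init__()
        self.backbone = backbone
        # A buffer is saved with the model but never updated by the optimizer.
        self.register_buffer("etf", simplex_etf(num_classes, feat_dim))

    def forward(self, x):
        feats = self.backbone(x)
        return feats @ self.etf.T  # logits against the fixed class directions

# Usage with a placeholder backbone for 28x28 single-channel inputs.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
clf = FixedETFClassifier(backbone, feat_dim=128, num_classes=10)
logits = clf(torch.randn(4, 1, 28, 28))  # shape: (4, 10)
```

Because the head is a buffer rather than a parameter, the optimizer only sees backbone weights, which is the sense in which the classifier is not learnable.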