A Technique for Isolating Lexically-Independent Phonetic Dependencies in Generative CNNs
- URL: http://arxiv.org/abs/2506.09218v1
- Date: Tue, 10 Jun 2025 20:22:33 GMT
- Title: A Technique for Isolating Lexically-Independent Phonetic Dependencies in Generative CNNs
- Authors: Bruno Ferenc Šegedin
- Abstract summary: The ability of deep neural networks (DNNs) to represent phonotactic generalizations derived from lexical learning remains an open question. This study investigates the lexically-invariant generalization capacity of generative convolutional neural networks (CNNs) trained on raw audio waveforms of lexical items.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability of deep neural networks (DNNs) to represent phonotactic generalizations derived from lexical learning remains an open question. This study (1) investigates the lexically-invariant generalization capacity of generative convolutional neural networks (CNNs) trained on raw audio waveforms of lexical items and (2) explores the consequences of shrinking the fully-connected layer (FC) bottleneck from 1024 channels to 8 before training. Ultimately, a novel technique for probing a model's lexically-independent generalizations is proposed that works only under the narrow FC bottleneck: generating audio outputs by bypassing the FC and inputting randomized feature maps into the convolutional block. These outputs are equally biased by a phonotactic restriction in training as are outputs generated with the FC. This result shows that the convolutional layers can dynamically generalize phonetic dependencies beyond lexically-constrained configurations learned by the FC.
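To make the probe concrete, the sketch below (a PyTorch illustration, not the authors' implementation; the layer sizes, the 8x16 feature-map shape, and the `generate_bypassing_fc` helper are assumptions) contrasts the standard path, latent vector to FC bottleneck to reshaped feature map to convolutional block, with the bypass path, where a randomized feature map of the same shape is fed directly into the convolutional block.

```python
# Minimal sketch of the FC-bypass probe, NOT the authors' code: a WaveGAN-style
# generator with a narrow 8-channel FC bottleneck. Layer sizes and the
# generate_bypassing_fc() helper are illustrative assumptions.
import torch
import torch.nn as nn

class NarrowBottleneckGenerator(nn.Module):
    def __init__(self, z_dim=100, fc_channels=8, fc_width=16):
        super().__init__()
        self.fc_channels, self.fc_width = fc_channels, fc_width
        # Fully-connected bottleneck: latent vector -> small (channels x width) feature map.
        self.fc = nn.Linear(z_dim, fc_channels * fc_width)
        # Convolutional block: upsamples the feature map to a raw waveform.
        self.conv_block = nn.Sequential(
            nn.ConvTranspose1d(fc_channels, 64, 25, stride=4, padding=11, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(64, 32, 25, stride=4, padding=11, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(32, 1, 25, stride=4, padding=11, output_padding=1),
            nn.Tanh(),
        )

    def forward(self, z):
        # Standard path: z -> FC -> reshape -> convolutional block.
        fmap = self.fc(z).view(-1, self.fc_channels, self.fc_width)
        return self.conv_block(fmap)

    def generate_bypassing_fc(self, batch_size):
        # Probe: skip the FC and feed randomized feature maps of the same
        # shape directly into the convolutional block.
        random_fmap = torch.randn(batch_size, self.fc_channels, self.fc_width)
        return self.conv_block(random_fmap)

G = NarrowBottleneckGenerator()
with torch.no_grad():
    audio_fc = G(torch.randn(4, 100))          # waveforms generated through the FC
    audio_bypass = G.generate_bypassing_fc(4)  # waveforms from randomized feature maps
```

Under the paper's result, waveforms produced by the bypass path are as strongly biased by the phonotactic restriction in training as waveforms produced through the FC, which is what licenses reading the convolutional layers as carrying lexically-independent generalizations.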
Related papers
- Exploring the encoding of linguistic representations in the Fully-Connected Layer of generative CNNs for Speech [0.0]
This study presents the first exploration of how the fully connected layer of CNNs for speech synthesis encodes linguistically relevant information. We show that lexically specific latent codes in generative CNNs (ciwGAN) have shared lexically invariant sublexical representations in the FC-layer weights.
arXiv Detail & Related papers (2025-01-13T22:24:52Z) - LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers [0.0468732641979009]
We propose a layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees.
Our method, LipKernel, directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state space model.
We show that the run-time using our method is orders of magnitude faster than state-of-the-art Lipschitz-bounded networks.
arXiv Detail & Related papers (2024-10-29T17:20:14Z) - The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used in machine learning.
In this paper we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex objective can be characterized as the global optimum of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z) - Semantics Alignment via Split Learning for Resilient Multi-User Semantic Communication [56.54422521327698]
Recent studies on semantic communication rely on neural network (NN) based transceivers such as deep joint source-channel coding (DeepJSCC).
Unlike traditional transceivers, these neural transceivers are trainable using actual source data and channels, enabling them to extract and communicate semantics.
We propose a distributed learning based solution, which leverages split learning (SL) and partial NN fine-tuning techniques.
arXiv Detail & Related papers (2023-10-13T20:29:55Z) - Speed Limits for Deep Learning [67.69149326107103]
Recent advancement in thermodynamics allows bounding the speed at which one can go from the initial weight distribution to the final distribution of the fully trained network.
We provide analytical expressions for these speed limits for linear and linearizable neural networks.
Remarkably, given plausible scaling assumptions on the NTK spectra and the spectral decomposition of the labels, learning is optimal in a scaling sense.
arXiv Detail & Related papers (2023-07-27T06:59:46Z) - Local Kernel Renormalization as a mechanism for feature learning in overparametrized Convolutional Neural Networks [0.0]
Empirical evidence shows that fully-connected neural networks in the infinite-width limit eventually outperform their finite-width counterparts.
State-of-the-art architectures with convolutional layers achieve optimal performances in the finite-width regime.
We show that the generalization performance of a finite-width FC network can be obtained by an infinite-width network, with a suitable choice of the Gaussian priors.
arXiv Detail & Related papers (2023-07-21T17:22:04Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can make two classes of data linearly separable with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z) - The Curious Case of Convex Neural Networks [12.56278477726461]
We show that the convexity constraints can be enforced on both fully connected and convolutional layers.
We draw several valuable insights: (a) Input Output Convex Neural Networks (IOC-NNs) self-regularize and reduce the problem of overfitting; (b) although heavily constrained, they outperform the base multi-layer perceptrons and achieve performance similar to base convolutional architectures.
arXiv Detail & Related papers (2020-06-09T08:16:38Z) - CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks [0.0]
Lexical learning is modeled as emergent from an architecture that forces a deep neural network to output informative data.
Networks trained on lexical items from TIMIT learn to encode unique information corresponding to lexical items in the form of categorical variables in their latent space.
We show that phonetic and phonological representations learned by the network can be productively recombined and directly paralleled to productivity in human speech.
arXiv Detail & Related papers (2020-06-04T15:33:55Z)
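For reference, the ciwGAN architecture summarized above couples a WaveGAN-style generator with an InfoGAN-style Q-network that must recover a categorical code from the generated audio, which pressures the generator to make lexical identity audible and hence encoded in its latent space. A minimal sketch of the latent construction follows; the sizes and the helper name are illustrative assumptions, not the published implementation.

```python
# Minimal sketch (not the published implementation) of a ciwGAN-style latent:
# a one-hot categorical code concatenated with uniform noise. The sizes and
# the helper name are illustrative assumptions.
import torch

def make_ciwgan_latent(batch_size, num_words=10, noise_dim=90):
    # One-hot categorical code: which lexical item the generator should produce.
    code = torch.zeros(batch_size, num_words)
    code[torch.arange(batch_size), torch.randint(num_words, (batch_size,))] = 1.0
    # Remaining latent dimensions: uniform noise in [-1, 1], as in WaveGAN.
    noise = torch.rand(batch_size, noise_dim) * 2 - 1
    return torch.cat([code, noise], dim=1)  # shape: (batch_size, num_words + noise_dim)

z = make_ciwgan_latent(4)  # fed to a WaveGAN-style generator during training
```

Because the Q-network can only recover the code from the waveform itself, the categorical dimensions come to function as latent lexical variables, which is the sense in which lexical learning is modeled as emergent.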