A Deep Joint Sparse Non-negative Matrix Factorization Framework for
Identifying the Common and Subject-specific Functional Units of Tongue Motion
During Speech
- URL: http://arxiv.org/abs/2007.04865v2
- Date: Sun, 6 Jun 2021 23:10:25 GMT
- Title: A Deep Joint Sparse Non-negative Matrix Factorization Framework for
Identifying the Common and Subject-specific Functional Units of Tongue Motion
During Speech
- Authors: Jonghye Woo, Fangxu Xing, Jerry L. Prince, Maureen Stone, Arnold
Gomez, Timothy G. Reese, Van J. Wedeen, Georges El Fakhri
- Abstract summary: We develop a new deep learning framework to identify common and subject-specific functional units of tongue motion during speech.
We transform NMF with sparse and graph regularizations into modular architectures akin to deep neural networks.
- Score: 7.870139900799612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Intelligible speech is produced by creating varying internal local muscle
groupings -- i.e., functional units -- that are generated in a systematic and
coordinated manner. There are two major challenges in characterizing and
analyzing functional units. First, because of the complex and convoluted
nature of tongue structure and function, a method is needed that can
accurately decode complex muscle coordination patterns during speech. Second,
it is challenging to keep the functional units identified across subjects
comparable, owing to their substantial inter-subject variability. In this
work, to address these challenges, we develop a new deep learning framework
to identify common and subject-specific functional units of tongue motion
during speech. Our framework
hinges on joint deep graph-regularized sparse non-negative matrix factorization
(NMF) using motion quantities derived from displacements by tagged Magnetic
Resonance Imaging. More specifically, we transform NMF with sparse and graph
regularizations into modular architectures akin to deep neural networks by
means of unfolding the Iterative Shrinkage-Thresholding Algorithm to learn
interpretable building blocks and their associated weighting maps. We then apply
spectral clustering to common and subject-specific weighting maps from which we
jointly determine the common and subject-specific functional units. Experiments
carried out with simulated datasets show that the proposed method achieves
clustering performance on par with or better than that of the comparison
methods. Experiments
carried out with in vivo tongue motion data show that the proposed method can
determine the common and subject-specific functional units with increased
interpretability and decreased size variability.
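To make the unrolling idea concrete, the sketch below unrolls ISTA for non-negative sparse coding, the kind of update the framework stacks into network-like layers: the motion data is factored as X ≈ WH, with W holding the building blocks and H the weighting maps. This is a minimal sketch under stated assumptions, not the authors' implementation: the graph-regularization term is omitted, and the function and parameter names (unfolded_ista, n_layers, lam) are illustrative.
```python
# Minimal sketch (illustrative names, not the authors' code): unrolled ISTA
# for non-negative sparse coding, X ~= W @ H with H sparse and H >= 0.
# The graph-regularization term described in the abstract is omitted here.
import numpy as np

def nonneg_soft_threshold(z, theta):
    # Shrinkage followed by projection onto the non-negative orthant;
    # this plays the role of the activation function in the unrolled network.
    return np.maximum(z - theta, 0.0)

def unfolded_ista(X, W, n_layers=10, lam=0.1):
    # Each "layer" is one ISTA iteration: a gradient step on the
    # reconstruction error 0.5 * ||X - W @ H||_F^2, then shrinkage.
    L = np.linalg.norm(W, 2) ** 2          # Lipschitz constant of the gradient
    H = np.zeros((W.shape[1], X.shape[1]))
    for _ in range(n_layers):
        grad = W.T @ (W @ H - X)
        H = nonneg_soft_threshold(H - grad / L, lam / L)
    return H

# The abstract then applies spectral clustering to the weighting maps; with
# scikit-learn that step could look like the following (cluster count assumed):
# from sklearn.cluster import SpectralClustering
# labels = SpectralClustering(n_clusters=4,
#                             affinity="nearest_neighbors").fit_predict(H.T)
```
In the paper's framework the building blocks and thresholds are learned through the unrolled architecture; the fixed-step loop above corresponds to the untrained unrolled iteration.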
Related papers
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse Function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric encapsulates two forms of terms: diagonal and block-diagonal.
Experiments on cross-modal and two additional uni-modal retrieval tasks validate its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z)
- Investigating semantic subspaces of Transformer sentence embeddings through linear structural probing [2.5002227227256864]
We present experiments with semantic structural probing, a method for studying sentence-level representations.
We apply our method to language models from different families (encoder-only, decoder-only, encoder-decoder) and of different sizes in the context of two tasks.
We find that model families differ substantially in their performance and layer dynamics, but that the results are largely model-size invariant.
arXiv Detail & Related papers (2023-10-18T12:32:07Z)
- A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis [128.0532113800092]
We present a mechanistic interpretation of Transformer-based LMs on arithmetic questions.
This provides insights into how information related to arithmetic is processed by LMs.
arXiv Detail & Related papers (2023-05-24T11:43:47Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Clustering units in neural networks: upstream vs downstream information [3.222802562733787]
We study modularity of hidden layer representations of feedforward, fully connected networks.
We find two surprising results: first, dropout dramatically increased modularity, while other forms of weight regularization had more modest effects.
This has important implications for representation-learning, as it suggests that finding modular representations that reflect structure in inputs may be a distinct goal from learning modular representations that reflect structure in outputs.
arXiv Detail & Related papers (2022-03-22T15:35:10Z)
- Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs.
We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
arXiv Detail & Related papers (2022-02-02T23:54:26Z)
- Inductive Biases and Variable Creation in Self-Attention Mechanisms [25.79946667926312]
This work provides a theoretical analysis of the inductive biases of self-attention modules.
Our focus is to rigorously establish which functions and long-range dependencies self-attention blocks prefer to represent.
Our main result shows that bounded-norm Transformer layers create sparse variables.
arXiv Detail & Related papers (2021-10-19T16:36:19Z)
- Scalable Gaussian Processes for Data-Driven Design using Big Data with Categorical Factors [14.337297795182181]
Gaussian processes (GPs) have difficulty accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z)
- The role of feature space in atomistic learning [62.997667081978825]
Physically-inspired descriptors play a key role in the application of machine-learning techniques to atomistic simulations.
We introduce a framework to compare different sets of descriptors, and different ways of transforming them by means of metrics and kernels.
We compare representations built in terms of n-body correlations of the atom density, quantitatively assessing the information loss associated with the use of low-order features.
arXiv Detail & Related papers (2020-09-06T14:12:09Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.