Deep Graph Random Process for Relational-Thinking-Based Speech
Recognition
- URL: http://arxiv.org/abs/2007.02126v2
- Date: Wed, 8 Jul 2020 09:03:55 GMT
- Title: Deep Graph Random Process for Relational-Thinking-Based Speech
Recognition
- Authors: Hengguan Huang, Fuzhao Xue, Hao Wang, Ye Wang
- Abstract summary: relational thinking is characterized by relying on innumerable unconscious percepts pertaining to relations between new sensory signals and prior knowledge.
We present a Bayesian nonparametric deep learning method called deep graph random process (DGP) that can generate an infinite number of probabilistic graphs representing percepts.
Our approach is able to successfully infer relations among utterances without using any relational data during training.
- Score: 12.09786458466155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Lying at the core of human intelligence, relational thinking is characterized
by initially relying on innumerable unconscious percepts pertaining to
relations between new sensory signals and prior knowledge, consequently
becoming a recognizable concept or object through coupling and transformation
of these percepts. Such mental processes are difficult to model in real-world
problems such as in conversational automatic speech recognition (ASR), as the
percepts (if they are modelled as graphs indicating relationships among
utterances) are supposed to be innumerable and not directly observable. In this
paper, we present a Bayesian nonparametric deep learning method called deep
graph random process (DGP) that can generate an infinite number of
probabilistic graphs representing percepts. We further provide a closed-form
solution for coupling and transformation of these percept graphs for acoustic
modeling. Our approach is able to successfully infer relations among utterances
without using any relational data during training. Experimental evaluations on
ASR tasks including CHiME-2 and CHiME-5 demonstrate the effectiveness and
benefits of our method.
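The abstract describes generating many probabilistic "percept" graphs whose edges encode relations among utterances. As an illustrative sketch only (not the authors' DGP formulation), each graph can be drawn by treating every utterance pair's relation as an independent Bernoulli variable with a learned probability; the `edge_probs` matrix here is a hypothetical stand-in for whatever a network would predict:

```python
import numpy as np

def sample_percept_graph(edge_probs, rng):
    """Sample one probabilistic 'percept' graph: each edge between
    utterance pairs is an independent Bernoulli draw."""
    n = edge_probs.shape[0]
    adj = (rng.random((n, n)) < edge_probs).astype(float)
    adj = np.triu(adj, k=1)   # keep upper triangle only (no self-loops)
    return adj + adj.T        # symmetrize: undirected relation graph

rng = np.random.default_rng(0)
# hypothetical pairwise relation probabilities for 4 utterances
probs = np.full((4, 4), 0.5)
graphs = [sample_percept_graph(probs, rng) for _ in range(3)]
```

Repeating the draw yields an ensemble of candidate relation graphs, which is the intuition behind sampling "an infinite number" of percepts from a graph random process.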
Related papers
- A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework [10.354955365036181]
Despite the crucial role relational thinking plays in human understanding of speech, it has yet to be leveraged in any artificial speech recognition systems.
This paper presents a novel spectro-temporal relational thinking based acoustic modeling framework.
Models built upon this framework outperform state-of-the-art systems with a 7.82% improvement in phoneme recognition tasks on the TIMIT dataset.
arXiv Detail & Related papers (2024-09-17T05:45:33Z) - Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z) - GIF: A General Graph Unlearning Strategy via Influence Function [63.52038638220563]
Graph Influence Function (GIF) is a model-agnostic unlearning method that can efficiently and accurately estimate parameter changes in response to an $\epsilon$-mass perturbation in deleted data.
We conduct extensive experiments on four representative GNN models and three benchmark datasets to justify GIF's superiority in terms of unlearning efficacy, model utility, and unlearning efficiency.
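The GIF summary hinges on a classical influence-function estimate: the first-order parameter change from down-weighting a small mass of training data is the inverse Hessian of the training loss applied to the gradient on that data. A minimal numerical sketch, with a toy 2x2 Hessian and gradient standing in for real model quantities:

```python
import numpy as np

def influence_update(hessian, grad_removed, eps):
    """First-order influence-function estimate of the parameter change
    when an eps-mass of training data (with gradient grad_removed)
    is removed: delta_theta ~= eps * H^{-1} @ grad_removed."""
    return eps * np.linalg.solve(hessian, grad_removed)

H = np.array([[2.0, 0.0],
              [0.0, 4.0]])      # toy Hessian of the training loss
g = np.array([1.0, 2.0])        # gradient on the deleted data points
delta = influence_update(H, g, eps=0.1)  # -> [0.05, 0.05]
```

Solving the linear system rather than inverting the Hessian is the standard way to keep this estimate tractable at scale.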
arXiv Detail & Related papers (2023-04-06T03:02:54Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
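Representing each node by a distribution in latent space, rather than a deterministic vector, is typically realized with a Gaussian per node and the reparameterization trick. A hedged sketch under that assumption (the per-node parameters here are placeholders, not the paper's actual encoder outputs):

```python
import numpy as np

def sample_node_embedding(mu, log_var, rng):
    """Treat a node as a diagonal Gaussian in latent space and draw a
    sample via the reparameterization trick: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.zeros(8)        # toy per-node mean
log_var = np.zeros(8)   # toy per-node log-variance (sigma = 1)
z = sample_node_embedding(mu, log_var, rng)
```

Each forward pass then yields a different embedding for the same node, which is what distinguishes this from deterministic-vector approaches.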
arXiv Detail & Related papers (2021-12-15T01:45:32Z) - Does the Brain Infer Invariance Transformations from Graph Symmetries? [0.0]
The invariance of natural objects under perceptual changes is possibly encoded in the brain by symmetries in the graph of synaptic connections.
The graph can be established via unsupervised learning in a biologically plausible process across different perceptual modalities.
arXiv Detail & Related papers (2021-11-11T12:35:13Z) - Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - Exploiting Emotional Dependencies with Graph Convolutional Networks for
Facial Expression Recognition [31.40575057347465]
This paper proposes a novel multi-task learning framework to recognize facial expressions in-the-wild.
A shared feature representation is learned for both discrete and continuous recognition in a MTL setting.
The results of our experiments show that our method outperforms the current state-of-the-art methods on discrete FER.
arXiv Detail & Related papers (2021-06-07T10:20:05Z) - Neural-Symbolic Relational Reasoning on Graph Models: Effective Link
Inference and Computation from Knowledge Bases [0.5669790037378094]
We propose a neural-symbolic graph which applies learning over all the paths by feeding the model with the embedding of the minimal network of the knowledge graph containing such paths.
By learning to produce representations for entities and facts corresponding to word embeddings, we show how the model can be trained end-to-end to decode these representations and infer relations between entities in a relational approach.
arXiv Detail & Related papers (2020-05-05T22:46:39Z) - Facial Action Unit Intensity Estimation via Semantic Correspondence
Learning with Dynamic Graph Convolution [27.48620879003556]
We present a new learning framework that automatically learns the latent relationships of AUs via establishing semantic correspondences between feature maps.
In the heatmap regression-based network, feature maps preserve rich semantic information associated with AU intensities and locations.
This motivates us to model the correlation among feature channels, which implicitly represents the co-occurrence relationship of AU intensity levels.
arXiv Detail & Related papers (2020-04-20T23:55:30Z) - Continuous Emotion Recognition via Deep Convolutional Autoencoder and
Support Vector Regressor [70.2226417364135]
It is crucial that the machine should be able to recognize the emotional state of the user with high accuracy.
Deep neural networks have been used with great success in recognizing emotions.
We present a new model for continuous emotion recognition based on facial expression recognition.
arXiv Detail & Related papers (2020-01-31T17:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.