Probing Statistical Representations For End-To-End ASR
- URL: http://arxiv.org/abs/2211.01993v1
- Date: Thu, 3 Nov 2022 17:08:14 GMT
- Title: Probing Statistical Representations For End-To-End ASR
- Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain
- Abstract summary: This paper investigates cross-domain language model dependencies within transformer architectures using SVCCA.
It was found that specific neural representations within the transformer layers exhibit correlated behaviour which impacts recognition performance.
- Score: 28.833851817220616
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: End-to-End automatic speech recognition (ASR) models aim to learn a
generalised speech representation to perform recognition. In this domain there
is little research analysing internal representation dependencies and their
relationship to modelling approaches. This paper investigates cross-domain
language model dependencies within transformer architectures using SVCCA and
uses these insights to improve modelling approaches. It was found that specific
neural representations within the transformer layers exhibit correlated
behaviour which impacts recognition performance.
Altogether, this work provides an analysis of the modelling approaches that affect
contextual dependencies and ASR performance, and can be used to create or adapt
better-performing End-to-End ASR models, as well as for downstream tasks.
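The SVCCA analysis the abstract refers to compares two layers' activations by first reducing each with an SVD and then computing canonical correlations between the reduced subspaces. A minimal NumPy sketch of this standard procedure (not the authors' exact implementation; the `keep` variance threshold and matrix shapes are assumptions):

```python
import numpy as np

def svcca(acts1, acts2, keep=0.99):
    """SVCCA similarity between two activation matrices of shape
    (neurons, datapoints); returns the mean canonical correlation."""
    def svd_reduce(acts):
        # Center each neuron, then keep the top singular directions
        # explaining `keep` of the variance.
        acts = acts - acts.mean(axis=1, keepdims=True)
        _, s, Vt = np.linalg.svd(acts, full_matrices=False)
        cum = np.cumsum(s**2) / np.sum(s**2)
        k = int(np.searchsorted(cum, keep)) + 1
        return s[:k, None] * Vt[:k]  # reduced data, shape (k, datapoints)

    def whiten(Z):
        # Decorrelate the reduced directions so the SVD of the
        # cross-covariance yields canonical correlations directly.
        cov = Z @ Z.T / Z.shape[1]
        evals, evecs = np.linalg.eigh(cov)
        evals = np.clip(evals, 1e-10, None)
        return evecs @ np.diag(evals**-0.5) @ evecs.T @ Z

    Xw = whiten(svd_reduce(acts1))
    Yw = whiten(svd_reduce(acts2))
    corr = np.linalg.svd(Xw @ Yw.T / Xw.shape[1], compute_uv=False)
    return float(np.mean(np.clip(corr, 0.0, 1.0)))
```

A layer compared with itself scores ~1, while layers with unrelated representations score much lower, which is the signal the paper uses to track cross-layer dependencies.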
Related papers
- The Importance of Downstream Networks in Digital Pathology Foundation Models [1.689369173057502]
We evaluate seven feature extractor models across three different datasets with 162 different aggregation model configurations.
We find that the performance of many current feature extractor models is notably similar.
arXiv Detail & Related papers (2023-11-29T16:54:25Z)
- Causal Disentangled Variational Auto-Encoder for Preference Understanding in Recommendation [50.93536377097659]
This paper introduces the Causal Disentangled Variational Auto-Encoder (CaD-VAE), a novel approach for learning causal disentangled representations from interaction data in recommender systems.
The approach utilizes structural causal models to generate causal representations that describe the causal relationship between latent factors.
arXiv Detail & Related papers (2023-04-17T00:10:56Z)
- Variable Importance Matching for Causal Inference [73.25504313552516]
We describe a general framework called Model-to-Match that achieves these goals.
Model-to-Match uses variable importance measurements to construct a distance metric.
We operationalize the Model-to-Match framework with LASSO.
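The idea of turning variable importance into a distance metric for matching can be sketched as follows. This is a simplified illustration, not the paper's implementation: the importance weights are assumed to be given (e.g. LASSO coefficient magnitudes), and each treated unit is matched to its nearest control under the weighted distance:

```python
import numpy as np

def weighted_match(X_treated, X_control, importance):
    """Match each treated unit to its nearest control unit under a
    variable-importance-weighted Euclidean distance."""
    w = np.asarray(importance, dtype=float)
    w = w / w.sum()  # normalize importance weights
    # Pairwise weighted distances, shape (n_treated, n_control).
    diffs = X_treated[:, None, :] - X_control[None, :, :]
    dist = np.sqrt((w * diffs**2).sum(axis=-1))
    return dist.argmin(axis=1)  # matched control index per treated unit
```

Down-weighting unimportant covariates this way means matches are driven by the variables that actually predict the outcome, which is the motivation the summary describes.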
arXiv Detail & Related papers (2023-02-23T00:43:03Z)
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Insights on Neural Representations for End-to-End Speech Recognition [28.833851817220616]
End-to-end automatic speech recognition (ASR) models aim to learn a generalised speech representation.
Network similarities have previously been investigated with correlation analysis techniques, but such analyses have not been explored for End-to-End ASR models.
This paper analyses and explores the internal dynamics between layers during training with CNN, LSTM and Transformer based approaches.
arXiv Detail & Related papers (2022-05-19T10:19:32Z)
- Temporal Relevance Analysis for Video Action Models [70.39411261685963]
We first propose a new approach to quantify the temporal relationships between frames captured by CNN-based action models.
We then conduct comprehensive experiments and in-depth analysis to provide a better understanding of how temporal modeling is affected.
arXiv Detail & Related papers (2022-04-25T19:06:48Z)
- Layer-wise Analysis of a Self-supervised Speech Representation Model [26.727775920272205]
Self-supervised learning approaches have been successful for pre-training speech representation models.
Not much has been studied about the type or extent of information encoded in the pre-trained representations themselves.
arXiv Detail & Related papers (2021-07-10T02:13:25Z)
- Evaluating the Impact of a Hierarchical Discourse Representation on Entity Coreference Resolution Performance [3.7277082975620797]
In this work, we leverage automatically constructed discourse parse trees within a neural approach.
We demonstrate a significant improvement on two benchmark entity coreference-resolution datasets.
arXiv Detail & Related papers (2021-04-20T19:14:57Z)
- Transforming Feature Space to Interpret Machine Learning Models [91.62936410696409]
This contribution proposes a novel approach that interprets machine-learning models through the lens of feature space transformations.
It can be used to enhance unconditional as well as conditional post-hoc diagnostic tools.
A case study on remote-sensing landcover classification with 46 features is used to demonstrate the potential of the proposed approach.
arXiv Detail & Related papers (2021-04-09T10:48:11Z)
- Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution [27.48620879003556]
We present a new learning framework that automatically learns the latent relationships of AUs via establishing semantic correspondences between feature maps.
In the heatmap regression-based network, feature maps preserve rich semantic information associated with AU intensities and locations.
This motivates us to model the correlation among feature channels, which implicitly represents the co-occurrence relationship of AU intensity levels.
arXiv Detail & Related papers (2020-04-20T23:55:30Z)
- Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study [81.11161697133095]
We take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives.
Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models.
As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers.
arXiv Detail & Related papers (2020-01-12T04:33:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.