Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
- URL: http://arxiv.org/abs/2208.05782v1
- Date: Wed, 10 Aug 2022 06:56:58 GMT
- Title: Comparison and Analysis of New Curriculum Criteria for End-to-End ASR
- Authors: Georgios Karakasidis, Tamás Grósz, Mikko Kurimo
- Abstract summary: Curriculum Learning is built on the observation that organized and structured assimilation of knowledge can enable faster training and better comprehension.
We employ Curriculum Learning in the context of Automatic Speech Recognition.
To impose structure on the training set, we explored multiple scoring functions that either use feedback from an external neural network or incorporate feedback from the model itself.
- Score: 10.698093106994804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is common knowledge that the quantity and quality of the training data
play a significant role in the creation of a good machine learning model. In
this paper, we take it one step further and demonstrate that the way the
training examples are arranged is also of crucial importance. Curriculum
Learning is built on the observation that organized and structured assimilation
of knowledge can enable faster training and better
comprehension. When humans learn to speak, they first try to utter basic phones
and then gradually move towards more complex structures such as words and
sentences. This methodology is known as Curriculum Learning, and we employ it
in the context of Automatic Speech Recognition. We hypothesize that end-to-end
models can achieve better performance when provided with an organized training
set consisting of examples that exhibit an increasing level of difficulty (i.e.
a curriculum). To impose structure on the training set and to define the notion
of an easy example, we explored multiple scoring functions that either use
feedback from an external neural network or incorporate feedback from the model
itself. Empirical results show that different curricula let us trade off
training time against the network's performance.
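The paper itself provides no code here, but the core mechanism is easy to sketch. Below is a minimal Python sketch (all names are illustrative, not the authors') that orders a training set by a pluggable difficulty score, which could come from an external scoring network or from the model's own loss:

```python
def build_curriculum(examples, score_fn):
    """Return indices of `examples` ordered from easy to hard.

    `score_fn` maps an example to a difficulty score. In the paper's
    setting this could be the loss assigned by an external pretrained
    ASR model (external feedback) or by the model currently being
    trained (self-feedback); lower score = easier example.
    """
    return sorted(range(len(examples)), key=lambda i: score_fn(examples[i]))

# Illustrative stand-in: utterance length as a crude difficulty proxy.
utterances = ["hi", "good morning", "she sells sea shells by the sea shore"]
order = build_curriculum(utterances, score_fn=len)
print(order)  # [0, 1, 2] -- shortest (easiest) utterance first
```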
Related papers
- Punctuation Restoration Improves Structure Understanding without Supervision [6.4736137270915215]
We show that punctuation restoration as a learning objective improves in- and out-of-distribution performance on structure-related tasks.
Punctuation restoration is an effective learning objective that can improve structure understanding and yield more robust, structure-aware representations of natural language.
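As a rough illustration of the objective (a hypothetical token-level tagging formulation, not necessarily the paper's exact setup), a training pair can be built by stripping punctuation and labeling which mark followed each remaining token:

```python
import re

def punctuation_restoration_pair(sentence):
    """Build one training pair: input tokens with punctuation stripped,
    and a label per token naming the punctuation mark (if any) that
    followed it. Illustrative only.
    """
    tokens = re.findall(r"\w+|[.,;!?]", sentence)
    inputs, labels = [], []
    for tok in tokens:
        if tok in ".,;!?":
            if labels:
                labels[-1] = tok   # punctuation attaches to the prior token
        else:
            inputs.append(tok)
            labels.append("O")     # "O" = no punctuation follows this token
    return inputs, labels

print(punctuation_restoration_pair("Hello, world. How are you?"))
# (['Hello', 'world', 'How', 'are', 'you'], [',', '.', 'O', 'O', '?'])
```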
arXiv Detail & Related papers (2024-02-13T11:22:52Z)
- A Quantitative Approach to Predicting Representational Learning and Performance in Neural Networks [5.544128024203989]
A key property of neural networks is how they learn to represent and manipulate input information in order to solve a task.
We introduce a new pseudo-kernel based tool for analyzing and predicting learned representations.
arXiv Detail & Related papers (2023-07-14T18:39:04Z)
- The Learnability of In-Context Learning [16.182561312622315]
We propose a first-of-its-kind PAC-based framework for in-context learnability.
Our framework includes an initial pretraining phase, which fits a function to the pretraining distribution.
We show that in-context learning is more about identifying the task than about learning it.
arXiv Detail & Related papers (2023-03-14T13:28:39Z)
- An Analytical Theory of Curriculum Learning in Teacher-Student Networks [10.303947049948107]
In humans and animals, curriculum learning is critical to rapid learning and effective pedagogy.
In machine learning, curricula are not widely used and empirically often yield only moderate benefits.
arXiv Detail & Related papers (2021-06-15T11:48:52Z)
- Curriculum Learning: A Survey [65.31516318260759]
Curriculum learning strategies have been successfully employed in all areas of machine learning.
We construct a taxonomy of curriculum learning approaches by hand, considering various classification criteria.
We build a hierarchical tree of curriculum learning methods using an agglomerative clustering algorithm.
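To make the clustering step concrete (a hypothetical sketch, not the survey's actual pipeline or data), one could embed short method descriptions and merge them bottom-up into a hierarchy:

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy descriptions of curriculum-learning methods (illustrative only).
methods = [
    "order examples by teacher network loss",
    "order examples by sentence length",
    "self-paced learning driven by the model's own loss",
    "anti-curriculum that presents hard examples first",
]

# Embed each description, then cluster agglomeratively (bottom-up merges).
vectors = TfidfVectorizer().fit_transform(methods).toarray()
clustering = AgglomerativeClustering(n_clusters=2, linkage="average")
print(clustering.fit(vectors).labels_)  # cluster assignment per method
```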
arXiv Detail & Related papers (2021-01-25T20:08:32Z)
- Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
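The paper defines the RC network's exact objective; as a rough illustration only, a generic margin-based ranking loss over responses ordered best-to-worst might look like this PyTorch sketch (all names hypothetical):

```python
import torch
import torch.nn.functional as F

def ranking_contrastive_loss(query, responses, margin=0.1):
    """Margin-based ranking loss over responses ordered best-to-worst:
    each response is pushed to score at least `margin` higher than the
    next-worse one. A generic sketch, not the paper's exact RC network.
    """
    scores = F.cosine_similarity(query.unsqueeze(0), responses)  # shape (n,)
    loss = torch.zeros(())
    for better, worse in zip(scores[:-1], scores[1:]):
        loss = loss + F.relu(margin - (better - worse))
    return loss

# Toy example: one query embedding, three responses of decreasing quality.
query = torch.randn(16)
responses = torch.randn(3, 16)
print(ranking_contrastive_loss(query, responses))
```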
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
- Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
- A Competence-aware Curriculum for Visual Concepts Learning via Question Answering [95.35905804211698]
We propose a competence-aware curriculum for visual concept learning in a question-answering manner.
We design a neural-symbolic concept learner for learning the visual concepts and a multi-dimensional Item Response Theory (mIRT) model for guiding the learning process.
Experimental results on CLEVR show that with a competence-aware curriculum, the proposed method achieves state-of-the-art performances.
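For readers unfamiliar with Item Response Theory, the standard scalar three-parameter (3PL) model on which mIRT builds gives the probability of a correct response as follows; the paper's mIRT generalizes ability and difficulty to multiple dimensions:

```python
import math

def irt_3pl(theta, a, b, g):
    """Probability of a correct response under the scalar 3PL model:
    ability `theta`, item discrimination `a`, difficulty `b`, and
    guessing rate `g`. mIRT generalizes `theta` and `b` to vectors.
    """
    return g + (1.0 - g) / (1.0 + math.exp(-a * (theta - b)))

# A learner of ability 1.0 on an item of average difficulty.
print(irt_3pl(theta=1.0, a=1.5, b=0.0, g=0.25))  # ~0.86
```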
arXiv Detail & Related papers (2020-07-03T05:08:09Z)
- Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition [48.06319154279427]
We present a method of instance-based learning that learns similarities between spans.
Our method makes it possible to build models that are highly interpretable without sacrificing performance.
arXiv Detail & Related papers (2020-04-29T23:32:42Z)
- Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
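The reduction can be made concrete with a toy sketch (illustrative only, not the paper's experimental setup): summarize each task's data set as a feature vector, treat the corresponding target model's parameters as the label, and fit any off-the-shelf regressor:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Each row summarizes one task's training set (feature); each target row
# holds the parameters of a model known to solve that task (label).
task_features = rng.normal(size=(20, 8))
target_params = rng.normal(size=(20, 4))

# Under the reduction, meta-learning is ordinary supervised learning.
meta_learner = KNeighborsRegressor(n_neighbors=3)
meta_learner.fit(task_features, target_params)

new_task = rng.normal(size=(1, 8))
print(meta_learner.predict(new_task))  # predicted parameters for the new task
```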
arXiv Detail & Related papers (2020-02-03T06:13:01Z)