Flexible model composition in machine learning and its implementation in
MLJ
- URL: http://arxiv.org/abs/2012.15505v1
- Date: Thu, 31 Dec 2020 08:49:43 GMT
- Title: Flexible model composition in machine learning and its implementation in
MLJ
- Authors: Anthony D. Blaom and Sebastian J. Vollmer
- Abstract summary: A graph-based protocol called `learning networks', which combines assorted machine learning models into meta-models, is described.
It is shown that learning networks are sufficiently flexible to include Wolpert's model stacking, with out-of-sample predictions for the base learners.
- Score: 1.1091975655053545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A graph-based protocol called `learning networks', which combines assorted
machine learning models into meta-models, is described. Learning networks are
shown to overcome several limitations of model composition as implemented in
the dominant machine learning platforms. After illustrating the protocol in
simple examples, a concise syntax for specifying a learning network,
implemented in the MLJ framework, is presented. Using the syntax, it is shown
that learning networks are sufficiently flexible to include Wolpert's model
stacking, with out-of-sample predictions for the base learners.
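For readers unfamiliar with MLJ, the sketch below illustrates the flavour of a learning network in Julia, MLJ's host language. It is not reproduced from the paper: the particular component models (a Standardizer and a RidgeRegressor loaded from MLJLinearModels) and the synthetic data are assumptions, but the node-and-machine pattern (source, machine, transform, predict, fit!) follows MLJ's documented learning-network interface.

```julia
using MLJ  # assumes MLJ and MLJLinearModels are installed

# Toy regression data from MLJ's built-in generator.
X, y = make_regression(100, 3)

# Wrap the data in source nodes: the entry points of the network graph.
Xs = source(X)
ys = source(y)

# Load a component model (the choice of RidgeRegressor is illustrative only).
Ridge = @load RidgeRegressor pkg=MLJLinearModels verbosity=0

# Binding a model to nodes gives a machine; operations on machines return
# new nodes, so the network is built up lazily, like a computation graph.
mach1 = machine(Standardizer(), Xs)
W = transform(mach1, Xs)        # node delivering standardized features

mach2 = machine(Ridge(), W, ys)
yhat = predict(mach2, W)        # terminal node of the network

fit!(yhat)   # trains every machine upstream of yhat, in dependency order
yhat()       # calling the node returns predictions on the training data
```

Calling yhat(Xnew) pushes new data through the same graph, and after a hyperparameter change a subsequent fit! retrains only the machines affected by it. The further step described in the paper, exporting such a network as a stand-alone composite model type, is omitted from this sketch.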
Related papers
- tn4ml: Tensor Network Training and Customization for Machine Learning [0.8799686507544172]
tn4ml is a novel library designed to seamlessly integrate Tensor Networks into Machine Learning tasks.
Inspired by existing Machine Learning frameworks, the library offers a user-friendly structure with modules for data embedding, objective function definition, and model training.
arXiv Detail & Related papers (2025-02-18T17:57:29Z)
- DiagrammaticLearning: A Graphical Language for Compositional Training Regimes [39.26058251942536]
A learning diagram compiles to a unique loss function on which component models are trained.
We show that a number of popular learning setups can be depicted as learning diagrams.
arXiv Detail & Related papers (2025-01-02T19:44:36Z)
- Aggregated f-average Neural Network for Interpretable Ensembling [25.818919790407016]
We introduce an aggregated f-average (AFA) shallow neural network which models and combines different types of averages to perform an optimal aggregation of the weak learners' predictions.
We emphasise its interpretable architecture and simple training strategy, and illustrate its good performance on the problem of few-shot class incremental learning.
arXiv Detail & Related papers (2023-10-09T09:43:08Z)
- Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Knowledge Transfer For On-Device Speech Emotion Recognition with Neural Structured Learning [19.220263739291685]
Speech emotion recognition (SER) has been a popular research topic in human-computer interaction (HCI).
We propose a neural structured learning (NSL) framework through building synthesized graphs.
Our experiments demonstrate that training a lightweight SER model on the target dataset with speech samples and graphs can not only produce small SER models, but also enhance the model performance.
arXiv Detail & Related papers (2022-10-26T18:38:42Z)
- An Expectation-Maximization Perspective on Federated Learning [75.67515842938299]
Federated learning describes the distributed training of models across multiple clients while keeping the data private on-device.
In this work, we view the server-orchestrated federated learning process as a hierarchical latent variable model where the server provides the parameters of a prior distribution over the client-specific model parameters.
We show that with simple Gaussian priors and a hard version of the well known Expectation-Maximization (EM) algorithm, learning in such a model corresponds to FedAvg, the most popular algorithm for the federated learning setting.
arXiv Detail & Related papers (2021-11-19T12:58:59Z)
- CARLS: Cross-platform Asynchronous Representation Learning System [24.96062146968367]
We propose CARLS, a novel framework for augmenting the capacity of existing deep learning frameworks.
We describe three learning paradigms that can be scaled up efficiently by CARLS.
arXiv Detail & Related papers (2021-05-26T21:19:02Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Explanation-Guided Training for Cross-Domain Few-Shot Classification [96.12873073444091]
Cross-domain few-shot classification task (CD-FSC) combines few-shot classification with the requirement to generalize across domains represented by datasets.
We introduce a novel training approach for existing FSC models.
We show that explanation-guided training effectively improves the model generalization.
arXiv Detail & Related papers (2020-07-17T07:28:08Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)