DiagrammaticLearning: A Graphical Language for Compositional Training Regimes
- URL: http://arxiv.org/abs/2501.01515v1
- Date: Thu, 02 Jan 2025 19:44:36 GMT
- Title: DiagrammaticLearning: A Graphical Language for Compositional Training Regimes
- Authors: Mason Lary, Richard Samuelson, Alexander Wilentz, Alina Zare, Matthew Klawonn, James P. Fairbanks
- Abstract summary: A learning diagram compiles to a unique loss function on which component models are trained.
We show that a number of popular learning setups can be depicted as learning diagrams.
- Score: 39.26058251942536
- License:
- Abstract: Motivated by deep learning regimes with multiple interacting yet distinct model components, we introduce learning diagrams, graphical depictions of training setups that capture parameterized learning as data rather than code. A learning diagram compiles to a unique loss function on which component models are trained. The result of training on this loss is a collection of models whose predictions "agree" with one another. We show that a number of popular learning setups such as few-shot multi-task learning, knowledge distillation, and multi-modal learning can be depicted as learning diagrams. We further implement learning diagrams in a library that allows users to build diagrams of PyTorch and Flux.jl models. By implementing some classic machine learning use cases, we demonstrate how learning diagrams allow practitioners to build complicated models as compositions of smaller components, identify relationships between workflows, and manipulate models during or after training. Leveraging a category theoretic framework, we introduce a rigorous semantics for learning diagrams that puts such operations on a firm mathematical foundation.
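The paper's library itself is not reproduced here; the following is a minimal sketch, in plain PyTorch, of the idea the abstract describes: a diagram whose edges are parameterized models is stored as data and compiled to a single loss penalizing disagreement between parallel paths. The `Diagram` class, `compile_loss`, and the distillation-shaped example are hypothetical illustrations, not the authors' API.

```python
# A minimal, hypothetical sketch (not the paper's library): a "learning
# diagram" stored as data, compiled to a single loss that makes parallel
# paths through the diagram agree.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Diagram:
    """Edges are parameterized maps between named data nodes."""
    def __init__(self):
        self.edges = {}                    # edge name -> (src, dst, module)

    def add_edge(self, name, src, dst, module):
        self.edges[name] = (src, dst, module)

    def run_path(self, x, edge_names):
        for name in edge_names:
            _, _, f = self.edges[name]
            x = f(x)
        return x

    def compile_loss(self, parallel_paths, distance=F.mse_loss):
        """Sum of disagreements between pairs of paths that share a source."""
        def loss_fn(batch):
            total = 0.0
            for src, path_a, path_b in parallel_paths:
                out_a = self.run_path(batch[src], path_a)
                out_b = self.run_path(batch[src], path_b)
                total = total + distance(out_a, out_b)
            return total
        return loss_fn

# Knowledge-distillation-style diagram: a teacher path and a student path
# from the same input node must agree at a shared output node.
teacher = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
student = nn.Linear(16, 4)

diagram = Diagram()
diagram.add_edge("teacher", "x", "z", teacher)
diagram.add_edge("student", "x", "z", student)
loss_fn = diagram.compile_loss([("x", ["teacher"], ["student"])])

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
batch = {"x": torch.randn(8, 16)}
loss = loss_fn(batch)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Minimizing the compiled loss leaves a collection of models whose predictions agree, which is the training behaviour the abstract attributes to learning diagrams; the category-theoretic semantics that grounds these operations is not modeled in this sketch.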
Related papers
- Premonition: Using Generative Models to Preempt Future Data Changes in
Continual Learning [63.850451635362425]
Continual learning requires a model to adapt to ongoing changes in the data distribution.
We show that the combination of a large language model and an image generation model can similarly provide useful premonitions.
We find that the backbone of our pre-trained networks can learn representations useful for the downstream continual learning problem.
arXiv Detail & Related papers (2024-03-12T06:29:54Z)
- Sequential Modeling Enables Scalable Learning for Large Vision Models [120.91839619284431]
We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.
We define a common format, "visual sentences", in which we can represent raw images and videos as well as annotated data sources.
arXiv Detail & Related papers (2023-12-01T18:59:57Z)
- Class-level Structural Relation Modelling and Smoothing for Visual Representation Learning [12.247343963572732]
This paper presents a framework termed Class-level Structural Relation Modeling and Smoothing for Visual Representation Learning (CSRMS).
It includes the Class-level Relation Modelling, Class-aware Graph-Guided Sampling, and Graph-Guided Representation Learning modules.
Experiments demonstrate the effectiveness of structured knowledge modelling for enhanced representation learning and show that CSRMS can be incorporated with any state-of-the-art visual representation learning models for performance gains.
arXiv Detail & Related papers (2023-08-08T09:03:46Z)
- Distilling Knowledge from Self-Supervised Teacher by Embedding Graph Alignment [52.704331909850026]
We formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network.
Inspired by the spirit of instance discrimination in self-supervised learning, we model the instance-instance relations by a graph formulation in the feature embedding space.
Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks.
arXiv Detail & Related papers (2022-11-23T19:27:48Z)
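The exact loss is not given in the summary above; the following is a rough sketch, under the assumption that the "instance-instance relations" form a batch-level similarity graph, of aligning teacher and student embedding graphs. The function names are hypothetical, not the paper's.

```python
# Rough sketch (not the paper's exact loss): distillation by aligning
# instance-instance similarity graphs built in the teacher and student
# embedding spaces for the same batch.
import torch
import torch.nn.functional as F

def similarity_graph(embeddings):
    """Cosine-similarity graph over a batch of embeddings, shape (B, B)."""
    z = F.normalize(embeddings, dim=1)
    return z @ z.t()

def graph_alignment_loss(teacher_embeddings, student_embeddings):
    target = similarity_graph(teacher_embeddings).detach()   # frozen teacher
    return F.mse_loss(similarity_graph(student_embeddings), target)

# Stand-ins for a self-supervised teacher and a smaller student encoder:
teacher_out = torch.randn(32, 256)
student_out = torch.randn(32, 128, requires_grad=True)
loss = graph_alignment_loss(teacher_out, student_out)
loss.backward()
```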
- An Algebraic Framework for Stock & Flow Diagrams and Dynamical Systems Using Category Theory [2.030738254233949]
In this chapter, rather than focusing on the underlying mathematics, we informally use communicable disease examples created with the implemented software, StockFlow.jl.
We first characterize categorical stock & flow diagrams, and note the clear separation between the syntax of stock & flow diagrams and their semantics.
Applying category theory, these frameworks can build large diagrams from smaller ones in a modular fashion.
arXiv Detail & Related papers (2022-11-01T16:15:54Z)
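StockFlow.jl is a Julia library and its API is not shown here; the generic Python sketch below (all names hypothetical) only illustrates the modular point made above: a larger stock & flow model assembled from smaller diagrams that share stocks.

```python
# Generic illustration (not StockFlow.jl's API): two small stock & flow
# diagrams are composed by gluing them along shared stocks, then simulated
# with simple Euler steps.
def compose(*diagrams):
    """Union of stocks and flows; stocks with the same name are identified."""
    stocks, flows = set(), []
    for d in diagrams:
        stocks |= set(d["stocks"])
        flows += d["flows"]
    return {"stocks": stocks, "flows": flows}

# Each flow: (source stock, target stock, rate as a function of the state).
infection = {"stocks": ["S", "I"],
             "flows": [("S", "I", lambda s: 0.3 * s["S"] * s["I"] / 1000.0)]}
recovery = {"stocks": ["I", "R"],
            "flows": [("I", "R", lambda s: 0.1 * s["I"])]}

sir = compose(infection, recovery)            # SIR built from two pieces

state = {"S": 990.0, "I": 10.0, "R": 0.0}
for _ in range(100):                          # Euler integration, dt = 1
    delta = {stock: 0.0 for stock in sir["stocks"]}
    for src, dst, rate in sir["flows"]:
        amount = rate(state)
        delta[src] -= amount
        delta[dst] += amount
    for stock in sir["stocks"]:
        state[stock] += delta[stock]
print(state)
```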
- Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks
In this paper, we explore a paradigm that does not require training to obtain new models.
Similar to how CNNs were inspired by receptive fields in the biological visual system, we propose Model Disassembling and Assembling.
For model assembling, we present the alignment padding strategy and parameter scaling strategy to construct a new model tailored for a specific task.
arXiv Detail & Related papers (2022-03-25T05:27:28Z)
- Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z)
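The paper's specific regularizer is not spelled out in the summary above; the sketch below shows one common way to incorporate graph information between labels, a Laplacian penalty that pulls together classifier weights of labels adjacent in a label graph. It is an assumption-based stand-in, not the paper's method.

```python
# Hedged stand-in (not the paper's method): a graph-Laplacian penalty that
# pulls together classifier weights of labels that are adjacent in a label
# graph, added on top of an ordinary classification loss.
import torch
import torch.nn.functional as F

def graph_regularizer(class_weights, adjacency):
    """(1/2) * sum_ij A_ij * ||w_i - w_j||^2, written via the graph Laplacian."""
    laplacian = torch.diag(adjacency.sum(dim=1)) - adjacency
    return torch.trace(class_weights.t() @ laplacian @ class_weights)

num_classes, dim = 5, 64
W = torch.randn(num_classes, dim, requires_grad=True)     # linear classifier
A = torch.rand(num_classes, num_classes)
A = (A + A.t()) / 2                                        # symmetric label graph

features = torch.randn(8, dim)                             # e.g. a few-shot episode
targets = torch.randint(0, num_classes, (8,))
loss = F.cross_entropy(features @ W.t(), targets) + 0.1 * graph_regularizer(W, A)
loss.backward()
```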
- Flexible model composition in machine learning and its implementation in MLJ [1.1091975655053545]
A graph-based protocol called 'learning networks', which combines assorted machine learning models into meta-models, is described.
It is shown that learning networks are sufficiently flexible to include Wolpert's model stacking, with out-of-sample predictions for the base learners.
arXiv Detail & Related papers (2020-12-31T08:49:43Z)
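MLJ's learning-network syntax is Julia code and is not reproduced here; the Python sketch below illustrates only the Wolpert-style stacking mentioned above, with out-of-fold predictions of the base learners used as training features for a meta-model. The `stack` helper is hypothetical.

```python
# Generic sketch of Wolpert-style stacking (not MLJ's learning-network
# syntax): base learners are fit on cross-validation folds so that the
# meta-learner only ever sees their out-of-sample predictions.
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

def stack(base_models, meta_model, X, y, n_splits=5):
    oof = np.zeros((len(X), len(base_models)))             # out-of-fold preds
    for j, model in enumerate(base_models):
        for train_idx, val_idx in KFold(n_splits).split(X):
            fitted = clone(model).fit(X[train_idx], y[train_idx])
            oof[val_idx, j] = fitted.predict(X[val_idx])
    bases = [clone(m).fit(X, y) for m in base_models]       # refit on all data
    meta = clone(meta_model).fit(oof, y)
    return bases, meta

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=200)

bases, meta = stack([Ridge(alpha=1.0), DecisionTreeRegressor(max_depth=3)],
                    Ridge(alpha=1.0), X, y)
X_new = rng.normal(size=(5, 4))
print(meta.predict(np.column_stack([m.predict(X_new) for m in bases])))
```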
- Small-Group Learning, with Application to Neural Architecture Search [17.86826990290058]
In human learning, a small group of students works together towards the same learning objective: they express their understanding of a topic to their peers, compare their ideas, and help each other troubleshoot problems.
In this paper, we aim to investigate whether this human learning method can be borrowed to train better machine learning models, by developing a novel ML framework -- small-group learning (SGL).
SGL is formulated as a multi-level optimization framework consisting of three learning stages: each learner trains a model independently and uses this model to perform pseudo-labeling; each learner trains another model using datasets pseudo-
arXiv Detail & Related papers (2020-12-23T05:56:47Z)
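The summary above is truncated, so the following sketch covers only the first two stages and assumes, as in similar peer-learning setups, that the second-stage pseudo-labels come from a peer learner; it is not the paper's full multi-level optimization.

```python
# Partial sketch of the stages summarized above (the summary is truncated);
# the choice of peer-provided pseudo-labels in stage 2 is an assumption,
# not a statement of the paper's exact procedure.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_labeled = rng.normal(size=(100, 10))
y_labeled = (X_labeled[:, 0] > 0).astype(int)
X_unlabeled = rng.normal(size=(500, 10))

# Stage 1: each learner trains a model independently and pseudo-labels.
learner_a = LogisticRegression().fit(X_labeled, y_labeled)
learner_b = LogisticRegression(C=0.1).fit(X_labeled, y_labeled)
pseudo_a = learner_a.predict(X_unlabeled)
pseudo_b = learner_b.predict(X_unlabeled)

# Stage 2: each learner trains another model on its peer's pseudo-labels.
X_all = np.vstack([X_labeled, X_unlabeled])
second_a = LogisticRegression().fit(X_all, np.concatenate([y_labeled, pseudo_b]))
second_b = LogisticRegression(C=0.1).fit(X_all, np.concatenate([y_labeled, pseudo_a]))
```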
- Towards Interpretable Multi-Task Learning Using Bilevel Programming [18.293397644865454]
Interpretable Multi-Task Learning can be expressed as learning a sparse graph of the task relationship based on the prediction performance of the learned models.
We show empirically how the induced sparse graph improves the interpretability of the learned models and their relationship on synthetic and real data, without sacrificing generalization performance.
arXiv Detail & Related papers (2020-09-11T15:04:27Z)
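The bilevel formulation itself is not reproduced; the sketch below is a simplified stand-in that first fits per-task models and then learns a sparse matrix relating their weights, which is one way to obtain an interpretable task-relationship graph. The two-stage shortcut and all names are assumptions for illustration only.

```python
# Simplified stand-in (the paper uses bilevel programming): fit per-task
# models first, then learn a sparse matrix A that expresses each task's
# weights in terms of the other tasks', giving an interpretable
# task-relationship graph. The two-stage shortcut is an assumption.
import torch

num_tasks, dim, n = 4, 8, 64
Xs = [torch.randn(n, dim) for _ in range(num_tasks)]
W_true = torch.randn(num_tasks, dim)
W_true[1] = W_true[0] + 0.05 * torch.randn(dim)    # tasks 0 and 1 are related
ys = [Xs[t] @ W_true[t] + 0.1 * torch.randn(n) for t in range(num_tasks)]

# Stage 1: per-task least-squares fits (driven by prediction performance).
W = torch.stack([torch.linalg.lstsq(Xs[t], ys[t].unsqueeze(1)).solution.squeeze(1)
                 for t in range(num_tasks)])

# Stage 2: sparse self-representation of task weights, W_i ~ sum_j A_ij W_j.
A = torch.zeros(num_tasks, num_tasks, requires_grad=True)
mask = 1.0 - torch.eye(num_tasks)                  # no self-edges
optimizer = torch.optim.Adam([A], lr=1e-2)
for _ in range(500):
    reconstruction = (A * mask) @ W
    loss = ((reconstruction - W) ** 2).sum() + 0.05 * (A * mask).abs().sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print((A.detach().abs() * mask > 0.05).int())      # learned sparse task graph
```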
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.