From Charts to Atlas: Merging Latent Spaces into One
- URL: http://arxiv.org/abs/2311.06547v1
- Date: Sat, 11 Nov 2023 11:51:41 GMT
- Title: From Charts to Atlas: Merging Latent Spaces into One
- Authors: Donato Crisostomi, Irene Cannistraci, Luca Moschella, Pietro Barbiero,
Marco Ciccone, Pietro Liò, Emanuele Rodolà
- Abstract summary: Models trained on semantically related datasets and tasks exhibit comparable inter-sample relations within their latent spaces.
We introduce Relative Latent Space Aggregation, a two-step approach that first renders the spaces comparable using relative representations, and then aggregates them via a simple mean.
We compare the aggregated space with that derived from an end-to-end model trained over all tasks and show that the two spaces are similar.
- Score: 15.47502439734611
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Models trained on semantically related datasets and tasks exhibit comparable
inter-sample relations within their latent spaces. We investigate in this study
the aggregation of such latent spaces to create a unified space encompassing
the combined information. To this end, we introduce Relative Latent Space
Aggregation, a two-step approach that first renders the spaces comparable using
relative representations, and then aggregates them via a simple mean. We
carefully divide a classification problem into a series of learning tasks under
three different settings: sharing samples, classes, or neither. We then train a
model on each task and aggregate the resulting latent spaces. We compare the
aggregated space with that derived from an end-to-end model trained over all
tasks and show that the two spaces are similar. We then observe that the
aggregated space is better suited for classification, and empirically
demonstrate that it is due to the unique imprints left by task-specific
embedders within the representations. We finally test our framework in
scenarios where no shared region exists and show that it can still be used to
merge the spaces, albeit with diminished benefits over naive merging.
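
The two-step procedure described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes cosine-similarity relative representations computed against a set of anchor samples shared across the spaces, and the function names (`relative_representation`, `aggregate_spaces`) are illustrative.

```python
import numpy as np

def relative_representation(z, anchors):
    """Re-express absolute embeddings as cosine similarities w.r.t. anchors."""
    z_n = z / np.linalg.norm(z, axis=1, keepdims=True)
    a_n = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return z_n @ a_n.T  # shape: (n_samples, n_anchors)

def aggregate_spaces(latent_spaces, anchor_idx):
    """Step 1: make each space comparable via relative representations.
    Step 2: aggregate the now-aligned spaces with a simple mean."""
    rel = [relative_representation(z, z[anchor_idx]) for z in latent_spaces]
    return np.mean(rel, axis=0)

rng = np.random.default_rng(0)
# two toy "latent spaces" over the same 100 samples, 16-dim each
spaces = [rng.normal(size=(100, 16)) for _ in range(2)]
merged = aggregate_spaces(spaces, anchor_idx=np.arange(10))
print(merged.shape)  # (100, 10): one coordinate per anchor
```

Because every coordinate of the merged space is a mean of cosine similarities, it stays in [-1, 1] and is invariant to rotations of the original latent spaces, which is what makes the per-model spaces comparable before averaging.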
Related papers
- Learning from Semi-Factuals: A Debiased and Semantic-Aware Framework for
Generalized Relation Discovery [12.716874398564482]
Generalized Relation Discovery (GRD) aims to identify unlabeled instances in existing pre-defined relations or discover novel relations.
We propose a novel framework, SFGRD, for this task by learning from semi-factuals in two stages.
SFGRD surpasses state-of-the-art models in terms of accuracy by 2.36% to 5.78% and cosine similarity by 32.19% to 84.45%.
arXiv Detail & Related papers (2024-01-12T02:38:55Z)
- Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion [86.6191592951269]
Merging models fine-tuned from a common, extensively pretrained large model but specialized for different tasks has been demonstrated as a cheap and scalable strategy to construct a multi-task model that performs well across diverse tasks.
We propose the Concrete (CONtinuous relaxation of disCRETE) subspace learning method to identify a common low-dimensional subspace and utilize its shared information to address the interference problem without sacrificing performance.
arXiv Detail & Related papers (2023-12-11T07:24:54Z)
- Merging by Matching Models in Task Parameter Subspaces [87.8712523378141]
Model merging aims to cheaply combine individual task-specific models into a single multitask model.
We formalize how this approach to model merging can be seen as solving a linear system of equations.
We show that using the conjugate gradient method can outperform closed-form solutions.
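
The linear-system framing above pairs naturally with a conjugate-gradient solver. The sketch below is a generic CG routine applied to a synthetic symmetric positive-definite system, not the authors' merging code; the matrix `A` merely stands in for whatever system the matching procedure produces.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
    """Iteratively solve Ax = b for symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)   # step size along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p  # conjugate direction update
        rs_old = rs_new
    return x

rng = np.random.default_rng(0)
M = rng.normal(size=(20, 20))
A = M @ M.T + 20 * np.eye(20)  # SPD system, as arises in least-squares-style merging
b = rng.normal(size=20)
x = conjugate_gradient(A, b)
print(np.allclose(A @ x, b, atol=1e-6))  # True
```

CG never forms an inverse, which is why it can scale to systems where a closed-form solution is too expensive to compute.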
arXiv Detail & Related papers (2023-12-07T14:59:15Z)
- Multi-view hierarchical Variational AutoEncoders with Factor Analysis latent space [67.60224656603823]
We propose a novel method to combine multiple Variational AutoEncoders with a Factor Analysis latent space.
We create an interpretable hierarchical dependency between private and shared information.
This way, the novel model is able to simultaneously: (i) learn from multiple heterogeneous views, (ii) obtain an interpretable hierarchical shared space, and (iii) perform transfer learning between generative models.
arXiv Detail & Related papers (2022-07-19T10:46:02Z)
- Embracing Structure in Data for Billion-Scale Semantic Product Search [14.962039276966319]
We present principled approaches to train and deploy dyadic neural embedding models at the billion scale.
We show that exploiting the natural structure of real-world datasets helps address both challenges efficiently.
arXiv Detail & Related papers (2021-10-12T16:14:13Z)
- Modelling Neighbor Relation in Joint Space-Time Graph for Video Correspondence Learning [53.74240452117145]
This paper presents a self-supervised method for learning reliable visual correspondence from unlabeled videos.
We formulate the correspondence as finding paths in a joint space-time graph, where nodes are grid patches sampled from frames, and are linked by two types of edges.
Our learned representation outperforms the state-of-the-art self-supervised methods on a variety of visual tasks.
arXiv Detail & Related papers (2021-09-28T05:40:01Z)
- UniRE: A Unified Label Space for Entity Relation Extraction [67.53850477281058]
Joint entity relation extraction models set up two separate label spaces for the two sub-tasks.
We argue that this setting may hinder the information interaction between entities and relations.
In this work, we propose to eliminate the separate treatment of the two sub-tasks' label spaces.
arXiv Detail & Related papers (2021-07-09T08:09:37Z)
- Cycle Registration in Persistent Homology with Applications in Topological Bootstrap [0.0]
We propose a novel approach for comparing the persistent homology representations of two spaces (filtrations).
We do so by defining a correspondence relation between individual persistent cycles of two different spaces.
Our matching of cycles is based on both the persistence intervals and the spatial placement of each feature.
arXiv Detail & Related papers (2021-01-03T20:12:00Z)
- A Boundary Based Out-of-Distribution Classifier for Generalized Zero-Shot Learning [83.1490247844899]
Generalized Zero-Shot Learning (GZSL) is a challenging topic that has promising prospects in many realistic scenarios.
We propose a boundary based Out-of-Distribution (OOD) classifier which classifies the unseen and seen domains by only using seen samples for training.
We extensively validate our approach on five popular benchmark datasets including AWA1, AWA2, CUB, FLO and SUN.
arXiv Detail & Related papers (2020-08-09T11:27:19Z)
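
As a rough illustration of the boundary-based idea, one could threshold a sample's distance to per-class centers estimated from seen data only. This is a simplified sketch under that assumption, not the paper's actual classifier (which operates in a learned latent space); all names here are illustrative.

```python
import numpy as np

def fit_class_boundaries(features, labels, quantile=0.95):
    """Per-class center plus a radius covering `quantile` of that class's seen samples."""
    centers, radii = {}, {}
    for c in np.unique(labels):
        fc = features[labels == c]
        centers[c] = fc.mean(axis=0)
        radii[c] = np.quantile(np.linalg.norm(fc - centers[c], axis=1), quantile)
    return centers, radii

def is_seen(x, centers, radii):
    """Classify x as 'seen' if it falls inside any class boundary; else route to unseen/OOD."""
    return any(np.linalg.norm(x - centers[c]) <= radii[c] for c in centers)

rng = np.random.default_rng(0)
# two well-separated synthetic "seen" classes in an 8-dim feature space
feats = np.vstack([rng.normal(0, 1, size=(50, 8)), rng.normal(6, 1, size=(50, 8))])
labels = np.array([0] * 50 + [1] * 50)
centers, radii = fit_class_boundaries(feats, labels)
print(is_seen(centers[0], centers, radii), is_seen(np.full(8, 100.0), centers, radii))  # True False
```

The key property mirrored here is that only seen samples are needed to fit the boundaries: anything outside every boundary is handed off to a separate unseen-class expert.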
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.