Related papers: Comparing the information content of probabilistic representation spaces

Comparing the information content of probabilistic representation spaces

URL: http://arxiv.org/abs/2405.21042v2
Date: Mon, 21 Oct 2024 17:50:10 GMT
Title: Comparing the information content of probabilistic representation spaces
Authors: Kieran A. Murphy, Sam Dillavou, Dani S. Bassett,
Abstract summary: Probabilistic representation spaces convey information about a dataset, and to understand the effects of factors such as training loss and network architecture, we seek to compare the information content of such spaces. Here, instead of building upon point-based measures of comparison, we build upon classic methods from literature on hard clustering. We propose a practical method of estimation that is based on fingerprinting a representation space with a sample of the dataset and is applicable when the communicated information is only a handful of bits.
Score: 3.7277730514654555
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Probabilistic representation spaces convey information about a dataset, and to understand the effects of factors such as training loss and network architecture, we seek to compare the information content of such spaces. However, most existing methods to compare representation spaces assume representations are points, and neglect the distributional nature of probabilistic representations. Here, instead of building upon point-based measures of comparison, we build upon classic methods from literature on hard clustering. We generalize two information-theoretic methods of comparing hard clustering assignments to be applicable to general probabilistic representation spaces. We then propose a practical method of estimation that is based on fingerprinting a representation space with a sample of the dataset and is applicable when the communicated information is only a handful of bits. With unsupervised disentanglement as a motivating problem, we find information fragments that are repeatedly contained in individual latent dimensions in VAE and InfoGAN ensembles. Then, by comparing the full latent spaces of models, we find highly consistent information content across datasets, methods, and hyperparameters, even though there is often a point during training with substantial variety across repeat runs. Finally, we leverage the differentiability of the proposed method and perform model fusion by synthesizing the information content of multiple weak learners, each incapable of representing the global structure of a dataset. Across the case studies, the direct comparison of information content provides a natural basis for understanding the processing of information.

Related papers

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence [91.54446789584826]
Epiplexity is a formalization of information capturing what computationally bounded observers can learn from data.<n>We show how information can be created with computation, how it depends on the ordering of the data, and how likelihood modeling can produce more complex programs than present in the data generating process itself.
arXiv Detail & Related papers (2026-01-06T18:04:03Z)
Osmotic Learning: A Self-Supervised Paradigm for Decentralized Contextual Data Representation [1.5329374712396715]
This paper introduces OSM-L, a self-supervised distributed learning paradigm designed to uncover higher-level latent knowledge from distributed data.<n>OSM-L iteratively aligns local data representations, enabling information diffusion and convergence.<n> Experimental results confirm OSM-L's convergence and representation capabilities on structured datasets.
arXiv Detail & Related papers (2025-12-28T22:25:16Z)
Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
What is different between these datasets? [20.706111458944502]
Two datasets from the same domain may exhibit differing distributions. We propose a versatile toolbox of interpretable methods for comparing datasets. These methods complement existing techniques by providing actionable and interpretable insights.
arXiv Detail & Related papers (2024-03-08T19:52:39Z)
Learning Representations without Compositional Assumptions [79.12273403390311]
We propose a data-driven approach that learns feature set dependencies by representing feature sets as graph nodes and their relationships as learnable edges. We also introduce LEGATO, a novel hierarchical graph autoencoder that learns a smaller, latent graph to aggregate information from multiple views dynamically.
arXiv Detail & Related papers (2023-05-31T10:36:10Z)
infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization. infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information. In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
Inv-SENnet: Invariant Self Expression Network for clustering under biased data [17.25929452126843]
We propose a novel framework for jointly removing unwanted attributes (biases) while learning to cluster data points in individual subspaces. Our experimental result on synthetic and real-world datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-11-13T01:19:06Z)
FUNCK: Information Funnels and Bottlenecks for Invariant Representation Learning [7.804994311050265]
We investigate a set of related information funnels and bottleneck problems that claim to learn invariant representations from the data. We propose a new element to this family of information-theoretic objectives: The Conditional Privacy Funnel with Side Information. Given the generally intractable objectives, we derive tractable approximations using amortized variational inference parameterized by neural networks.
arXiv Detail & Related papers (2022-11-02T19:37:55Z)
Leachable Component Clustering [10.377914682543903]
In this work, a novel approach to clustering of incomplete data, termed leachable component clustering, is proposed. The proposed method handles data imputation with Bayes alignment, and collects the lost patterns in theory. Experiments on several artificial incomplete data sets demonstrate that, the proposed method is able to present superior performance compared with other state-of-the-art algorithms.
arXiv Detail & Related papers (2022-08-28T13:13:17Z)
Variational Distillation for Multi-View Learning [104.17551354374821]
We design several variational information bottlenecks to exploit two key characteristics for multi-view representation learning. Under rigorously theoretical guarantee, our approach enables IB to grasp the intrinsic correlation between observations and semantic labels.
arXiv Detail & Related papers (2022-06-20T03:09:46Z)
Leveraging Ensembles and Self-Supervised Learning for Fully-Unsupervised Person Re-Identification and Text Authorship Attribution [77.85461690214551]
Learning from fully-unlabeled data is challenging in Multimedia Forensics problems, such as Person Re-Identification and Text Authorship Attribution. Recent self-supervised learning methods have shown to be effective when dealing with fully-unlabeled data in cases where the underlying classes have significant semantic differences. We propose a strategy to tackle Person Re-Identification and Text Authorship Attribution by enabling learning from unlabeled data even when samples from different classes are not prominently diverse.
arXiv Detail & Related papers (2022-02-07T13:08:11Z)
Discriminative Supervised Subspace Learning for Cross-modal Retrieval [16.035973055257642]
We propose a discriminative supervised subspace learning for cross-modal retrieval(DS2L) Specifically, we first construct a shared semantic graph to preserve the semantic structure within each modality. We then introduce the Hilbert-Schmidt Independence Criterion(HSIC) to preserve the consistence between feature-similarity and semantic-similarity of samples.
arXiv Detail & Related papers (2022-01-26T14:27:39Z)
Learning Conditional Invariance through Cycle Consistency [60.85059977904014]
We propose a novel approach to identify meaningful and independent factors of variation in a dataset. Our method involves two separate latent subspaces for the target property and the remaining input information. We demonstrate on synthetic and molecular data that our approach identifies more meaningful factors which lead to sparser and more interpretable models.
arXiv Detail & Related papers (2021-11-25T17:33:12Z)
Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic and training scheme for semantic segmentation. By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes. Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
Integrating Auxiliary Information in Self-supervised Learning [94.11964997622435]
We first observe that the auxiliary information may bring us useful information about data structures. We present to construct data clusters according to the auxiliary information. We show that Cl-InfoNCE may be a better approach to leverage the data clustering information.
arXiv Detail & Related papers (2021-06-05T11:01:15Z)
Contrastive analysis for scatter plot-based representations of dimensionality reduction [0.0]
This paper introduces a methodology to explore multidimensional datasets and interpret clusters' formation. We also introduce a bipartite graph to visually interpret and explore the relationship between the statistical variables used to understand how the attributes influenced cluster formation.
arXiv Detail & Related papers (2021-01-26T01:16:31Z)
Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations. Our framework well preserves the relations between samples. By seeking to embed samples into subspace, we show that our method can address the large-scale and out-of-sample problem.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information [39.87273353895564]
We propose learning discrete structured representations from unlabeled data by maximizing the mutual information between a structured latent variable and a target variable. Our key technical contribution is an adversarial objective that can be used to tractably estimate mutual information assuming only the feasibility of cross entropy calculation. We apply our model on document hashing and show that it outperforms current best baselines based on discrete and vector quantized variational autoencoders.
arXiv Detail & Related papers (2020-04-08T13:31:53Z)
Learning Unbiased Representations via Mutual Information Backpropagation [36.383338079229695]
In particular, we face the case where some attributes (bias) of the data, if learned by the model, can severely compromise its generalization properties. We propose a novel end-to-end optimization strategy, which simultaneously estimates and minimizes the mutual information between the learned representation and the data attributes.
arXiv Detail & Related papers (2020-03-13T18:06:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.