Learning with Hidden Factorial Structure
- URL: http://arxiv.org/abs/2411.01375v3
- Date: Sun, 02 Feb 2025 22:25:57 GMT
- Title: Learning with Hidden Factorial Structure
- Authors: Charles Arnal, Clement Berenfeld, Simon Rosenberg, Vivien Cabannes
- Abstract summary: Recent advances suggest that text and image data contain such hidden structures, which help mitigate the curse of dimensionality.
We present a controlled experimental framework to test whether neural networks can indeed exploit such "hidden factorial structures".
- Score: 2.474908349649168
- Abstract: Statistical learning in high-dimensional spaces is challenging without a strong underlying data structure. Recent advances with foundational models suggest that text and image data contain such hidden structures, which help mitigate the curse of dimensionality. Inspired by results from nonparametric statistics, we hypothesize that this phenomenon can be partially explained in terms of decomposition of complex tasks into simpler subtasks. In this paper, we present a controlled experimental framework to test whether neural networks can indeed exploit such "hidden factorial structures". We find that they do leverage these latent patterns to learn discrete distributions more efficiently. We also study the interplay between our structural assumptions and the models' capacity for generalization.
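To make the hypothesized mechanism concrete, here is a minimal sketch of why a hidden factorization helps; the numpy setup below (three independent 8-outcome groups, 2000 samples) is illustrative and not taken from the paper.

```python
# Minimal sketch (illustrative setup, not the paper's experiments):
# a distribution over 8**3 = 512 outcomes that secretly factorizes
# into three independent 8-outcome groups. Estimating each factor
# separately and multiplying is far more sample-efficient than
# estimating the full joint, which is the intuition behind hidden
# factorial structure mitigating the curse of dimensionality.
import numpy as np

rng = np.random.default_rng(0)

# One random distribution over 8 symbols per hidden group.
factors = [rng.dirichlet(np.ones(8)) for _ in range(3)]

n = 2000
samples = np.stack([rng.choice(8, size=n, p=p) for p in factors], axis=1)
codes = samples[:, 0] * 64 + samples[:, 1] * 8 + samples[:, 2]  # 0..511

def tv(p, q):
    """Total-variation distance between two distributions."""
    return 0.5 * np.abs(p - q).sum()

true_joint = np.einsum("i,j,k->ijk", *factors).ravel()

# Naive estimator: one histogram over all 512 joint outcomes.
joint_hat = np.bincount(codes, minlength=512) / n

# Structure-aware estimator: histogram each factor, then multiply.
factor_hats = [np.bincount(samples[:, g], minlength=8) / n for g in range(3)]
fact_joint = np.einsum("i,j,k->ijk", *factor_hats).ravel()

print("TV error, naive joint estimate:", tv(true_joint, joint_hat))
print("TV error, factorized estimate: ", tv(true_joint, fact_joint))
```

On a run like this the factorized estimate is typically several times closer in total variation; that sample-efficiency gap is the kind of effect the paper's experiments probe in neural networks.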
Related papers
- Shallow diffusion networks provably learn hidden low-dimensional structure [17.563546018565468]
Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution.
We show that these models provably adapt to simple forms of low-dimensional structure, thereby avoiding the curse of dimensionality.
We combine our results with recent analyses of sampling with diffusion models to provide an end-to-end sample complexity bound for learning to sample from structured distributions.
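The snippet below illustrates only the structural premise (data concentrating near a low-dimensional subspace of a high-dimensional ambient space), not the paper's diffusion analysis; all dimensions and noise levels are arbitrary choices for the demonstration.

```python
# Structural premise only (not the paper's result): high ambient
# dimension, but the data lie near a 5-dimensional linear subspace,
# so the singular-value spectrum collapses after the first few values.
import numpy as np

rng = np.random.default_rng(0)
ambient_dim, latent_dim, n = 200, 5, 1000

# Random orthonormal basis for the hidden subspace.
basis = np.linalg.qr(rng.standard_normal((ambient_dim, latent_dim)))[0]
latent = rng.standard_normal((n, latent_dim))
data = latent @ basis.T + 0.01 * rng.standard_normal((n, ambient_dim))

singular_values = np.linalg.svd(data, compute_uv=False)
print("top 8 singular values:", np.round(singular_values[:8], 2))
# The first latent_dim values dominate; the rest sit at noise level.
```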
arXiv Detail & Related papers (2024-10-15T04:55:56Z)
- Hardness of Learning Neural Networks under the Manifold Hypothesis [3.2635082758250693]
The manifold hypothesis presumes that high-dimensional data lies on or near a low-dimensional manifold.
We investigate the hardness of learning under the manifold hypothesis.
We show that additional assumptions on the volume of the data manifold alleviate these fundamental limitations.
arXiv Detail & Related papers (2024-06-03T15:50:32Z)
- ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models [65.82630283336051]
We show that the space spanned by the combination of dimensions and attributes is insufficiently sampled by the existing training schemes of diffusion generative models.
We present a simple fix to this problem by constructing processes that fully exploit the structures, hence the name ComboStoc.
arXiv Detail & Related papers (2024-05-22T15:23:10Z)
- Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure [66.33623392497599]
We show that a structure called the template-content structure (T-C structure) can reduce the size of the possible space from exponential to linear.
We demonstrate that models can achieve task composition, further reducing the space to be learned from linear to logarithmic.
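A back-of-the-envelope count (our reading of the claim, not the paper's exact formalism) makes the reduction concrete:

```python
# Hypothetical numbers for illustration only. Without structure, an
# answer of length L over a vocabulary of size V can be any of V**L
# sequences; if every answer instead follows one of T templates whose
# S slots are filled from C candidate contents each, only about
# T + S * C pieces need to be learned.
V, L = 50, 20        # vocabulary size, answer length
T, S, C = 10, 4, 50  # templates, slots per template, contents per slot

print(f"unstructured search space: {V ** L:.2e} sequences")
print(f"template-content pieces:   {T + S * C}")
```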
arXiv Detail & Related papers (2023-10-09T06:57:45Z)
- Homological Convolutional Neural Networks [4.615338063719135]
We propose a novel deep learning architecture that exploits the data structural organization through topologically constrained network representations.
We test our model on 18 benchmark datasets against 5 classic machine learning and 3 deep learning models.
arXiv Detail & Related papers (2023-08-26T08:48:51Z)
- Discrete Latent Structure in Neural Networks [32.41642110537956]
This text explores three broad strategies for learning with discrete latent structure.
We show how most consist of the same small set of fundamental building blocks, but use them differently, leading to substantially different applicability and properties.
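As a generic illustration of one widely used building block for discrete latent variables (a standard technique, not necessarily one of the text's three strategies), here is a straight-through Gumbel-softmax relaxation in PyTorch:

```python
# Straight-through Gumbel-softmax: the forward pass makes a hard
# one-hot choice, while gradients flow through the soft relaxation.
import torch
import torch.nn.functional as F

logits = torch.randn(4, 10, requires_grad=True)  # scores over 10 codes

one_hot = F.gumbel_softmax(logits, tau=0.5, hard=True)  # hard samples
codebook = torch.randn(10, 16)  # one embedding per discrete code
latent = one_hot @ codebook     # differentiable discrete selection

latent.sum().backward()
print("gradient reaches the logits:", logits.grad.abs().sum().item() > 0)
```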
arXiv Detail & Related papers (2023-01-18T12:30:44Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, offers partial interpretability, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- On Neural Architecture Inductive Biases for Relational Tasks [76.18938462270503]
We introduce a simple architecture based on similarity-distribution scores, which we name the Compositional Relational Network (CoRelNet).
We find that simple architectural choices can outperform existing models in out-of-distribution generalization.
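A minimal numpy sketch of the similarity-distribution idea as we read it from this summary; the cosine similarity, row softmax, and readout below are our assumptions rather than the paper's exact architecture.

```python
# Sketch: relational decisions read a normalized pairwise-similarity
# matrix of the objects, never the raw object features themselves.
import numpy as np

def similarity_distribution(objects):
    """Row-softmaxed cosine-similarity matrix for a set of objects."""
    z = objects / np.linalg.norm(objects, axis=1, keepdims=True)
    sims = z @ z.T
    exp = np.exp(sims - sims.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
a = rng.standard_normal(8)
same_pair = np.stack([a, a + 0.01 * rng.standard_normal(8)])
diff_pair = rng.standard_normal((2, 8))

# A same/different relational task can read off the off-diagonal mass.
print("same pair:     ", similarity_distribution(same_pair)[0, 1])
print("different pair:", similarity_distribution(diff_pair)[0, 1])
```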
arXiv Detail & Related papers (2022-06-09T16:24:01Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By embedding samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)