Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
- URL: http://arxiv.org/abs/2406.11249v1
- Date: Mon, 17 Jun 2024 06:20:39 GMT
- Title: Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective
- Authors: Yang Chen, Cong Fang, Zhouchen Lin, Bing Liu
- Abstract summary: We introduce a mathematical model that formalizes relational learning as hypergraph recovery to study pre-training of Foundation Models (FMs).
In our framework, the world is represented as a hypergraph, with data abstracted as random samples from hyperedges. We theoretically examine the feasibility of a Pre-Trained Model (PTM) to recover this hypergraph and analyze the data efficiency in a minimax near-optimal style.
- Score: 60.64922606733441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation Models (FMs) have demonstrated remarkable insights into the relational dynamics of the world, leading to the crucial question: how do these models acquire an understanding of world hybrid relations? Traditional statistical learning, particularly for prediction problems, may overlook the rich and inherently structured information from the data, especially regarding the relationships between objects. We introduce a mathematical model that formalizes relational learning as hypergraph recovery to study pre-training of FMs. In our framework, the world is represented as a hypergraph, with data abstracted as random samples from hyperedges. We theoretically examine the feasibility of a Pre-Trained Model (PTM) to recover this hypergraph and analyze the data efficiency in a minimax near-optimal style. By integrating rich graph theories into the realm of PTMs, our mathematical framework offers powerful tools for an in-depth understanding of pre-training from a unique perspective and can be used under various scenarios. As an example, we extend the framework to entity alignment in multimodal learning.
Related papers
- Two-dimensional Taxonomy for N-ary Knowledge Representation Learning Methods [0.12289361708127876]
This survey provides a comprehensive review of methods handling n-ary relational data, covering both knowledge hypergraphs and hyper-relational knowledge graphs literatures. We propose a two-dimensional taxonomy: the first dimension categorises models based on their methodology, i.e., translation-based models, deep neural network-based models, logic rules-based models, and hyperedge expansion-based models. The second dimension classifies models according to their awareness of entity roles and positions in n-ary relations, dividing them into aware-less, position-aware, and role-aware approaches.
arXiv Detail & Related papers (2025-06-05T22:59:39Z) - Generalization Performance of Hypergraph Neural Networks [21.483543928698676]
We develop margin-based generalization bounds for four representative classes of hypergraph neural networks.
Our results reveal the manner in which hypergraph structure and spectral norms of the learned weights can affect the generalization bounds.
Our empirical study examines the relationship between the practical performance and theoretical bounds of the models over synthetic and real-world datasets.
arXiv Detail & Related papers (2025-01-22T00:20:26Z) - Big Cooperative Learning [7.958840888809145]
We show that the training of foundation models can be interpreted as a form of big cooperative learning.
We propose the BigLearn-GAN, which is a novel adversarially-trained foundation model with versatile data sampling capabilities.
arXiv Detail & Related papers (2024-07-31T03:59:14Z) - Learning Discrete Concepts in Latent Hierarchical Models [73.01229236386148]
Learning concepts from natural high-dimensional data holds potential in building human-aligned and interpretable machine learning models.
We formalize concepts as discrete latent causal variables that are related via a hierarchical causal model.
We substantiate our theoretical claims with synthetic data experiments.
arXiv Detail & Related papers (2024-06-01T18:01:03Z) - Beyond DAGs: A Latent Partial Causal Model for Multimodal Learning [80.44084021062105]
We propose a novel latent partial causal model for multimodal data, featuring two latent coupled variables, connected by an undirected edge, to represent the transfer of knowledge across modalities. Under specific statistical assumptions, we establish an identifiability result, demonstrating that representations learned by multimodal contrastive learning correspond to the latent coupled variables up to a trivial transformation. Experiments show that a pre-trained CLIP model embodies disentangled representations, enabling few-shot learning and improving domain generalization across diverse real-world datasets.
arXiv Detail & Related papers (2024-02-09T07:18:06Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - Generative learning for nonlinear dynamics [7.6146285961466]
Generative machine learning models create realistic outputs far beyond their training data.
These successes suggest that generative models learn to effectively parametrize and sample arbitrarily complex distributions.
We aim to connect these classical works to emerging themes in large-scale generative statistical learning.
arXiv Detail & Related papers (2023-11-07T16:53:56Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - Towards a mathematical understanding of learning from few examples with nonlinear feature maps [68.8204255655161]
We consider the problem of data classification where the training set consists of just a few data points.
We reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities.
arXiv Detail & Related papers (2022-11-07T14:52:58Z) - Learning Differential Operators for Interpretable Time Series Modeling [34.32259687441212]
We propose a learning framework that can automatically obtain interpretable PDE models from sequential data.
Our model can provide valuable interpretability and achieve comparable performance to state-of-the-art models.
arXiv Detail & Related papers (2022-09-03T20:14:31Z) - Bias-inducing geometries: an exactly solvable data model with fairness implications [13.690313475721094]
We introduce an exactly solvable high-dimensional model of data imbalance.
We analytically unpack the typical properties of learning models trained in this synthetic framework.
We obtain exact predictions for the observables that are commonly employed for fairness assessment.
arXiv Detail & Related papers (2022-05-31T16:27:57Z) - Hyperbolic Graph Learning: A Comprehensive Review [56.53820115624101]
This survey paper provides a comprehensive review of the rapidly evolving field of Hyperbolic Graph Learning (HGL). We systematically categorize and analyze existing methods dividing them into (1) hyperbolic graph embedding-based techniques, (2) graph neural network-based hyperbolic models, and (3) emerging paradigms. We extensively discuss diverse applications of HGL across multiple domains, including recommender systems, knowledge graphs, bioinformatics, and other relevant scenarios.
arXiv Detail & Related papers (2022-02-28T15:08:48Z) - PPKE: Knowledge Representation Learning by Path-based Pre-training [43.41597219004598]
We propose a Path-based Pre-training model to learn Knowledge Embeddings, called PPKE.
Our model achieves state-of-the-art results on several benchmark datasets for link prediction and relation prediction tasks.
arXiv Detail & Related papers (2020-12-07T10:29:30Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)