Reconstruction of Pairwise Interactions using Energy-Based Models
- URL: http://arxiv.org/abs/2012.06625v1
- Date: Fri, 11 Dec 2020 20:15:10 GMT
- Title: Reconstruction of Pairwise Interactions using Energy-Based Models
- Authors: Christoph Feinauer, Carlo Lucibello
- Abstract summary: We show that hybrid models, which combine a pairwise model and a neural network, can lead to significant improvements in the reconstruction of pairwise interactions.
This is in line with the general idea that simple interpretable models and complex black-box models are not necessarily a dichotomy.
- Score: 3.553493344868414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pairwise models like the Ising model or the generalized Potts model have
found many successful applications in fields like physics, biology, and
economics. Closely connected is the problem of inverse statistical mechanics,
where the goal is to infer the parameters of such models given observed data.
An open problem in this field is the question of how to train these models in
the case where the data contain additional higher-order interactions that are
not present in the pairwise model. In this work, we propose an approach based
on Energy-Based Models and pseudolikelihood maximization to address these
complications: we show that hybrid models, which combine a pairwise model and a
neural network, can lead to significant improvements in the reconstruction of
pairwise interactions. We show these improvements to hold consistently when
compared to a standard approach using only the pairwise model and to an
approach using only a neural network. This is in line with the general idea
that simple interpretable models and complex black-box models are not
necessarily a dichotomy: interpolating between these two classes of models makes it
possible to retain some advantages of both.
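To make the hybrid construction concrete, below is a minimal sketch of a pairwise-plus-neural-network energy trained by pseudolikelihood maximization. Binary (Ising) variables, the network architecture, and all sizes are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridEBM(nn.Module):
    """Pairwise Ising energy plus a small MLP correction (illustrative)."""
    def __init__(self, n, hidden=64):
        super().__init__()
        self.J = nn.Parameter(torch.zeros(n, n))   # pairwise couplings
        self.h = nn.Parameter(torch.zeros(n))      # local fields
        self.mlp = nn.Sequential(                  # higher-order correction term
            nn.Linear(n, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def energy(self, x):                           # x: (batch, n) in {-1, +1}
        Jsym = 0.5 * (self.J + self.J.t())
        Jsym = Jsym - torch.diag(torch.diag(Jsym)) # no self-couplings
        e_pair = -0.5 * torch.einsum('bi,ij,bj->b', x, Jsym, x) - x @ self.h
        return e_pair + self.mlp(x).squeeze(-1)

def pseudolikelihood(model, x):
    """Negative log-pseudolikelihood: sum over sites of -log P(x_i | x_rest).
    For binary spins each conditional needs only two energy evaluations."""
    nll = 0.0
    for i in range(x.shape[1]):
        x_up, x_dn = x.clone(), x.clone()
        x_up[:, i], x_dn[:, i] = 1.0, -1.0
        e_up, e_dn = model.energy(x_up), model.energy(x_dn)
        # P(x_i = +1 | rest) = sigmoid(E_down - E_up)
        nll = nll - torch.where(x[:, i] > 0,
                                F.logsigmoid(e_dn - e_up),
                                F.logsigmoid(e_up - e_dn)).mean()
    return nll

# toy usage: random +/-1 data, one optimization step
x = torch.randint(0, 2, (32, 10)).float() * 2 - 1
model = HybridEBM(n=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
opt.zero_grad()
pseudolikelihood(model, x).backward()
opt.step()
```

After training, the couplings in `J` would be read off as the reconstructed pairwise interactions, with the network absorbing the higher-order contributions.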
Related papers
- Exploring Model Kinship for Merging Large Language Models [52.01652098827454]
We introduce model kinship, the degree of similarity or relatedness between Large Language Models.
We find that model kinship correlates with the performance gains obtained after model merging.
We propose a new model merging strategy: Top-k Greedy Merging with Model Kinship, which can yield better performance on benchmark datasets.
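One plausible reading of such a kinship measure (the paper's exact metric may differ) is the cosine similarity between the models' fine-tuning deltas relative to a shared base checkpoint, as in this sketch:

```python
import torch
import torch.nn.functional as F

def model_kinship(sd_a, sd_b, sd_base):
    """Cosine similarity between two models' fine-tuning deltas relative
    to a common base checkpoint (illustrative definition)."""
    da = torch.cat([(sd_a[k] - sd_base[k]).flatten() for k in sd_base])
    db = torch.cat([(sd_b[k] - sd_base[k]).flatten() for k in sd_base])
    return F.cosine_similarity(da, db, dim=0)

# toy usage with one-tensor "checkpoints"
base = {"w": torch.zeros(4)}
a = {"w": torch.tensor([1.0, 0.0, 0.0, 0.0])}
b = {"w": torch.tensor([1.0, 1.0, 0.0, 0.0])}
print(model_kinship(a, b, base))  # ~0.707
```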
arXiv Detail & Related papers (2024-10-16T14:29:29Z)
- Learnable & Interpretable Model Combination in Dynamic Systems Modeling [0.0]
We discuss which types of models are usually combined and propose a model interface capable of expressing a variety of mixed equation-based models.
We propose a new wildcard topology that can describe the generic connection between two combined models in an easy-to-interpret fashion.
The contributions are demonstrated in a proof of concept: different connection topologies between two models are learned, interpreted, and compared.
arXiv Detail & Related papers (2024-06-12T11:17:11Z)
- Inferring effective couplings with Restricted Boltzmann Machines [3.150368120416908]
Generative models encode the correlations observed in the data in the Boltzmann weight associated with an energy function, which takes the form of a neural network.
We propose a solution by implementing a direct mapping between the Restricted Boltzmann Machine and an effective Ising spin Hamiltonian.
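One standard way to realize such a mapping (shown here as a sketch; the paper's construction may differ) is to trace out the binary hidden units exactly and read pairwise couplings off the resulting effective energy by finite differences:

```python
import numpy as np

def effective_energy(v, a, b, W):
    """RBM energy of visible configuration v in {0,1}^n after summing out
    binary hidden units: E(v) = -a.v - sum_mu softplus(b_mu + (W^T v)_mu)."""
    return -v @ a - np.sum(np.logaddexp(0.0, b + v @ W))

def effective_couplings(a, b, W):
    """Pairwise couplings J_ij between visible units, read off by finite
    differences of the effective energy around the all-zero state."""
    n = len(a)
    J = np.zeros((n, n))
    e0 = effective_energy(np.zeros(n), a, b, W)
    e1 = [effective_energy(np.eye(n)[i], a, b, W) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            vij = np.zeros(n); vij[i] = vij[j] = 1.0
            J[i, j] = J[j, i] = -(effective_energy(vij, a, b, W)
                                  - e1[i] - e1[j] + e0)
    return J

# toy usage on a random RBM with 6 visible and 4 hidden units
rng = np.random.default_rng(0)
J = effective_couplings(rng.normal(size=6), rng.normal(size=4),
                        0.3 * rng.normal(size=(6, 4)))
```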
arXiv Detail & Related papers (2023-09-05T14:55:09Z)
- Understanding Parameter Sharing in Transformers [53.75988363281843]
Previous work on Transformers has focused on sharing parameters in different layers, which can improve the performance of models with limited parameters by increasing model depth.
We show that the success of this approach can be largely attributed to better convergence, with only a small part due to the increased model complexity.
Experiments on 8 machine translation tasks show that our model achieves competitive performance with only half the model complexity of parameter sharing models.
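The mechanism itself is simple to sketch: one layer's parameters are reused across the whole depth, so compute grows with depth while the parameter count stays at one layer (sizes here are assumptions for illustration):

```python
import torch
import torch.nn as nn

class SharedLayerEncoder(nn.Module):
    """Transformer encoder that applies one layer repeatedly: depth-times
    the compute, a single layer's worth of weights."""
    def __init__(self, d_model=256, nhead=4, depth=6):
        super().__init__()
        self.layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.depth = depth

    def forward(self, x):                 # x: (batch, seq, d_model)
        for _ in range(self.depth):       # same weights at every depth step
            x = self.layer(x)
        return x

y = SharedLayerEncoder()(torch.randn(2, 10, 256))  # toy forward pass
```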
arXiv Detail & Related papers (2023-06-15T10:48:59Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models, but the resulting models are trained separately and their training data often cannot be shared.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
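As a baseline illustration of parameter-space merging (plain weighted averaging; the paper's actual merging rule is more involved):

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Merge checkpoints with identical architecture by (weighted)
    averaging of their parameters -- no training data required."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    return {k: sum(w * sd[k].float() for w, sd in zip(weights, state_dicts))
            for k in state_dicts[0]}

# toy usage with two one-tensor "models"
a = {"w": torch.tensor([1.0, 2.0])}
b = {"w": torch.tensor([3.0, 4.0])}
print(merge_state_dicts([a, b]))  # {'w': tensor([2., 3.])}
```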
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more interesting to understand a model's properties and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z)
- Towards a Better Understanding of Linear Models for Recommendation [28.422943262159933]
We derive and analyze the closed-form solutions for two basic regression and matrix factorization approaches.
We introduce a new learning algorithm for searching the (hyper)parameters of the closed-form solution.
The experimental results demonstrate that the basic models and their closed-form solutions are indeed quite competitive against the state-of-the-art models.
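For intuition, the kind of closed-form solution at play can be sketched with an item-item ridge regression on the interaction matrix (an assumed illustrative estimator, not necessarily the paper's exact one):

```python
import numpy as np

def ridge_item_weights(X, lam=10.0):
    """Closed-form solution of min_W ||X - X W||_F^2 + lam ||W||_F^2,
    i.e. W = (X^T X + lam I)^{-1} X^T X, for a users-by-items matrix X."""
    G = X.T @ X
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), G)

# toy usage: 4 users x 3 items, then score all items for ranking
X = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [1, 0, 0]], dtype=float)
scores = X @ ridge_item_weights(X, lam=1.0)
```

The hyperparameter search mentioned above would then tune quantities such as `lam` directly on this closed form instead of on an iteratively trained model.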
arXiv Detail & Related papers (2021-05-27T04:17:04Z)
- A Twin Neural Model for Uplift [59.38563723706796]
Uplift is a particular case of conditional treatment effect modeling.
We propose a new loss function defined by leveraging a connection with the Bayesian interpretation of the relative risk.
We show our proposed method is competitive with the state of the art in simulated settings and on real data from large-scale randomized experiments.
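A generic twin architecture for uplift can be sketched as follows (illustrative; the paper's relative-risk-based loss is not reproduced here, and each head would ordinarily be fit on its own treatment arm):

```python
import torch
import torch.nn as nn

class TwinUplift(nn.Module):
    """Shared representation with two heads predicting the outcome
    probability under treatment and under control; the predicted uplift
    is their difference."""
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        self.head_t = nn.Linear(hidden, 1)   # P(y=1 | x, treated)
        self.head_c = nn.Linear(hidden, 1)   # P(y=1 | x, control)

    def forward(self, x):
        z = self.shared(x)
        p_t = torch.sigmoid(self.head_t(z)).squeeze(-1)
        p_c = torch.sigmoid(self.head_c(z)).squeeze(-1)
        return p_t - p_c                      # predicted uplift per sample

uplift = TwinUplift(n_features=8)(torch.randn(5, 8))  # toy forward pass
```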
arXiv Detail & Related papers (2021-05-11T16:02:39Z)
- LowFER: Low-rank Bilinear Pooling for Link Prediction [4.110108749051657]
We propose a factorized bilinear pooling model, commonly used in multi-modal learning, for better fusion of entities and relations.
Our model naturally generalizes the Tucker-decomposition-based TuckER model, which has been shown to generalize other models.
We evaluate on real-world datasets, reaching on-par or state-of-the-art performance.
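A sketch of a factorized bilinear pooling score in the MFB style (the factor count, projections, and pooling layout here are assumptions, not the paper's exact construction):

```python
import torch
import torch.nn as nn

class FactorizedBilinearScore(nn.Module):
    """Project head and relation embeddings, multiply elementwise,
    sum-pool over k factors, then score against the tail embedding."""
    def __init__(self, d_e, d_r, k=4):
        super().__init__()
        self.U = nn.Parameter(0.1 * torch.randn(d_e, k * d_e))
        self.V = nn.Parameter(0.1 * torch.randn(d_r, k * d_e))
        self.k, self.d_e = k, d_e

    def forward(self, e_h, e_r, e_t):            # (batch, dim) inputs
        z = (e_h @ self.U) * (e_r @ self.V)      # (batch, k * d_e)
        z = z.view(-1, self.k, self.d_e).sum(1)  # sum-pool the k factors
        return (z * e_t).sum(-1)                 # (batch,) triple scores

score = FactorizedBilinearScore(8, 8)(torch.randn(2, 8),
                                      torch.randn(2, 8),
                                      torch.randn(2, 8))
```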
arXiv Detail & Related papers (2020-08-25T07:33:52Z)
- When Ensembling Smaller Models is More Efficient than Single Large Models [52.38997176317532]
We show that ensembles can outperform single models, achieving higher accuracy while requiring fewer total FLOPs to compute.
This suggests that the output diversity obtained by ensembling can often be a more efficient use of compute than training larger models.
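Output-level ensembling itself is a one-liner to sketch: average the predictive distributions of several small models and compare the total cost against one large model.

```python
import torch
import torch.nn as nn

def ensemble_predict(models, x):
    """Average the predictive distributions of several models
    (minimal sketch of output-level ensembling)."""
    with torch.no_grad():
        probs = torch.stack([m(x).softmax(dim=-1) for m in models])
    return probs.mean(dim=0)

# toy usage: three small classifiers on one batch
models = [nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
          for _ in range(3)]
pred = ensemble_predict(models, torch.randn(8, 16)).argmax(dim=-1)
```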
arXiv Detail & Related papers (2020-05-01T18:56:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.