Explore and Exploit with Heterotic Line Bundle Models
- URL: http://arxiv.org/abs/2003.04817v1
- Date: Tue, 10 Mar 2020 15:49:33 GMT
- Title: Explore and Exploit with Heterotic Line Bundle Models
- Authors: Magdalena Larfors and Robin Schneider
- Abstract summary: We use deep reinforcement learning to explore a class of heterotic $SU(5)$ GUT models constructed from line bundle sums.
We perform several experiments where A3C agents are trained to search for such models.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We use deep reinforcement learning to explore a class of heterotic $SU(5)$
GUT models constructed from line bundle sums over Complete Intersection Calabi-Yau
(CICY) manifolds. We perform several experiments where A3C agents are
trained to search for such models. These agents significantly outperform random
exploration, in the most favourable settings by a factor of 1700 when it comes
to finding unique models. Furthermore, we find evidence that the trained agents
also outperform random walkers on new manifolds. We conclude that the agents
detect hidden structures in the compactification data, which is partly of
general nature. The experiments scale well with $h^{(1,1)}$, and may thus
provide the key to model building on CICYs with large $h^{(1,1)}$.
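To make the search loop concrete, here is a minimal sketch (not the paper's implementation) of the kind of environment the abstract describes: a state is the $5 \times h^{(1,1)}$ integer charge matrix of a line bundle sum $V = \mathcal{O}(k_1) \oplus \dots \oplus \mathcal{O}(k_5)$, an action shifts one charge by $\pm 1$, and the reward here encodes only the $c_1(V) = 0$ condition. The paper's full physical checks (anomaly cancellation, slope stability, particle spectrum) are omitted, and all names and bounds are illustrative assumptions.

```python
# Hypothetical gym-style environment for line bundle sum search.
# Only the c1(V) = 0 condition (each charge column sums to zero) is rewarded;
# the paper's remaining physical constraints are not reproduced here.
import numpy as np

class LineBundleEnv:
    def __init__(self, h11=4, qmax=3, rng=None):
        self.h11, self.qmax = h11, qmax
        self.rng = rng or np.random.default_rng(0)

    def reset(self):
        # Random 5 x h11 integer charge matrix as the starting state.
        self.k = self.rng.integers(-self.qmax, self.qmax + 1, size=(5, self.h11))
        return self.k.copy()

    def step(self, action):
        # An action picks (line bundle a, modulus i, +1 or -1).
        a, i, sign = action
        self.k[a, i] = np.clip(self.k[a, i] + sign, -self.qmax, self.qmax)
        satisfied = int((self.k.sum(axis=0) == 0).sum())  # columns with c1 = 0
        done = satisfied == self.h11
        reward = satisfied - self.h11 + (100.0 if done else 0.0)
        return self.k.copy(), reward, done

env = LineBundleEnv()
state = env.reset()
# Random-walker baseline; the paper replaces this policy with a trained A3C agent.
for _ in range(1000):
    action = (env.rng.integers(5), env.rng.integers(env.h11), env.rng.choice([-1, 1]))
    state, reward, done = env.step(action)
    if done:
        break
```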
Related papers
- The Generative Leap: Sharp Sample Complexity for Efficiently Learning Gaussian Multi-Index Models [71.5283441529015]
In this work we consider generic Gaussian multi-index models, in which the labels depend on the (Gaussian) $d$-dimensional inputs only through their projection onto a low-dimensional $r = O_d(1)$ subspace. We introduce the generative leap exponent $k^\star$, a natural extension of the generative exponent from [Damian et al. '24] to the multi-index setting.
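In code, the data model this blurb describes looks roughly like the sketch below; the link function $g$ and all dimensions are illustrative placeholders, not choices from the paper.

```python
# Gaussian multi-index data model: inputs x ~ N(0, I_d), labels depend on x
# only through W^T x for an unknown d x r frame W with r << d.
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 100, 2, 10_000
W, _ = np.linalg.qr(rng.standard_normal((d, r)))  # orthonormal d x r frame

x = rng.standard_normal((n, d))
z = x @ W                               # r-dimensional latent projection
y = np.sign(z[:, 0]) * (z[:, 1] ** 2)   # placeholder link function g(z)
```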
arXiv Detail & Related papers (2025-06-05T18:34:56Z) - APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay [86.01901238059261]
APIGen-MT is a framework that generates verifiable and diverse multi-turn agent data.
We train a family of models -- the xLAM-2-fc-r series with sizes ranging from 1B to 70B parameters.
Our models outperform frontier models such as GPT-4o and Claude 3.5 on $\tau$-bench and BFCL benchmarks.
arXiv Detail & Related papers (2025-04-04T17:13:57Z) - AstroM$^3$: A self-supervised multimodal model for astronomy [0.0]
We propose AstroM$^3$, a self-supervised pre-training approach that enables a model to learn from multiple modalities simultaneously.
Specifically, we extend the CLIP (Contrastive Language-Image Pretraining) model to a trimodal setting, allowing the integration of time-series photometry data, spectra, and astrophysical metadata.
Results demonstrate that CLIP pre-training improves classification performance for time-series photometry, where accuracy increases from 84.6% to 91.5%.
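A hedged sketch of the trimodal extension: one standard construction, consistent with the blurb's description, averages the symmetric CLIP (InfoNCE) loss over the three modality pairs. The linear "encoders" and feature sizes below are placeholders for the paper's real networks.

```python
# Trimodal CLIP objective: average the symmetric InfoNCE loss over the three
# modality pairs (photometry, spectra, metadata).
import torch
import torch.nn.functional as F

def clip_pair_loss(a, b, temperature=0.07):
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.T / temperature
    targets = torch.arange(a.shape[0])
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))

batch, dim = 32, 128
phot = torch.nn.Linear(200, dim)(torch.randn(batch, 200))  # time-series features
spec = torch.nn.Linear(500, dim)(torch.randn(batch, 500))  # spectrum features
meta = torch.nn.Linear(10, dim)(torch.randn(batch, 10))    # metadata features

loss = (clip_pair_loss(phot, spec) + clip_pair_loss(phot, meta)
        + clip_pair_loss(spec, meta)) / 3
loss.backward()
```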
arXiv Detail & Related papers (2024-11-13T18:20:29Z) - xLAM: A Family of Large Action Models to Empower AI Agent Systems [111.5719694445345]
We release xLAM, a series of large action models designed for AI agent tasks.
xLAM consistently delivers exceptional performance across multiple agent ability benchmarks.
arXiv Detail & Related papers (2024-09-05T03:22:22Z) - Machine Learning on generalized Complete Intersection Calabi-Yau Manifolds [16.923362862181445]
Generalized Complete Intersection Calabi-Yau manifolds (gCICYs) are a new construction of Calabi-Yau manifolds.
In this paper, we make progress in this direction using neural networks; see the sketch below.
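The blurb is terse, so the following is only a generic sketch of the supervised setup such papers typically use: a small network trained to predict a binary property from a flattened integer configuration matrix. All data here is a synthetic placeholder, not real gCICY data.

```python
# Placeholder classification setup: integer "configuration matrices" with a
# synthetic binary label standing in for a geometric property.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(2000, 12 * 15)).astype(float)  # flattened matrices
y = (X.sum(axis=1) % 2).astype(int)                         # placeholder label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300, random_state=0)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```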
arXiv Detail & Related papers (2022-09-21T07:30:07Z) - Deconstructing Distributions: A Pointwise Framework of Learning [15.517383696434162]
We study a point's $\textit{profile}$: the relationship between models' average performance on the test distribution and their pointwise performance on this individual point.
We find that profiles can yield new insights into the structure of both models and data, in- and out-of-distribution.
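As a hedged reading of that definition, the sketch below builds a profile for one test point from a models-by-points correctness matrix: models are binned by their overall test accuracy, and the point's accuracy is computed within each bin. The correctness matrix here is random; in practice it would come from a large collection of trained models evaluated on a shared test set.

```python
# Pointwise profile: one test point's accuracy as a function of model skill.
import numpy as np

rng = np.random.default_rng(0)
n_models, n_points = 500, 1000
correct = rng.random((n_models, n_points)) < 0.8    # model x point correctness

avg_acc = correct.mean(axis=1)                      # each model's overall accuracy
bins = np.quantile(avg_acc, np.linspace(0, 1, 11))  # group models into deciles
which = np.digitize(avg_acc, bins[1:-1])

point = 42                                          # profile of one test point
profile = [correct[which == b, point].mean() if (which == b).any() else float("nan")
           for b in range(10)]
print(profile)  # pointwise accuracy vs. overall model accuracy
```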
arXiv Detail & Related papers (2022-02-20T23:25:28Z) - Riemannian Score-Based Generative Modeling [56.20669989459281]
Score-based generative models (SGMs) have demonstrated remarkable empirical performance.
Current SGMs assume that the data is supported on a Euclidean space with flat geometry.
This prevents their use in applications such as robotics, geoscience, or protein modeling, where data often lives on Riemannian manifolds.
arXiv Detail & Related papers (2022-02-06T11:57:39Z) - Residual Overfit Method of Exploration [78.07532520582313]
We propose an approximate exploration methodology based on fitting only two point estimates, one tuned and one overfit.
The approach drives exploration towards actions where the overfit model exhibits the most overfitting compared to the tuned model.
We compare ROME against a set of established contextual bandit methods on three datasets and find it to be one of the best performing.
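On one reading of the blurb (a sketch, not the paper's implementation), the recipe is: per action, fit a well-regularized "tuned" estimate and a deliberately under-regularized "overfit" one, and treat their disagreement as an exploration bonus. The ridge models and alpha values below are placeholder choices.

```python
# Two point estimates per action: tuned (regularized) and overfit
# (near-unregularized); their gap drives exploration.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))  # contexts observed for one action
r = X @ rng.standard_normal(5) + 0.5 * rng.standard_normal(200)  # rewards

tuned = Ridge(alpha=10.0).fit(X, r)    # well-regularized point estimate
overfit = Ridge(alpha=1e-8).fit(X, r)  # deliberately under-regularized

x_new = rng.standard_normal((1, 5))
bonus = abs(overfit.predict(x_new) - tuned.predict(x_new))
score = tuned.predict(x_new) + bonus   # optimistic score used to rank actions
```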
arXiv Detail & Related papers (2021-10-06T17:05:33Z) - Heterotic String Model Building with Monad Bundles and Reinforcement Learning [0.0]
We study heterotic SO(10) GUT models on Calabi-Yau three-folds with monad bundles.
We show that reinforcement learning can be used successfully to explore monad bundles.
arXiv Detail & Related papers (2021-08-16T19:04:19Z) - Exploring Sparse Expert Models and Beyond [51.90860155810848]
Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computational cost.
We propose a simple method called expert prototyping that splits experts into different prototypes and applies $k$ top-$1$ routing.
This strategy improves model quality while maintaining constant computational cost, and our further exploration of extremely large-scale models shows that it is more effective for training larger models; a routing sketch follows this entry.
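A minimal sketch of what "$k$ top-$1$ routing" appears to mean here: the experts are split into $k$ prototype groups, each group runs its own top-1 gate, and the selected experts' outputs are summed. All sizes and gating details below are assumptions.

```python
# Expert prototyping: k independent top-1 routers, one per prototype group.
import torch
import torch.nn.functional as F

d, experts_per_group, k = 64, 4, 2  # k prototype groups of 4 experts each
experts = torch.nn.ModuleList(
    [torch.nn.Linear(d, d) for _ in range(k * experts_per_group)]
)
gates = torch.nn.ModuleList([torch.nn.Linear(d, experts_per_group) for _ in range(k)])

def forward(x):  # x: (batch, d)
    out = torch.zeros_like(x)
    for g in range(k):  # one top-1 routing decision per prototype group
        probs = F.softmax(gates[g](x), dim=-1)
        top_p, top_i = probs.max(dim=-1)  # top-1 expert within this group
        for e in range(experts_per_group):
            mask = top_i == e
            if mask.any():
                out[mask] += top_p[mask, None] * experts[g * experts_per_group + e](x[mask])
    return out

y = forward(torch.randn(8, d))
```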
arXiv Detail & Related papers (2021-05-31T16:12:44Z) - Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering [61.394478670089065]
Generative models for open domain question answering have proven to be competitive, without resorting to external knowledge.
We investigate how much these models can benefit from retrieving text passages, potentially containing evidence.
We observe that the performance of this method significantly improves when increasing the number of retrieved passages.
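The paper's fusion trick (Fusion-in-Decoder) encodes each question-passage pair independently and lets the decoder attend over all of them jointly. The sketch below reproduces that shape with an off-the-shelf T5 checkpoint; the model choice, prompt format, and example passages are illustrative, not the paper's exact setup.

```python
# Fusion-in-Decoder shape: encode each (question, passage) pair separately,
# concatenate the encoder states, and run a single decoding pass over them.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tok = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

question = "Where is the Eiffel Tower?"
passages = ["The Eiffel Tower is a landmark in Paris.",
            "Paris is the capital of France."]

inputs = tok([f"question: {question} context: {p}" for p in passages],
             return_tensors="pt", padding=True)
with torch.no_grad():
    enc = model.encoder(**inputs)  # encode each pair independently

# Fuse: flatten the per-passage states into one long sequence for the decoder.
fused = enc.last_hidden_state.reshape(1, -1, model.config.d_model)
mask = inputs.attention_mask.reshape(1, -1)
out = model.generate(encoder_outputs=BaseModelOutput(last_hidden_state=fused),
                     attention_mask=mask, max_new_tokens=16)
print(tok.decode(out[0], skip_special_tokens=True))
```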
arXiv Detail & Related papers (2020-07-02T17:44:57Z) - Learning Bijective Feature Maps for Linear ICA [73.85904548374575]
We show that existing probabilistic deep generative models (DGMs), which are tailor-made for image data, underperform on non-linear ICA tasks.
To address this, we propose a DGM which combines bijective feature maps with a linear ICA model to learn interpretable latent structures for high-dimensional data.
We create models that converge quickly, are easy to train, and achieve better unsupervised latent factor discovery than flow-based models, linear ICA, and Variational Autoencoders on images.
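As a sketch of just the linear-ICA half of that model, the snippet below unmixes two synthetic sources; scikit-learn's FastICA on raw data stands in for the paper's final linear ICA stage, and the learned bijective (flow) feature map is omitted.

```python
# Linear ICA only: recover independent sources from linear mixtures.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.c_[np.sin(3 * t), np.sign(np.sin(5 * t))]  # two independent sources
A = rng.standard_normal((2, 2))                   # unknown mixing matrix
X = S @ A.T                                       # observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)  # recovered sources, up to scale and order
```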
arXiv Detail & Related papers (2020-02-18T17:58:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.