Extreme Learning Machines for Exoplanet Simulations: A Faster, Lightweight Alternative to Deep Learning
- URL: http://arxiv.org/abs/2506.19679v1
- Date: Tue, 24 Jun 2025 14:46:23 GMT
- Title: Extreme Learning Machines for Exoplanet Simulations: A Faster, Lightweight Alternative to Deep Learning
- Authors: Tara P. A. Tahseen, Luís F. Simões, Kai Hou Yip, Nikolaos Nikolaou, João M. Mendonça, Ingo P. Waldmann
- Abstract summary: Extreme Learning Machine (ELM) is a lightweight, non-gradient-based machine learning algorithm for accelerating complex physical models.
We evaluate ELM surrogate models in two test cases with different data structures.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Increasing resolution and coverage of astrophysical and climate data necessitates increasingly sophisticated models, often pushing the limits of computational feasibility. While emulation methods can reduce calculation costs, the neural architectures typically used, optimised via gradient descent, are themselves computationally expensive to train, particularly in terms of data generation requirements. This paper investigates the utility of the Extreme Learning Machine (ELM) as a lightweight, non-gradient-based machine learning algorithm for accelerating complex physical models. We evaluate ELM surrogate models in two test cases with different data structures: (i) sequentially-structured data, and (ii) image-structured data. For test case (i), where the number of samples $N$ far exceeds the dimensionality of the input data $d$ ($N \gg d$), ELMs achieve remarkable efficiency, offering a 100,000$\times$ faster training time and a 40$\times$ faster prediction speed compared to a Bi-Directional Recurrent Neural Network (BIRNN), whilst improving upon BIRNN test performance. For test case (ii), characterised by $d \gg N$ and image-based inputs, a single ELM was insufficient, but an ensemble of 50 individual ELM predictors achieves comparable accuracy to a benchmark Convolutional Neural Network (CNN), with a 16.4$\times$ reduction in training time, though costing a 6.9$\times$ increase in prediction time. We find different sample efficiency characteristics between the test cases: in test case (i) individual ELMs demonstrate superior sample efficiency, requiring only 0.28% of the training dataset compared to the benchmark BIRNN, while in test case (ii) the ensemble approach requires 78% of the data used by the CNN to achieve comparable results, representing a trade-off between sample efficiency and model complexity.
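The abstract does not spell out how an ELM is trained, but in the standard formulation the hidden-layer weights are drawn at random and only the linear output weights are fitted, in closed form via a regularised least-squares solve, which is what makes training so much cheaper than gradient descent. The NumPy sketch below illustrates that idea, plus a simple averaging ensemble in the spirit of the 50-predictor ensemble used for test case (ii); the hidden width, tanh activation, ridge term, and the use of plain averaging are illustrative assumptions, not values or choices taken from the paper.

```python
import numpy as np

def fit_elm(X, Y, n_hidden=512, reg=1e-6, seed=0):
    """Fit a single-hidden-layer ELM: random hidden weights, closed-form output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input-to-hidden weights, never updated
    b = rng.normal(size=n_hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                        # hidden activations
    # Ridge-regularised least squares for the output weights.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

def fit_elm_ensemble(X, Y, n_models=50, **kwargs):
    """Independent ELMs that differ only in their random hidden weights."""
    return [fit_elm(X, Y, seed=s, **kwargs) for s in range(n_models)]

def predict_elm_ensemble(X, models):
    """Average the member predictions (one possible combination rule)."""
    return np.mean([predict_elm(X, W, b, beta) for W, b, beta in models], axis=0)
```

Because the only fitted parameters are the output weights, training cost is dominated by a single linear solve per model, which is consistent with the large training-time speed-ups reported above.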
Related papers
- Efficient dataset construction using active learning and uncertainty-aware neural networks for plasma turbulent transport surrogate models [0.0]
This work demonstrates a proof-of-principle for using uncertainty-aware architectures to construct efficient datasets for surrogate model generation.
This strategy was applied to the plasma turbulent transport problem within tokamak fusion plasmas, specifically the QuaLiKiz quasilinear electrostatic gyrokinetic turbulent transport code.
With 45 active learning iterations, moving from a small initial training set of $10^2$ to a final set of $10^4$, the resulting models reached an $F_1$ classification performance of 0.8 and an $R^2$ regression performance of 0.75 on an independent test set.
arXiv Detail & Related papers (2025-07-21T18:15:12Z) - Data Uniformity Improves Training Efficiency and More, with a Convergence Framework Beyond the NTK Regime [9.749891245059596]
We demonstrate that selecting more uniformly distributed data can improve training efficiency while enhancing performance.
Specifically, we establish that a more uniform (less biased) distribution leads to a larger minimum pairwise distance between data points, $h_{\min}$.
We theoretically show that the approximation error of neural networks decreases as $h_{\min}$ increases (a minimal computation of this quantity is sketched after the list below).
arXiv Detail & Related papers (2025-06-30T17:58:30Z) - MesaNet: Sequence Modeling by Locally Optimal Test-Time Training [67.45211108321203]
We introduce a numerically stable, chunkwise parallelizable version of the recently proposed Mesa layer.
We show that optimal test-time training enables reaching lower language modeling perplexity and higher downstream benchmark performance than previous RNNs.
arXiv Detail & Related papers (2025-06-05T16:50:23Z) - An Investigation on Machine Learning Predictive Accuracy Improvement and Uncertainty Reduction using VAE-based Data Augmentation [2.517043342442487]
Deep generative learning uses certain ML models to learn the underlying distribution of existing data and generate synthetic samples that resemble the real data.
In this study, our objective is to evaluate the effectiveness of data augmentation using variational autoencoder (VAE)-based deep generative models.
We investigated whether the data augmentation leads to improved accuracy in the predictions of a deep neural network (DNN) model trained using the augmented data.
arXiv Detail & Related papers (2024-10-24T18:15:48Z) - Data-Augmented Predictive Deep Neural Network: Enhancing the extrapolation capabilities of non-intrusive surrogate models [0.5735035463793009]
We propose a new deep learning framework, where kernel dynamic mode decomposition (KDMD) is employed to evolve the dynamics of the latent space generated by the encoder part of a convolutional autoencoder (CAE).
After adding the KDMD-decoder-extrapolated data into the original data set, we train the CAE along with a feed-forward deep neural network using the augmented data.
The trained network can predict future states outside the training time interval at any out-of-training parameter samples.
arXiv Detail & Related papers (2024-10-17T09:26:14Z) - Transformer neural networks and quantum simulators: a hybrid approach for simulating strongly correlated systems [1.6494451064539348]
We present a hybrid optimization scheme for neural quantum states (NQS).
By using both projective measurements from the computational basis as well as expectation values from other measurement configurations, our pretraining gives access to the sign structure of the state.
Our work paves the way for a reliable and efficient optimization of neural quantum states.
arXiv Detail & Related papers (2024-05-31T17:55:27Z) - A Lightweight Measure of Classification Difficulty from Application Dataset Characteristics [4.220363193932374]
We propose an efficient cosine similarity-based classification difficulty measure S.
It is calculated from the number of classes and intra- and inter-class similarity metrics of the dataset.
We show how a practitioner can use this measure to help select an efficient model 6 to 29x faster than through repeated training and testing.
arXiv Detail & Related papers (2024-04-09T03:27:09Z) - Gradual Optimization Learning for Conformational Energy Minimization [69.36925478047682]
Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks significantly reduces the required additional data.
Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules.
arXiv Detail & Related papers (2023-11-05T11:48:08Z) - Continuous time recurrent neural networks: overview and application to forecasting blood glucose in the intensive care unit [56.801856519460465]
Continuous time autoregressive recurrent neural networks (CTRNNs) are deep learning models that account for irregular observations.
We demonstrate the application of these models to probabilistic forecasting of blood glucose in a critical care setting.
arXiv Detail & Related papers (2023-04-14T09:39:06Z) - Qualitative Data Augmentation for Performance Prediction in VLSI circuits [2.1227526213206542]
This work proposes generating and evaluating artificial data using generative adversarial networks (GANs) for circuit data.
The training data is obtained from various simulations in the Cadence Virtuoso, HSPICE, and Microcap design environments with TSMC 180nm and 22nm CMOS technology nodes.
The experimental results show that the proposed artificial data generation significantly improves ML models, reducing the percentage error by more than 50% relative to the original models.
arXiv Detail & Related papers (2023-02-15T10:14:12Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - AutoSimulate: (Quickly) Learning Synthetic Data Generation [70.82315853981838]
We propose an efficient alternative for optimal synthetic data generation based on a novel differentiable approximation of the objective.
We demonstrate that the proposed method finds the optimal data distribution faster (up to $50\times$), with significantly reduced training data generation (up to $30\times$) and better accuracy ($+8.7\%$) on real-world test datasets than previous methods.
arXiv Detail & Related papers (2020-08-16T11:36:11Z)
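For the data-uniformity entry above, the key quantity is the minimum pairwise distance between training points, $h_{\min}$, which that summary links to the network's approximation error. The snippet below is a minimal way to compute it, assuming Euclidean distance on row-vector samples; the function name and metric choice are illustrative, not taken from that paper.

```python
import numpy as np
from scipy.spatial.distance import pdist

def min_pairwise_distance(X):
    """h_min: smallest Euclidean distance between any two rows of X (n_samples, n_features)."""
    return pdist(X, metric="euclidean").min()
```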