Related papers: Guiding Application Users via Estimation of Computational Resources for Massively Parallel Chemistry Computations

URL: http://arxiv.org/abs/2509.20667v1
Date: Thu, 25 Sep 2025 02:00:36 GMT
Title: Guiding Application Users via Estimation of Computational Resources for Massively Parallel Chemistry Computations
Authors: Tanzila Tabassum, Omer Subasi, Ajay Panyala, Epiya Ebiapia, Gerald Baumgartner, Erdal Mutlu, P. Sadayappan, Karol Kowalski
Abstract summary: We develop machine learning strategies to guide application users before they commit to running expensive experiments on a supercomputer. By predicting application execution time, we determine the optimal runtime parameter values such as number of nodes and tile sizes.
Score: 0.39728489102666065
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: In this work, we develop machine learning (ML) based strategies to predict resources (costs) required for massively parallel chemistry computations, such as coupled-cluster methods, to guide application users before they commit to running expensive experiments on a supercomputer. By predicting application execution time, we determine the optimal runtime parameter values such as number of nodes and tile sizes. Two key questions of interest to users are addressed. The first is the shortest-time question, where the user is interested in knowing the parameter configurations (number of nodes and tile sizes) to achieve the shortest execution time for a given problem size and a target supercomputer. The second is the cheapest-run question in which the user is interested in minimizing resource usage, i.e., finding the number of nodes and tile size that minimizes the number of node-hours for a given problem size. We evaluate a rich family of ML models and strategies, developed based on the collections of runtime parameter values for the CCSD (Coupled Cluster with Singles and Doubles) application executed on the Department of Energy (DOE) Frontier and Aurora supercomputers. Our experiments show that when predicting the total execution time of a CCSD iteration, a Gradient Boosting (GB) ML model achieves a Mean Absolute Percentage Error (MAPE) of 0.023 and 0.073 for Aurora and Frontier, respectively. In the case where it is expensive to run experiments just to collect data points, we show that active learning can achieve a MAPE of about 0.2 with just around 450 experiments collected from Aurora and Frontier.
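The two user questions in the abstract can be illustrated with a minimal sketch. It assumes a runtime predictor is already available; `toy_predict_seconds` below is a hypothetical stand-in with made-up coefficients, not the paper's Gradient Boosting model, and the candidate node counts and tile sizes are likewise illustrative. The sketch searches a small grid for the shortest-time and cheapest-run configurations and shows how MAPE is computed as a fraction.

```python
from itertools import product


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.023 means 2.3%)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def toy_predict_seconds(nodes, tile):
    """Hypothetical stand-in for a trained runtime model (not from the paper)."""
    compute = 36000.0 / nodes          # parallel work shrinks with node count
    comm = 2.0 * nodes                 # communication overhead grows with node count
    tiling = 0.05 * (tile - 80) ** 2   # penalty for tiles far from an assumed sweet spot
    return compute + comm + tiling


def best_configs(predict, node_options, tile_options):
    """Return (shortest-time config, cheapest-run config) over a parameter grid."""
    grid = list(product(node_options, tile_options))
    # Shortest-time question: minimize predicted execution time.
    fastest = min(grid, key=lambda c: predict(*c))
    # Cheapest-run question: minimize node-hours = nodes * seconds / 3600.
    cheapest = min(grid, key=lambda c: c[0] * predict(*c) / 3600.0)
    return fastest, cheapest


fastest, cheapest = best_configs(
    toy_predict_seconds, [10, 50, 100, 200, 400], [40, 80, 120]
)
print("shortest time (nodes, tile):", fastest)
print("cheapest run  (nodes, tile):", cheapest)
```

With this toy model the two answers differ: adding nodes shortens the run up to a point, while the fewest nodes give the lowest node-hour cost. That trade-off is why the abstract treats the two questions separately.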

Related papers

Kad: A Framework for Proxy-based Test-time Alignment with Knapsack Approximation Deferral [6.949966663998242]
Large language models (LLMs) still require further alignment to adhere to downstream task requirements and stylistic preferences. As LLMs continue to scale in size, the computational cost of alignment procedures increases prohibitively. We propose a novel approach to circumvent these costs via proxy-based test-time alignment.
arXiv Detail & Related papers (2025-10-30T21:38:45Z)
$\texttt{SPECS}$: Faster Test-Time Scaling through Speculative Drafts [55.231201692232894]
$\texttt{SPECS}$ is a latency-aware test-time scaling method inspired by speculative decoding. Our results show that $\texttt{SPECS}$ matches or surpasses beam search accuracy while reducing latency by up to $\sim$19.1%.
arXiv Detail & Related papers (2025-06-15T05:50:05Z)
Value-Based Deep RL Scales Predictably [100.21834069400023]
We show that value-based off-policy RL methods are predictable despite community lore regarding their pathological behavior. We validate our approach using three algorithms: SAC, BRO, and PQL on DeepMind Control, OpenAI gym, and IsaacGym.
arXiv Detail & Related papers (2025-02-06T18:59:47Z)
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs [76.43407125275202]
o1-like models can emulate human-like long-time thinking during inference. This paper presents the first comprehensive study on the prevalent issue of overthinking in these models. We propose strategies to mitigate overthinking, streamlining reasoning processes without compromising accuracy.
arXiv Detail & Related papers (2024-12-30T18:55:12Z)
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters [27.656263126925815]
We study the scaling of inference-time computation in LLMs. We find that in both cases, the effectiveness of different approaches to scaling test-time compute critically varies depending on the difficulty of the prompt.
arXiv Detail & Related papers (2024-08-06T17:35:05Z)
A Training Data Recipe to Accelerate A* Search with Language Models [3.037409201025504]
Large Language Models (LLMs) with search algorithms like A* hold the promise of enhanced reasoning and scalable inference. We empirically disentangle the requirements of the A* search algorithm from the requirements of the LLM to generalise on this task. Our technique reduces the number of iterations required to find the solutions by up to 15x, with a wall-clock speed-up of search up to 5x.
arXiv Detail & Related papers (2024-07-13T19:21:44Z)
Scaling Sparse Fine-Tuning to Large Language Models [67.59697720719672]
Large Language Models (LLMs) are difficult to fully fine-tune due to their sheer number of parameters. We propose SpIEL, a novel sparse finetuning method which maintains an array of parameter indices and the deltas of these parameters relative to their pretrained values. We show that SpIEL is superior to popular parameter-efficient fine-tuning methods like LoRA in terms of performance and comparable in terms of run time.
arXiv Detail & Related papers (2024-01-29T18:43:49Z)
Computationally Budgeted Continual Learning: What Does Matter? [128.0827987414154]
Continual Learning (CL) aims to sequentially train models on streams of incoming data that vary in distribution by preserving previous knowledge while adapting to new data. Current CL literature focuses on restricted access to previously seen data, while imposing no constraints on the computational budget for training. We revisit this problem with a large-scale benchmark and analyze the performance of traditional CL approaches in a compute-constrained setting.
arXiv Detail & Related papers (2023-03-20T14:50:27Z)
Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset [0.15420205433587747]
We present a two-step HPO method as a strategic solution to curbing computational demands and wait times. We present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation.
arXiv Detail & Related papers (2023-02-08T02:38:26Z)
Multipoint-BAX: A New Approach for Efficiently Tuning Particle Accelerator Emittance via Virtual Objectives [47.52324722637079]
We propose a new information-theoretic algorithm, Multipoint-BAX, for black-box optimization on multipoint queries. We use Multipoint-BAX to minimize emittance at the Linac Coherent Light Source (LCLS) and the Facility for Advanced Accelerator Experimental Tests II (FACET-II).
arXiv Detail & Related papers (2022-09-10T04:01:23Z)
CPM-2: Large-scale Cost-effective Pre-trained Language Models [71.59893315671997]
We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference. We introduce knowledge inheritance to accelerate the pre-training process by exploiting existing PLMs instead of training models from scratch. We implement a new inference toolkit, namely InfMoE, for using large-scale PLMs with limited computational resources.
arXiv Detail & Related papers (2021-06-20T15:43:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.