Quantum Many-Body Physics Calculations with Large Language Models
- URL: http://arxiv.org/abs/2403.03154v2
- Date: Thu, 22 Aug 2024 22:42:40 GMT
- Title: Quantum Many-Body Physics Calculations with Large Language Models
- Authors: Haining Pan, Nayantara Mudur, Will Taranto, Maria Tikhanovskaya, Subhashini Venugopalan, Yasaman Bahri, Michael P. Brenner, Eun-Ah Kim
- Abstract summary: Large language models (LLMs) have demonstrated an unprecedented ability to perform complex tasks in multiple domains.
We focus on a broadly used approximation method in quantum physics: the Hartree-Fock method.
We design multi-step prompt templates that break down the analytic calculation into standardized steps.
We evaluate GPT-4's performance in executing the calculation for 15 research papers from the past decade.
- Score: 7.679615503214482
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Large language models (LLMs) have demonstrated an unprecedented ability to perform complex tasks in multiple domains, including mathematical and scientific reasoning. We demonstrate that with carefully designed prompts, LLMs can accurately carry out key calculations in research papers in theoretical physics. We focus on a broadly used approximation method in quantum physics: the Hartree-Fock method, which requires an analytic multi-step calculation to derive the approximate Hamiltonian and the corresponding self-consistency equations. To carry out the calculations using LLMs, we design multi-step prompt templates that break down the analytic calculation into standardized steps with placeholders for problem-specific information. We evaluate GPT-4's performance in executing the calculation for 15 research papers from the past decade, demonstrating that, with correction of intermediate steps, it correctly derives the final Hartree-Fock Hamiltonian in 13 cases and makes minor errors in 2 cases. Aggregating across all research papers, we find an average score of 87.5 (out of 100) on the execution of individual calculation steps. Overall, the requisite skill for these calculations is at the graduate level in quantum condensed matter theory. We further use LLMs to mitigate the two primary bottlenecks in this evaluation process: (i) extracting information from papers to fill in templates and (ii) automatically scoring the calculation steps, demonstrating good results in both cases. This strong performance is a first step toward developing algorithms that automatically explore theoretical hypotheses at an unprecedented scale.
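As a rough illustration of what such a multi-step prompt template might look like, here is a minimal Python sketch. The step wording, placeholder fields, and example values below are hypothetical assumptions for illustration, not the paper's actual templates.

```python
# A minimal sketch of a multi-step prompt template with placeholders for
# problem-specific information, in the spirit of the paper's templates.
# The step names and placeholder fields are illustrative assumptions,
# not the paper's actual wording.

HARTREE_FOCK_STEPS = [
    # Each step is a reusable template; {...} fields are filled per paper.
    "Step 1: Write down the second-quantized interacting Hamiltonian "
    "for {system}, using the degrees of freedom {dofs}.",
    "Step 2: Apply the Hartree-Fock mean-field decoupling to "
    "{interaction_term}, keeping both Hartree and Fock channels.",
    "Step 3: Collect terms to obtain the quadratic Hartree-Fock "
    "Hamiltonian in the basis {basis}.",
    "Step 4: Derive the self-consistency equations for "
    "{order_parameters}.",
]

def build_prompts(paper_info: dict) -> list[str]:
    """Fill the per-paper placeholders into each step template."""
    return [step.format(**paper_info) for step in HARTREE_FOCK_STEPS]

if __name__ == "__main__":
    # Hypothetical entries for a moire-bilayer-style problem.
    info = {
        "system": "twisted bilayer graphene near the magic angle",
        "dofs": "valley, spin, and sublattice",
        "interaction_term": "the density-density Coulomb interaction",
        "basis": "Bloch states of the flat bands",
        "order_parameters": "the momentum-diagonal density matrix",
    }
    for prompt in build_prompts(info):
        print(prompt, end="\n\n")
```

Because the steps are standardized and only the placeholders change, the same template set can, in principle, be reused across many papers, which is what makes an evaluation over 15 papers tractable.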
Related papers
- Can LLMs Reason Abstractly Over Math Word Problems Without CoT? Disentangling Abstract Formulation From Arithmetic Computation [39.50584137093401]
Final-answer-based metrics are commonly used for evaluating large language models (LLMs) on math word problems. We show that the final-answer accuracy of Llama-3 and Qwen2.5 (1B-32B) without CoT is overwhelmingly bottlenecked by the arithmetic computation step. Mechanistically, we show that these two skills are composed conjunctively even in a single forward pass without any reasoning steps.
arXiv Detail & Related papers (2025-05-29T17:37:57Z) - Program of Equations Thoughts to Solve Algebra Word Problems [0.0]
Program of Equations Thoughts (POET) transforms the task of generating reasoning answers into a two-stage task: predicting equations and generating code. Our method achieves accuracies of 95.3% and 98.0% on the PEN and ALG514 datasets, respectively, setting a new state of the art (SOTA). Zero-shot POET also achieves the SOTA result of 95.5% on the DRAW-1K dataset.
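To make the two-stage split concrete, here is a minimal, hypothetical sketch: stage 1 (equation prediction) is stubbed with a fixed example where an LLM call would go, and stage 2 solves the predicted equations deterministically with SymPy. None of the function names below come from the POET paper.

```python
# Minimal sketch of a two-stage "predict equations, then execute code"
# pipeline in the spirit of POET. The LLM call is stubbed with a fixed
# equation set; in practice stage 1 would come from the model.
import sympy as sp

def predict_equations(problem: str) -> list[str]:
    """Stage 1 (stub): an LLM would map the word problem to equations."""
    # e.g. "Two numbers sum to 10 and differ by 4."
    return ["x + y - 10", "x - y - 4"]

def solve_with_code(equations: list[str]) -> dict:
    """Stage 2: deterministically solve the predicted equations."""
    x, y = sp.symbols("x y")
    exprs = [sp.sympify(eq) for eq in equations]
    (solution,) = sp.solve(exprs, [x, y], dict=True)
    return solution

if __name__ == "__main__":
    eqs = predict_equations("Two numbers sum to 10 and differ by 4.")
    print(solve_with_code(eqs))  # {x: 7, y: 3}
```

The design rationale is that equation prediction exercises the model's language understanding, while the error-prone arithmetic is delegated to an exact solver.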
arXiv Detail & Related papers (2025-05-26T16:12:04Z) - IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently [17.525220958618988]
We introduce the Integrated Gated Calculator (IGC), a module that enables Large Language Models to perform arithmetic by emulating a calculator on the GPU.
We fine-tune a Llama model with our module and test it on the BigBench Arithmetic benchmark, where it beats the state of the art.
Our approach takes only a single iteration to run and requires no external tools.
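The summary above suggests a gating mechanism that routes arithmetic to an exact calculator instead of ordinary next-token prediction. The sketch below is only a conceptual stand-in: the real IGC is a learned module that emulates the calculator inside the network on the GPU, whereas here the gate is a regex and the calculator is plain Python.

```python
# Conceptual sketch of gating between a deterministic calculator and an
# LLM's ordinary output. The real IGC is a learned, in-network module;
# the gate and calculator here are hand-written stand-ins.
import re

ARITHMETIC = re.compile(r"^\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=?\s*$")

def gated_answer(prompt: str, llm_answer: str) -> str:
    match = ARITHMETIC.match(prompt)
    if match:  # gate open: route to the exact calculator
        a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
        result = {"+": a + b, "-": a - b, "*": a * b}[op]
        return str(result)
    return llm_answer  # gate closed: fall back to the model's own output

print(gated_answer("123456 * 789 =", llm_answer="(model guess)"))  # 97406784
```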
arXiv Detail & Related papers (2025-01-01T00:01:27Z) - Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs [76.43407125275202]
o1-like models can emulate human-like long-time thinking during inference.
This paper presents the first comprehensive study on the prevalent issue of overthinking in these models.
We propose strategies to mitigate overthinking, streamlining reasoning processes without compromising accuracy.
arXiv Detail & Related papers (2024-12-30T18:55:12Z) - Scaling Laws for Mixed quantization in Large Language Models [10.912306313183972]
Post-training quantization of Large Language Models (LLMs) has proven effective in reducing the computational requirements for running inference on these models.
In this study, we focus on a straightforward question: when aiming for a specific accuracy or perplexity target for low-precision quantization, how many high-precision numbers or calculations must be preserved as we scale LLMs to larger sizes?
arXiv Detail & Related papers (2024-10-09T09:45:01Z) - Interpreting and Improving Large Language Models in Arithmetic Calculation [72.19753146621429]
Large language models (LLMs) have demonstrated remarkable potential across numerous applications.
In this work, we delve into uncovering a specific mechanism by which LLMs execute calculations.
We investigate the potential benefits of selectively fine-tuning these essential heads/MLPs to boost the LLMs' computational performance.
arXiv Detail & Related papers (2024-09-03T07:01:46Z) - Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters [27.656263126925815]
We study the scaling of inference-time computation in LLMs.
We find that in both cases, the effectiveness of different approaches to scaling test-time compute critically varies depending on the difficulty of the prompt.
arXiv Detail & Related papers (2024-08-06T17:35:05Z) - Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE [68.6018458996143]
We propose QuEE, a more general dynamic network that can combine both quantization and early exiting.
Our algorithm can be seen as a form of soft early exiting or input-dependent compression.
The crucial factor of our approach is accurate prediction of the potential accuracy improvement achievable through further computation.
arXiv Detail & Related papers (2024-06-20T15:25:13Z) - The Fine-Grained Complexity of Gradient Computation for Training Large Language Models [12.853829771559916]
Large language models (LLMs) have made fundamental contributions over the last few years.
We show nearly identical results for the harder-seeming problem of computing the gradient of the loss function of a one-layer attention network.
arXiv Detail & Related papers (2024-02-07T00:45:31Z) - QuIP: 2-Bit Quantization of Large Language Models With Guarantees [44.212441764241]
This work studies post-training parameter quantization in large language models (LLMs).
We introduce quantization with incoherence processing (QuIP), a new method based on the insight that quantization benefits from $\textit{incoherent}$ weight and Hessian matrices.
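A minimal sketch of the incoherence idea, under the simplest possible assumptions: multiplying the weight matrix by random orthogonal matrices before quantization spreads out large entries so that no coordinate dominates. QuIP's actual method is an adaptive-rounding scheme guided by the Hessian; the round-to-nearest quantizer here is a deliberately naive stand-in.

```python
# Sketch of incoherence processing: rotate the weight matrix with random
# orthogonal matrices before quantizing, then rotate back. QuIP's real
# quantizer is adaptive rounding; round-to-nearest here is a naive stand-in.
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(n: int) -> np.ndarray:
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def quantize_incoherent(w: np.ndarray, n_levels: int = 4) -> np.ndarray:
    u = random_orthogonal(w.shape[0])
    v = random_orthogonal(w.shape[1])
    w_rot = u @ w @ v.T                    # incoherent representation
    scale = np.abs(w_rot).max() / (n_levels // 2)
    w_q = np.round(w_rot / scale) * scale  # naive round-to-nearest
    return u.T @ w_q @ v                   # rotate back for comparison

w = rng.standard_normal((8, 8))
err = np.linalg.norm(w - quantize_incoherent(w)) / np.linalg.norm(w)
print(f"relative quantization error: {err:.3f}")
```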
arXiv Detail & Related papers (2023-07-25T07:44:06Z) - Evaluating and Improving Tool-Augmented Computation-Intensive Math Reasoning [75.74103236299477]
Chain-of-thought prompting (CoT) and tool augmentation have been validated as effective practices for improving large language models.
We propose a new approach, DELI, that deliberates over the reasoning steps with tool interfaces.
Experimental results on CARP and six other datasets show that the proposed DELI mostly outperforms competitive baselines.
arXiv Detail & Related papers (2023-06-04T17:02:59Z) - Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z) - End-to-end resource analysis for quantum interior point methods and portfolio optimization [63.4863637315163]
We provide a complete quantum circuit-level description of the algorithm from problem input to problem output.
We report the number of logical qubits and the quantity/depth of non-Clifford T-gates needed to run the algorithm.
arXiv Detail & Related papers (2022-11-22T18:54:48Z) - Quantum Sparse Coding [5.130440339897477]
We develop a quantum-inspired algorithm for sparse coding.
The emergence of quantum computers and Ising machines can potentially lead to more accurate estimations.
We conduct numerical experiments with simulated data on LightSolver's quantum-inspired digital platform.
arXiv Detail & Related papers (2022-09-08T13:00:30Z) - Quantum Algorithms for Data Representation and Analysis [68.754953879193]
We provide quantum procedures that speed-up the solution of eigenproblems for data representation in machine learning.
The power and practical use of these subroutines is shown through new quantum algorithms, sublinear in the input matrix's size, for principal component analysis, correspondence analysis, and latent semantic analysis.
Results show that the run-time parameters that do not depend on the input's size are reasonable and that the error on the computed model is small, allowing for competitive classification performances.
arXiv Detail & Related papers (2021-04-19T00:41:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.