Inertial Confinement Fusion Forecasting via Large Language Models
- URL: http://arxiv.org/abs/2407.11098v3
- Date: Mon, 14 Oct 2024 22:47:47 GMT
- Title: Inertial Confinement Fusion Forecasting via Large Language Models
- Authors: Mingkai Chen, Taowen Wang, Shihui Cao, James Chenhao Liang, Chuan Liu, Chunshu Wu, Qifan Wang, Ying Nian Wu, Michael Huang, Chuang Ren, Ang Li, Tong Geng, Dongfang Liu
- Abstract summary: In this study, we introduce $\textbf{LPI-LLM}$, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms.
We propose the $\textit{LLM-anchored Reservoir}$, augmented with a $\textit{Fusion-specific Prompt}$, enabling accurate forecasting of $\texttt{LPI}$-generated hot-electron dynamics during implosion.
We also present $\textbf{LPI4AI}$, the first $\texttt{LPI}$ benchmark based on physical experiments.
- Score: 48.76222320245404
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Controlled fusion energy is deemed pivotal for the advancement of human civilization. In this study, we introduce $\textbf{LPI-LLM}$, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms tailored to address a critical challenge, Laser-Plasma Instabilities ($\texttt{LPI}$), in Inertial Confinement Fusion ($\texttt{ICF}$). Our approach offers several key contributions: Firstly, we propose the $\textit{LLM-anchored Reservoir}$, augmented with a $\textit{Fusion-specific Prompt}$, enabling accurate forecasting of $\texttt{LPI}$-generated hot-electron dynamics during implosion. Secondly, we develop $\textit{Signal-Digesting Channels}$ to describe the driver laser intensity temporally and spatially, capturing the unique characteristics of $\texttt{ICF}$ inputs. Lastly, we design the $\textit{Confidence Scanner}$ to quantify the confidence level in forecasting, providing valuable insights for domain experts to design the $\texttt{ICF}$ process. Extensive experiments demonstrate the superior performance of our method, achieving 1.90 CAE, 0.14 $\texttt{top-1}$ MAE, and 0.11 $\texttt{top-5}$ MAE in predicting Hard X-ray ($\texttt{HXR}$) energies emitted by the hot electrons in $\texttt{ICF}$ implosions, representing state-of-the-art results compared with the best concurrent systems. Additionally, we present $\textbf{LPI4AI}$, the first $\texttt{LPI}$ benchmark based on physical experiments, aimed at fostering novel ideas in $\texttt{LPI}$ research and enhancing the utility of LLMs in scientific exploration. Overall, our work strives to forge an innovative synergy between AI and $\texttt{ICF}$ for advancing fusion energy.
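The abstract describes the architecture only at a high level. As a rough illustration of how a classical reservoir (echo state network) with an auxiliary prompt-derived feature vector could forecast a scalar signal from a driver-intensity trace, here is a minimal NumPy sketch. The synthetic data, the fixed `prompt_embedding` placeholder (standing in for an LLM-anchored representation), and all hyperparameters are assumptions chosen for illustration; this is not the authors' LPI-LLM implementation, and it does not attempt the Signal-Digesting Channels or Confidence Scanner components.

```python
# Minimal echo-state-network sketch of "reservoir + auxiliary embedding" forecasting.
# Synthetic data and the fixed prompt_embedding are illustrative assumptions only;
# this is NOT the LPI-LLM implementation described in the paper.
import numpy as np

rng = np.random.default_rng(0)

# --- Synthetic stand-ins: a driver "laser intensity" trace and a toy target signal ---
T = 500
u = np.sin(np.linspace(0, 20 * np.pi, T)) + 0.1 * rng.standard_normal(T)  # driver trace
y = np.roll(u, -5) ** 2                                                    # toy "HXR-like" target

# --- Placeholder for an LLM-derived, fusion-specific prompt embedding (hypothetical) ---
prompt_embedding = rng.standard_normal(8)

# --- Classical reservoir (leaky echo state network) ---
n_res, leak, rho = 200, 0.3, 0.9
W_in = rng.uniform(-0.5, 0.5, size=(n_res, 1))
W = rng.standard_normal((n_res, n_res))
W *= rho / max(abs(np.linalg.eigvals(W)))  # rescale to the desired spectral radius

def run_reservoir(inputs):
    """Collect leaky-integrator reservoir states for a 1-D input sequence."""
    x = np.zeros(n_res)
    states = []
    for u_t in inputs:
        pre = W_in @ np.array([u_t]) + W @ x
        x = (1 - leak) * x + leak * np.tanh(pre)
        states.append(x.copy())
    return np.array(states)

states = run_reservoir(u)

# Readout features: reservoir state concatenated with the (broadcast) prompt embedding.
features = np.hstack([states, np.tile(prompt_embedding, (T, 1))])

# Ridge-regression readout trained on the first 400 steps, evaluated on the rest.
split, lam = 400, 1e-3
A, b = features[:split], y[:split]
W_out = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ b)

pred = features[split:] @ W_out
print(f"toy hold-out MAE: {np.mean(np.abs(pred - y[split:])):.4f}")
```

The design point the sketch mirrors is that only the linear readout is trained; the reservoir weights stay fixed, and any context vector (here a random placeholder) simply enters the readout features.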
Related papers
- Probabilistically Tightened Linear Relaxation-based Perturbation Analysis for Neural Network Verification [83.25968588249776]
We present a novel framework that combines over-approximation techniques from LiRPA-based approaches with a sampling-based method to compute tight intermediate reachable sets. With negligible computational overhead, $\texttt{PT-LiRPA}$ exploits the estimated reachable sets to significantly tighten the lower and upper linear bounds on a neural network's output.
arXiv Detail & Related papers (2025-07-07T18:45:53Z) - FLARE: Robot Learning with Implicit World Modeling [87.81846091038676]
$\textbf{FLARE}$ integrates predictive latent world modeling into robot policy learning. $\textbf{FLARE}$ achieves state-of-the-art performance, outperforming prior policy learning baselines by up to 26%. Our results establish $\textbf{FLARE}$ as a general and scalable approach for combining implicit world modeling with high-frequency robotic control.
arXiv Detail & Related papers (2025-05-21T15:33:27Z) - InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion [36.27704594180795]
InfiGFusion is a structure-aware fusion framework with a novel $\textit{Graph-on-Logits Distillation}$ (GLD) loss. We show that GLD consistently improves fusion quality and stability. It shows particular strength in complex reasoning tasks, with +35.6 improvement on Multistep Arithmetic and +37.06 on Causal Judgement over SFT.
arXiv Detail & Related papers (2025-05-20T03:55:35Z) - $\text{M}^{\text{3}}$: A Modular World Model over Streams of Tokens [51.65485693709418]
Token-based world models emerged as a promising modular framework, modeling dynamics over token streams while optimizing tokenization separately.
In this paper, we introduce $\text{M}^3$, a $\textbf{m}$odular $\textbf{w}$orld $\textbf{m}$odel that extends this framework.
$\text{M}^3$ incorporates several improvements from the existing literature to enhance agent performance.
arXiv Detail & Related papers (2025-02-17T08:06:10Z) - $H^3$Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs [7.498844064516196]
Alignment of pretrained LLMs using instruction-based datasets is critical for creating fine-tuned models that reflect human preference.
This paper develops an alignment fusion approach, coined as $H^3$Fusion, with three unique characteristics.
It outperforms each individually aligned model by 11.37%, and it provides stronger robustness than state-of-the-art LLM ensemble approaches by 13.77%.
arXiv Detail & Related papers (2024-11-26T17:42:38Z) - LEVIS: Large Exact Verifiable Input Spaces for Neural Networks [8.673606921201442]
The robustness of neural networks is paramount in safety-critical applications.
We introduce a novel framework, $\texttt{LEVIS}$, comprising $\texttt{LEVIS}$-$\alpha$ and $\texttt{LEVIS}$-$\beta$.
We offer a theoretical analysis elucidating the properties of the verifiable balls acquired through $\texttt{LEVIS}$-$\alpha$ and $\texttt{LEVIS}$-$\beta$.
arXiv Detail & Related papers (2024-08-16T16:15:57Z) - Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors [58.661454334877256]
Drug-Target binding Affinity (DTA) prediction is essential for drug discovery.
Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal.
We propose $k$NN-DTA, a non-representation embedding-based retrieval method applied to a pre-trained DTA prediction model.
arXiv Detail & Related papers (2024-07-21T15:49:05Z) - Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG [57.14250086701313]
We investigate the extent to which modern LMs generate $n$-grams from their training data.
We develop Rusty-DAWG, a novel search tool inspired by indexing of genomic data.
arXiv Detail & Related papers (2024-06-18T21:31:19Z) - Linear Contextual Bandits with Hybrid Payoff: Revisited [0.8287206589886881]
We study the Linear Contextual Bandit problem in the hybrid reward setting.
In this setting, every arm's reward model contains arm-specific parameters in addition to parameters shared across the reward models of all the arms.
arXiv Detail & Related papers (2024-06-14T15:41:21Z) - Creating an AI Observer: Generative Semantic Workspaces [4.031100721019478]
We introduce the $\textbf{[G]}$enerative $\textbf{[S]}$emantic $\textbf{[W]}$orkspace (GSW).
GSW creates a generative-style semantic framework, as opposed to a traditionally predefined set of lexicon labels.
arXiv Detail & Related papers (2024-06-07T00:09:13Z) - Transfer Q Star: Principled Decoding for LLM Alignment [105.89114186982972]
Transfer $Q^*$ estimates the optimal value function for a target reward $r$ through a baseline model.
Our approach significantly reduces the sub-optimality gap observed in prior SoTA methods.
arXiv Detail & Related papers (2024-05-30T21:36:12Z) - RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis [84.57932472551889]
RALL-E is a robust language modeling method for text-to-speech synthesis.
RALL-E improves the WER of zero-shot TTS from 5.6% (without reranking) to 2.5% and 1.0%, respectively.
arXiv Detail & Related papers (2024-04-04T05:15:07Z) - Mechanics of Next Token Prediction with Self-Attention [41.82477691012942]
Transformer-based language models are trained on large datasets to predict the next token given an input sequence.
We show that training self-attention with gradient descent learns an automaton which generates the next token in two distinct steps.
We hope that these findings shed light on how self-attention processes sequential data and pave the path toward demystifying more complex architectures.
arXiv Detail & Related papers (2024-03-12T21:15:38Z) - Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens [138.36729703589512]
We show that $n$-gram language models are still relevant in this era of neural large language models (LLMs).
This was done by modernizing $n$-gram LMs in two aspects. First, we train them at the same data scale as neural LLMs -- 5 trillion tokens.
Second, existing $n$-gram LMs use small $n$, which hinders their performance; we instead allow $n$ to be arbitrarily large by introducing a new $\infty$-gram LM with backoff.
arXiv Detail & Related papers (2024-01-30T19:03:49Z) - Accelerating superconductor discovery through tempered deep learning of
the electron-phonon spectral function [0.0]
We train a deep learning model to predict the electron-phonon spectral function, $\alpha^2F(\omega)$.
We then incorporate domain knowledge of the site-projected phonon density states to impose inductive bias into the model's node attributes and enhance predictions.
This methodological innovation decreases the MAE to 0.18, 29 K, and 28 K, respectively, yielding an MAE of 2.1 K for $T_c$.
arXiv Detail & Related papers (2024-01-29T22:44:28Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative
Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Reward-Mixing MDPs with a Few Latent Contexts are Learnable [75.17357040707347]
We consider episodic reinforcement learning in reward-mixing Markov decision processes (RMMDPs).
Our goal is to learn a near-optimal policy that nearly maximizes the $H$ time-step cumulative rewards in such a model.
arXiv Detail & Related papers (2022-10-05T22:52:00Z) - Minimax-Optimal Multi-Agent RL in Zero-Sum Markov Games With a
Generative Model [50.38446482252857]
Two-player zero-sum Markov games are arguably the most basic setting in multi-agent reinforcement learning.
We develop a learning algorithm that learns an $\varepsilon$-approximate Markov NE policy using $\widetilde{O}(\cdot)$ samples.
We derive a refined regret bound for FTRL that makes explicit the role of variance-type quantities.
arXiv Detail & Related papers (2022-08-22T17:24:55Z) - Interpretable AI forecasting for numerical relativity waveforms of
quasi-circular, spinning, non-precessing binary black hole mergers [1.4438155481047366]
We present a deep-learning artificial intelligence model capable of learning and forecasting the late-inspiral, merger and ringdown of numerical relativity waveforms.
We harnessed the Theta supercomputer at the Argonne Leadership Computing Facility to train our AI model using a training set of 1.5 million waveforms.
Our findings show that artificial intelligence can accurately forecast the dynamical evolution of numerical relativity waveforms.
arXiv Detail & Related papers (2021-10-13T18:14:52Z) - Mixture weights optimisation for Alpha-Divergence Variational Inference [0.0]
This paper focuses on $\alpha$-divergence minimisation methods for Variational Inference.
Power Descent, defined for all $\alpha \neq 1$, is one such algorithm.
First-order approximations allow us to introduce the Renyi Descent, a novel algorithm.
arXiv Detail & Related papers (2021-06-09T14:47:05Z) - ExpFinder: An Ensemble Expert Finding Model Integrating $N$-gram Vector
Space Model and $\mu$CO-HITS [0.3560086794419991]
$\textit{ExpFinder}$ is a new ensemble model for expert finding.
It integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\mu$CO-HITS.
arXiv Detail & Related papers (2021-01-18T00:44:21Z)