Inertial Confinement Fusion Forecasting via Large Language Models
- URL: http://arxiv.org/abs/2407.11098v3
- Date: Mon, 14 Oct 2024 22:47:47 GMT
- Title: Inertial Confinement Fusion Forecasting via Large Language Models
- Authors: Mingkai Chen, Taowen Wang, Shihui Cao, James Chenhao Liang, Chuan Liu, Chunshu Wu, Qifan Wang, Ying Nian Wu, Michael Huang, Chuang Ren, Ang Li, Tong Geng, Dongfang Liu
- Abstract summary: In this study, we introduce $\textbf{LPI-LLM}$, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms.
We propose the $\textit{LLM-anchored Reservoir}$, augmented with a $\textit{Fusion-specific Prompt}$, enabling accurate forecasting of $\texttt{LPI}$-generated hot-electron dynamics during implosion.
We also present $\textbf{LPI4AI}$, the first $\texttt{LPI}$ benchmark based on physical experiments.
- Score: 48.76222320245404
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Controlled fusion energy is deemed pivotal for the advancement of human civilization. In this study, we introduce $\textbf{LPI-LLM}$, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms, tailored to address a critical challenge, Laser-Plasma Instabilities ($\texttt{LPI}$), in Inertial Confinement Fusion ($\texttt{ICF}$). Our approach offers several key contributions. First, we propose the $\textit{LLM-anchored Reservoir}$, augmented with a $\textit{Fusion-specific Prompt}$, enabling accurate forecasting of $\texttt{LPI}$-generated hot-electron dynamics during implosion. Second, we develop $\textit{Signal-Digesting Channels}$ to characterize the driver laser intensity across time and space, capturing the unique characteristics of $\texttt{ICF}$ inputs. Last, we design the $\textit{Confidence Scanner}$ to quantify confidence in each forecast, providing valuable guidance for domain experts designing the $\texttt{ICF}$ process. Extensive experiments demonstrate the superior performance of our method, which achieves 1.90 CAE, 0.14 $\texttt{top-1}$ MAE, and 0.11 $\texttt{top-5}$ MAE in predicting Hard X-ray ($\texttt{HXR}$) energies emitted by hot electrons in $\texttt{ICF}$ implosions, state-of-the-art results against the concurrent best systems. Additionally, we present $\textbf{LPI4AI}$, the first $\texttt{LPI}$ benchmark based on physical experiments, aimed at fostering novel ideas in $\texttt{LPI}$ research and enhancing the utility of LLMs in scientific exploration. Overall, our work strives to forge an innovative synergy between AI and $\texttt{ICF}$ for advancing fusion energy.
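The abstract gives no implementation detail. Purely as a hedged sketch of the general recipe it describes (a classical echo-state reservoir whose state update is anchored by a prompt-derived embedding), the following is one way the pieces could fit together; the `prompt_embedding` stub, the hyperparameters, and the ridge readout are illustrative assumptions, not the authors' LPI-LLM.

```python
import numpy as np

rng = np.random.default_rng(0)

def prompt_embedding(prompt: str, dim: int = 32) -> np.ndarray:
    """Stand-in for an LLM embedding of a fusion-specific prompt
    (hypothetical: the paper anchors the reservoir with a real LLM)."""
    seed = sum(map(ord, prompt)) % (2**32)
    return np.random.default_rng(seed).standard_normal(dim)

class EchoStateReservoir:
    """Classical echo-state reservoir; the prompt vector enters every
    state update, loosely mirroring an 'LLM-anchored' reservoir."""
    def __init__(self, n_in, n_res=200, n_prompt=32, rho=0.9, seed=0):
        g = np.random.default_rng(seed)
        W = g.standard_normal((n_res, n_res))
        self.W = rho * W / np.max(np.abs(np.linalg.eigvals(W)))  # set spectral radius
        self.W_in = g.standard_normal((n_res, n_in))
        self.W_p = g.standard_normal((n_res, n_prompt))
        self.W_out = None

    def _states(self, U, p):
        x = np.zeros(self.W.shape[0])
        X = []
        for u in U:  # U: (T, n_in) driver-laser features over time
            x = np.tanh(self.W @ x + self.W_in @ u + self.W_p @ p)
            X.append(x.copy())
        return np.asarray(X)

    def fit(self, U, Y, p, ridge=1e-3):
        X = self._states(U, p)
        self.W_out = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)

    def predict(self, U, p):
        return self._states(U, p) @ self.W_out

# Toy usage on synthetic data (not ICF measurements).
T = 500
U = rng.standard_normal((T, 3))                          # 3 drive channels
Y = np.cumsum(U @ np.array([0.5, -0.2, 0.1]))[:, None]   # synthetic target
p = prompt_embedding("ICF shot: forecast HXR energy from drive intensity")
esn = EchoStateReservoir(n_in=3)
esn.fit(U[:400], Y[:400], p)
pred = esn.predict(U[400:], p)                           # forecast held-out window
```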
Related papers
- $\text{M}^{\text{3}}$: A Modular World Model over Streams of Tokens [51.65485693709418]
Token-based world models emerged as a promising modular framework, modeling dynamics over token streams while optimizing tokenization separately.
In this paper, we introduce $\text{M}^{\text{3}}$, a $\textbf{m}$odular $\textbf{w}$orld $\textbf{m}$odel that extends this framework.
$\text{M}^{\text{3}}$ incorporates several improvements from the existing literature to enhance agent performance.
arXiv Detail & Related papers (2025-02-17T08:06:10Z) - $H^3$Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs [7.498844064516196]
Alignment of pretrained LLMs using instruction-based datasets is critical for creating fine-tuned models that reflect human preference.
This paper develops an alignment fusion approach, coined $H^3$Fusion, with three unique characteristics.
It outperforms each individually aligned model by $11.37\%$ and provides $13.77\%$ stronger robustness than state-of-the-art LLM ensemble approaches.
arXiv Detail & Related papers (2024-11-26T17:42:38Z) - LEVIS: Large Exact Verifiable Input Spaces for Neural Networks [8.673606921201442]
The robustness of neural networks is paramount in safety-critical applications.
We introduce a novel framework, $\texttt{LEVIS}$, comprising $\texttt{LEVIS}$-$\alpha$ and $\texttt{LEVIS}$-$\beta$.
We offer a theoretical analysis elucidating the properties of the verifiable balls acquired through $\texttt{LEVIS}$-$\alpha$ and $\texttt{LEVIS}$-$\beta$.
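The abstract does not spell out the algorithms. A common primitive behind "verifiable balls" is searching for the largest input radius a verifier will certify; the sketch below assumes a black-box, monotone `verifies(radius)` oracle and is an illustration of that primitive, not the paper's LEVIS-$\alpha$/$\beta$ procedures.

```python
def largest_verified_radius(verifies, r_max, tol=1e-4):
    """Binary search for the largest radius r in [0, r_max] that a
    black-box verifier certifies, assuming certification is monotone
    (a ball verified at radius r contains every smaller verified ball)."""
    if not verifies(0.0):            # even the center point fails
        return 0.0
    lo, hi = 0.0, r_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if verifies(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Toy oracle: certifies any radius below 0.3 around a fixed input.
print(largest_verified_radius(lambda r: r < 0.3, r_max=1.0))  # ~0.3
```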
arXiv Detail & Related papers (2024-08-16T16:15:57Z) - Evaluating $n$-Gram Novelty of Language Models Using Rusty-DAWG [57.14250086701313]
We investigate the extent to which modern LMs generate $n$-grams from their training data.
We develop Rusty-DAWG, a novel search tool inspired by indexing of genomic data.
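Rusty-DAWG answers these membership queries at corpus scale with a compressed automaton; the naive set-based version of the same $n$-gram novelty statistic, over token lists, is just a few lines (a sketch for exposition, not the tool itself):

```python
def ngram_novelty(generated, training, n):
    """Fraction of n-grams in `generated` that never occur in `training`.
    Naive equivalent of what a DAWG index answers at scale."""
    grams = lambda seq: {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}
    gen, train = grams(generated), grams(training)
    return len(gen - train) / max(len(gen), 1)

# Example: 3-gram novelty of a short continuation.
train = "the cat sat on the mat".split()
gen = "the cat sat on the rug".split()
print(ngram_novelty(gen, train, 3))  # 0.25: one of four 3-grams is new
```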
arXiv Detail & Related papers (2024-06-18T21:31:19Z) - Linear Contextual Bandits with Hybrid Payoff: Revisited [0.8287206589886881]
We study the Linear Contextual Bandit problem in the hybrid reward setting.
In this setting, every arm's reward model contains arm-specific parameters in addition to parameters shared across the reward models of all arms, as formalized below.
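For reference, the hybrid linear model (in the sense of LinUCB with hybrid payoffs, which this setting follows; the notation here is assumed, not quoted) writes the expected reward of arm $a$ at round $t$ as a shared term plus an arm-specific term:

$$\mathbb{E}\left[r_{t,a}\right] = z_{t,a}^{\top}\beta^{*} + x_{t,a}^{\top}\theta_{a}^{*},$$

where $\beta^{*}$ is shared by all arms, $\theta_{a}^{*}$ is specific to arm $a$, and $z_{t,a}$, $x_{t,a}$ are the corresponding context features.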
arXiv Detail & Related papers (2024-06-14T15:41:21Z) - Transfer Q Star: Principled Decoding for LLM Alignment [105.89114186982972]
Transfer $Q^*$ estimates the optimal value function for a target reward $r$ through a baseline model.
Our approach significantly reduces the sub-optimality gap observed in prior SoTA methods.
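The summary leaves the decoding rule implicit. The standard KL-regularized identity that value-guided decoding of this kind builds on (notation assumed here, not quoted from the paper) reweights the baseline policy token by token:

$$\pi^{*}(y_t \mid s_t) \;\propto\; \pi_{\text{base}}(y_t \mid s_t)\,\exp\!\left(\frac{Q^{*}(s_t, y_t)}{\beta}\right),$$

where $Q^{*}$ is the optimal value function for the target reward $r$ and $\beta$ trades reward against divergence from the baseline; per the summary above, Transfer $Q^*$'s contribution is estimating $Q^{*}$ through the baseline model.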
arXiv Detail & Related papers (2024-05-30T21:36:12Z) - RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis [84.57932472551889]
RALL-E is a robust language modeling method for text-to-speech synthesis.
RALL-E improves the WER of zero-shot TTS from $5.6\%$ (without reranking) to $2.5\%$ and $1.0\%$, respectively.
arXiv Detail & Related papers (2024-04-04T05:15:07Z) - Mechanics of Next Token Prediction with Self-Attention [41.82477691012942]
Transformer-based language models are trained on large datasets to predict the next token given an input sequence.
We show that training self-attention with gradient descent learns an automaton which generates the next token in two distinct steps.
We hope that these findings shed light on how self-attention processes sequential data and pave the path toward demystifying more complex architectures.
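As a concrete, hypothetical rendering of that two-step picture (an illustration of the decomposition, not the paper's construction), the sketch below splits one self-attention readout into retrieval scores over prior positions and a soft convex composition of their embeddings; lowering the temperature makes the retrieval step effectively hard.

```python
import numpy as np

def attention_two_step(E, W_q, W_k, temp=1.0):
    """One self-attention readout split into two steps:
      (1) retrieval: score every position against the last token's query;
      (2) composition: convex combination of the scored embeddings."""
    q = E[-1] @ W_q                    # query from the last token
    scores = (E @ W_k) @ q / temp      # (1) retrieval scores per position
    w = np.exp(scores - scores.max())
    w /= w.sum()                       # softmax -> convex weights
    return w @ E                       # (2) soft composition

rng = np.random.default_rng(1)
E = rng.standard_normal((6, 8))        # 6 tokens, dim-8 embeddings
W_q, W_k = rng.standard_normal((8, 8)), rng.standard_normal((8, 8))
out = attention_two_step(E, W_q, W_k, temp=0.1)  # low temp ~= hard retrieval
```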
arXiv Detail & Related papers (2024-03-12T21:15:38Z) - Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens [138.36729703589512]
We show that $n$-gram language models are still relevant in this era of neural large language models (LLMs).
We do so by modernizing $n$-gram LMs in two aspects. First, we train them at the same data scale as neural LLMs -- 5 trillion tokens.
Second, existing $n$-gram LMs use a small $n$, which hinders their performance; we instead allow $n$ to be arbitrarily large by introducing a new $\infty$-gram LM with backoff.
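The backoff idea can be stated in a few lines. Infini-gram itself answers these queries with a suffix-array index over trillions of tokens; the naive scan below, with illustrative names, is a sketch for exposition only.

```python
def infty_gram_next(corpus, context):
    """Back off to the longest suffix of `context` that occurs in `corpus`
    with a following token, then return counts of those next tokens
    (naive linear scan; the real system uses a suffix-array index)."""
    for start in range(len(context)):           # try the longest suffix first
        suffix = context[start:]
        counts = {}
        for i in range(len(corpus) - len(suffix)):
            if corpus[i:i + len(suffix)] == suffix:
                nxt = corpus[i + len(suffix)]
                counts[nxt] = counts.get(nxt, 0) + 1
        if counts:                              # matched: no further backoff
            return suffix, counts
    return [], {}

corpus = "a b a b c a b".split()
ctx = "c a b".split()
print(infty_gram_next(corpus, ctx))  # backs off to ['a', 'b']: {'a': 1, 'c': 1}
```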
arXiv Detail & Related papers (2024-01-30T19:03:49Z) - ExpFinder: An Ensemble Expert Finding Model Integrating $N$-gram Vector Space Model and $\mu$CO-HITS [0.3560086794419991]
$\textit{ExpFinder}$ is a new ensemble model for expert finding.
It integrates a novel $N$-gram vector space model, denoted as $n$VSM, and a graph-based model, denoted as $\textit{$\mu$CO-HITS}$.
arXiv Detail & Related papers (2021-01-18T00:44:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.