FM4NPP: A Scaling Foundation Model for Nuclear and Particle Physics
- URL: http://arxiv.org/abs/2508.14087v1
- Date: Wed, 13 Aug 2025 15:05:06 GMT
- Title: FM4NPP: A Scaling Foundation Model for Nuclear and Particle Physics
- Authors: David Park, Shuhang Li, Yi Huang, Xihaier Luo, Haiwang Yu, Yeonju Go, Christopher Pinkenburg, Yuewei Lin, Shinjae Yoo, Joseph Osborn, Jin Huang, Yihui Ren
- Abstract summary: We introduce a new dataset with more than 11 million particle collision events and a suite of downstream tasks and labeled data for evaluation. We propose a novel self-supervised training method for detector data and demonstrate its neural scalability with models that feature up to 188 million parameters. With frozen weights and task-specific adapters, this FM consistently outperforms baseline models across all downstream tasks.
- Score: 9.522345388801563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models have revolutionized artificial intelligence by enabling large, generalizable models trained through self-supervision. This paradigm has inspired the development of scientific foundation models (FMs). However, applying this capability to experimental particle physics is challenging due to the sparse, spatially distributed nature of detector data, which differs dramatically from natural language. This work addresses whether an FM for particle physics can scale and generalize across diverse tasks. We introduce a new dataset with more than 11 million particle collision events and a suite of downstream tasks and labeled data for evaluation. We propose a novel self-supervised training method for detector data and demonstrate its neural scalability with models that feature up to 188 million parameters. With frozen weights and task-specific adapters, this FM consistently outperforms baseline models across all downstream tasks. The model also exhibits robust, data-efficient adaptation. Further analysis reveals that the representations extracted by the FM are task-agnostic but can be specialized via a single linear mapping for different downstream tasks.
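The adaptation recipe described in the abstract — frozen FM weights with small task-specific adapters, and representations that specialize via a single linear mapping — can be illustrated with a minimal linear-probe sketch. The backbone, feature dimension, pooling, and task head below are placeholder choices, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class FrozenFMWithAdapter(nn.Module):
    """Minimal sketch: a frozen pretrained encoder with a small trainable task adapter.

    `backbone` stands in for the pretrained foundation model; `feat_dim` and
    `num_outputs` are placeholder hyperparameters for a downstream task.
    """
    def __init__(self, backbone: nn.Module, feat_dim: int, num_outputs: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():    # freeze FM weights
            p.requires_grad = False
        # Task-specific adapter: a single linear mapping on pooled features.
        self.adapter = nn.Linear(feat_dim, num_outputs)

    def forward(self, x):
        with torch.no_grad():                   # features are extracted, not fine-tuned
            feats = self.backbone(x)            # (batch, hits, feat_dim)
        pooled = feats.mean(dim=1)              # simple mean pooling over detector hits
        return self.adapter(pooled)

# Usage sketch with a stand-in backbone (a real FM would replace this).
backbone = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, 128))
model = FrozenFMWithAdapter(backbone, feat_dim=128, num_outputs=3)
hits = torch.randn(4, 50, 16)                   # 4 events, 50 hits, 16 raw features each
print(model(hits).shape)                        # torch.Size([4, 3])
```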
Related papers
- PhysiX: A Foundation Model for Physics Simulations [27.359872113159405]
We introduce PhysiX, the first large-scale foundation model for physics simulation. We show that PhysiX effectively addresses the data bottleneck, outperforming task-specific baselines. Our results indicate that knowledge learned from natural videos can be successfully transferred to physics simulation.
arXiv Detail & Related papers (2025-06-21T18:10:12Z)
- Can Test-Time Scaling Improve World Foundation Model? [67.82670175383761]
We introduce SWIFT, a test-time scaling framework tailored for world foundation models (WFMs). SWIFT integrates our WFM evaluation toolkit with process-level inference strategies, including fast tokenization, probability-based Top-K pruning, and efficient beam search. Our findings reveal that test-time scaling laws hold for WFMs and that SWIFT provides a scalable and effective pathway for improving WFM inference without retraining or increasing model size.
arXiv Detail & Related papers (2025-03-31T17:07:37Z)
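The SWIFT entry above mentions probability-based Top-K pruning and beam search as process-level inference strategies. The sketch below shows that generic pattern for an arbitrary autoregressive model; the `step_logprobs` stand-in, toy vocabulary, and beam settings are assumptions, not the SWIFT toolkit itself.

```python
import torch

def step_logprobs(tokens: torch.Tensor) -> torch.Tensor:
    """Placeholder next-token model: returns log-probs over a toy vocabulary.

    A world foundation model would be called here instead.
    """
    vocab_size = 8
    logits = torch.randn(tokens.shape[0], vocab_size)
    return torch.log_softmax(logits, dim=-1)

def beam_search_topk(prompt: torch.Tensor, steps: int, beam_width: int, top_k: int):
    """Beam search with probability-based Top-K pruning of candidate continuations."""
    beams = [(prompt, 0.0)]                      # (token sequence, cumulative log-prob)
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            logp = step_logprobs(seq.unsqueeze(0))[0]
            # Top-K pruning: only the K most probable next tokens are expanded.
            topv, topi = torch.topk(logp, k=top_k)
            for v, i in zip(topv.tolist(), topi.tolist()):
                candidates.append((torch.cat([seq, torch.tensor([i])]), score + v))
        # Keep the best `beam_width` partial sequences by cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

best_seq, best_score = beam_search_topk(torch.tensor([0]), steps=5, beam_width=3, top_k=4)
print(best_seq.tolist(), best_score)
```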
- DINAMO: Dynamic and INterpretable Anomaly MOnitoring for Large-Scale Particle Physics Experiments [0.0]
We present DINAMO: a novel, interpretable, robust, and scalable DQM framework. Our approach constructs evolving histogram templates with built-in uncertainties. The statistical variant is being commissioned in the LHCb experiment at the Large Hadron Collider.
arXiv Detail & Related papers (2025-01-31T15:51:41Z)
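DINAMO's summary above centers on evolving histogram templates with built-in uncertainties. The exponentially weighted template and pull-based check below are a simplified sketch of that idea under assumed update rules, not the published algorithm.

```python
import numpy as np

class EvolvingTemplate:
    """Simplified sketch of an adaptive histogram template with per-bin uncertainties."""

    def __init__(self, n_bins: int, decay: float = 0.9):
        self.decay = decay                    # how quickly old runs are forgotten
        self.mean = np.zeros(n_bins)          # per-bin template value
        self.var = np.ones(n_bins)            # per-bin variance (template uncertainty)
        self.initialized = False

    def update(self, hist: np.ndarray) -> None:
        """Fold a new good run into the template with exponential weighting."""
        if not self.initialized:
            self.mean, self.initialized = hist.astype(float), True
            return
        delta = hist - self.mean
        self.mean += (1.0 - self.decay) * delta
        self.var = self.decay * self.var + (1.0 - self.decay) * delta**2

    def pulls(self, hist: np.ndarray) -> np.ndarray:
        """Per-bin pulls of a new run with respect to the current template."""
        return (hist - self.mean) / np.sqrt(self.var + 1e-9)

    def is_anomalous(self, hist: np.ndarray, threshold: float = 5.0) -> bool:
        return bool(np.any(np.abs(self.pulls(hist)) > threshold))

# Usage: reference runs drift slowly; an anomalous run distorts one region.
rng = np.random.default_rng(0)
template = EvolvingTemplate(n_bins=20)
for run in range(50):
    template.update(rng.poisson(100 + run, size=20))   # slow drift is absorbed
bad_run = rng.poisson(100 + 50, size=20)
bad_run[5:8] *= 3                                       # localized distortion
print(template.is_anomalous(bad_run))                   # expected: True
```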
- Pretraining Billion-scale Geospatial Foundational Models on Frontier [0.16492989697868893]
Foundation Models (FMs) are trained with internet-scale unlabeled data via self-supervised learning.
We investigate billion scale FMs and HPC training profiles for geospatial applications by pretraining on publicly available data.
Our larger 3B-parameter model achieves up to 30% improvement in top-1 scene classification accuracy.
arXiv Detail & Related papers (2024-04-17T19:16:32Z)
- Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models [4.299997052226609]
Masked particle modeling (MPM) is a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs.
We study the efficacy of the method in samples of high energy jets at collider physics experiments.
arXiv Detail & Related papers (2024-01-24T15:46:32Z)
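Masked particle modeling, as summarized above, hides part of an unordered particle set and trains a model to reconstruct it. The sketch below illustrates that pattern; the tiny permutation-equivariant encoder, feature sizes, and masking fraction are placeholders rather than the architecture used in the paper.

```python
import torch
import torch.nn as nn

class SetMaskedModel(nn.Module):
    """Minimal masked-modeling sketch for unordered particle sets (placeholder sizes)."""

    def __init__(self, feat_dim: int = 4, d_model: int = 64):
        super().__init__()
        self.embed = nn.Linear(feat_dim, d_model)
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        # No positional encoding: the set of particles is treated as unordered.
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, feat_dim)   # reconstruct masked particle features

    def forward(self, particles: torch.Tensor, mask_frac: float = 0.3):
        b, n, _ = particles.shape
        tokens = self.embed(particles)
        mask = torch.rand(b, n) < mask_frac        # choose particles to hide
        # Replace hidden particles with a learned mask token.
        tokens = torch.where(mask.unsqueeze(-1), self.mask_token.expand_as(tokens), tokens)
        pred = self.head(self.encoder(tokens))
        # Loss is computed only on the masked particles.
        return nn.functional.mse_loss(pred[mask], particles[mask])

model = SetMaskedModel()
jets = torch.randn(8, 30, 4)    # 8 jets, 30 particles, 4 kinematic features each
print(model(jets).item())
```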
- Learning from models beyond fine-tuning [78.20895343699658]
Learn From Model (LFM) focuses on the research, modification, and design of foundation models (FM) based on the model interface. The study of LFM techniques can be broadly categorized into five major areas: model tuning, model distillation, model reuse, meta learning and model editing. This paper gives a comprehensive review of the current methods based on FM from the perspective of LFM.
arXiv Detail & Related papers (2023-10-12T10:20:36Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- The Diminishing Returns of Masked Language Models to Science [0.7549732580284559]
We evaluate the impact of training data, model size, pretraining and finetuning time on 12 downstream scientific tasks.
We find that increasing model sizes, training data, or compute time does not always lead to significant improvements.
arXiv Detail & Related papers (2022-05-23T14:35:08Z)
- MoEfication: Conditional Computation of Transformer Models for Efficient Inference [66.56994436947441]
Transformer-based pre-trained language models can achieve superior performance on most NLP tasks thanks to their large parameter capacity, but this capacity also incurs huge computation costs.
We explore accelerating large-model inference through conditional computation based on the sparse-activation phenomenon.
We propose to transform a large model into its mixture-of-experts (MoE) version with equal model size, namely MoEfication.
arXiv Detail & Related papers (2021-10-05T02:14:38Z)
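MoEfication, as summarized above, converts a trained dense feed-forward block into a mixture-of-experts of the same total size by grouping its hidden neurons into experts and computing only the most relevant groups per token. The sketch below uses a naive contiguous neuron split and an activation-based router as stand-ins for the paper's grouping and routing strategies.

```python
import torch
import torch.nn as nn

class MoEfiedFFN(nn.Module):
    """Sketch of splitting a trained FFN's hidden neurons into expert groups.

    The neuron partition here is a simple contiguous split; the original method
    uses smarter grouping and routing, so this is only an illustration of the idea.
    """

    def __init__(self, ffn_in: nn.Linear, ffn_out: nn.Linear, n_experts: int, top_k: int = 2):
        super().__init__()
        hidden = ffn_in.out_features
        assert hidden % n_experts == 0
        self.chunk = hidden // n_experts
        self.n_experts, self.top_k = n_experts, top_k
        self.ffn_in, self.ffn_out = ffn_in, ffn_out

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-activations of every hidden neuron, grouped by expert.
        h = self.ffn_in(x)                                        # (batch, hidden)
        groups = h.view(x.shape[0], self.n_experts, self.chunk)   # (batch, E, chunk)
        # Route each token to the experts whose neurons respond most strongly.
        scores = groups.relu().sum(dim=-1)                        # (batch, E)
        _, top = torch.topk(scores, k=self.top_k, dim=-1)
        keep = torch.zeros_like(scores).scatter(1, top, 1.0)      # 1 for selected experts
        # Zero out non-selected experts; a real implementation would skip their compute.
        sparse_h = (groups * keep.unsqueeze(-1)).view(x.shape[0], -1)
        return self.ffn_out(sparse_h.relu())

ffn_in, ffn_out = nn.Linear(32, 128), nn.Linear(128, 32)
moe = MoEfiedFFN(ffn_in, ffn_out, n_experts=8, top_k=2)
print(moe(torch.randn(4, 32)).shape)          # torch.Size([4, 32])
```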
- Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling [86.9726984929758]
We focus on the integration of incomplete physics models into deep generative models.
We propose a VAE architecture in which a part of the latent space is grounded by physics.
We demonstrate generative performance improvements over a set of synthetic and real-world datasets.
arXiv Detail & Related papers (2021-02-25T20:28:52Z)
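The physics-integrated VAE entry above grounds part of the latent space in a physics model. The sketch below splits the latent into a physics-grounded part decoded by a toy oscillator and an auxiliary part decoded by a neural network; the toy physics, dimensions, and loss weighting are assumptions, not the paper's models.

```python
import math
import torch
import torch.nn as nn

class PhysicsIntegratedVAE(nn.Module):
    """Sketch: part of the latent space is decoded by a known (toy) physics model,
    the rest by a neural network that corrects what the physics misses."""

    def __init__(self, n_t: int = 50, z_aux: int = 4):
        super().__init__()
        self.t = torch.linspace(0, 1, n_t)
        self.encoder = nn.Sequential(nn.Linear(n_t, 64), nn.ReLU(),
                                     nn.Linear(64, 2 * (2 + z_aux)))   # mean and log-var
        self.nn_decoder = nn.Sequential(nn.Linear(z_aux, 64), nn.ReLU(),
                                        nn.Linear(64, n_t))

    def physics_decoder(self, z_phys: torch.Tensor) -> torch.Tensor:
        # Toy physics: oscillation with latent amplitude and frequency.
        amp, freq = z_phys[:, :1], z_phys[:, 1:2].abs()
        return amp * torch.sin(2 * math.pi * freq * self.t)

    def forward(self, x: torch.Tensor):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization
        z_phys, z_aux = z[:, :2], z[:, 2:]                        # grounded vs. auxiliary latent
        recon = self.physics_decoder(z_phys) + self.nn_decoder(z_aux)
        recon_loss = nn.functional.mse_loss(recon, x)
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon_loss + kl

vae = PhysicsIntegratedVAE()
signals = torch.sin(4 * math.pi * torch.linspace(0, 1, 50)).repeat(8, 1)
print(vae(signals).item())
```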
This list is automatically generated from the titles and abstracts of the papers on this site.