Related papers: Neural Scaling Laws for Boosted Jet Tagging

URL: http://arxiv.org/abs/2602.15781v1
Date: Tue, 17 Feb 2026 18:13:01 GMT
Title: Neural Scaling Laws for Boosted Jet Tagging
Authors: Matthias Vigl, Nicole Hartman, Michael Kagan, Lukas Heinrich
Abstract summary: Scaling compute, through joint increases in model capacity and dataset size, is the primary driver of performance in modern machine learning. We derive compute-optimal scaling laws and identify an effective performance limit that can be consistently approached through increased compute. We then study how the scaling coefficients and performance limits vary with the choice of input features and particle multiplicity.
Score: 0.22399170518036912
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The success of Large Language Models (LLMs) has established that scaling compute, through joint increases in model capacity and dataset size, is the primary driver of performance in modern machine learning. While machine learning has long been an integral component of High Energy Physics (HEP) data analysis workflows, the compute used to train state-of-the-art HEP models remains orders of magnitude below that of industry foundation models. With scaling laws only beginning to be studied in the field, we investigate neural scaling laws for boosted jet classification using the public JetClass dataset. We derive compute-optimal scaling laws and identify an effective performance limit that can be consistently approached through increased compute. We study how data repetition, common in HEP where simulation is expensive, modifies the scaling, yielding a quantifiable gain in effective dataset size. We then study how the scaling coefficients and asymptotic performance limits vary with the choice of input features and particle multiplicity, demonstrating that increased compute reliably drives performance toward an asymptotic limit, and that more expressive, lower-level features can raise the performance limit and improve results at fixed dataset size.

Related papers

Information Capacity: Evaluating the Efficiency of Large Language Models via Text Compression [53.39128997308138]
We introduce information capacity, a measure of model efficiency based on text compression performance. Empirical evaluations on mainstream open-source models show that models of varying sizes within a series exhibit consistent information capacity. A distinctive feature of information capacity is that it incorporates tokenizer efficiency, which affects both input and output token counts.
arXiv Detail & Related papers (2025-11-11T10:07:32Z)
Towards Multi-Fidelity Scaling Laws of Neural Surrogates in CFD [21.38912245186567]
Scaling laws describe how model performance grows with data, parameters and compute. We investigate this trade-off between data fidelity and cost in neural surrogates using low- and high-fidelity simulations. Our experiments reveal compute-performance scaling behavior and exhibit budget-dependent optimal fidelity mixes for the given dataset configuration.
arXiv Detail & Related papers (2025-11-03T18:37:38Z)
The Art of Scaling Reinforcement Learning Compute for LLMs [52.71086085139566]
Reinforcement learning (RL) has become central to training large language models. Despite rapidly rising compute budgets, there is no principled understanding of how to evaluate algorithmic improvements for scaling RL compute. We present the first large-scale systematic study, amounting to more than 400,000 GPU-hours.
arXiv Detail & Related papers (2025-10-15T17:43:03Z)
Compute-Optimal Scaling for Value-Based Deep RL [99.680827753493]
We investigate compute scaling for online, value-based deep RL. Our analysis reveals a nuanced interplay between model size, batch size, and UTD. We provide a mental model for understanding this phenomenon and build guidelines for choosing batch size and UTD.
arXiv Detail & Related papers (2025-08-20T17:54:21Z)
Scaling DRL for Decision Making: A Survey on Data, Network, and Training Budget Strategies [66.83950068218033]
Scaling laws demonstrate that scaling model parameters and training data enhances learning performance. Despite its potential to improve performance, the integration of scaling laws into deep reinforcement learning has not been fully realized. This review addresses this gap by systematically analyzing scaling strategies in three dimensions: data, network, and training budget.
arXiv Detail & Related papers (2025-08-05T08:03:12Z)
Scaling Laws of Motion Forecasting and Planning - Technical Report [21.486301157587132]
We study the empirical scaling laws of a family of encoder-decoder autoregressive transformer models. We observe a strong correlation between model training loss and model evaluation metrics. We briefly study the utility of training on general logged driving data of other agents to improve the performance of the ego-agent.
arXiv Detail & Related papers (2025-06-09T20:54:23Z)
Scaling Laws for Emulation of Stellar Spectra [0.0]
We provide training guidelines for scaling Transformer-based spectral emulators to achieve optimal performance. Our results suggest that optimal computational resource allocation requires balanced scaling. This study establishes a foundation for developing spectral foundational models with enhanced domain transfer capabilities.
arXiv Detail & Related papers (2025-03-24T12:20:24Z)
SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation [81.36747103102459]
Expressive human pose and shape estimation (EHPS) unifies body, hands, and face motion capture with numerous applications. Current state-of-the-art methods focus on training innovative architectural designs on confined datasets. We investigate the impact of scaling up EHPS towards a family of generalist foundation models.
arXiv Detail & Related papers (2025-01-16T18:59:46Z)
A Solvable Model of Neural Scaling Laws [72.8349503901712]
Large language models with a huge number of parameters, when trained on a near internet-sized number of tokens, have been empirically shown to obey neural scaling laws. We propose a statistical model -- a joint generative data model and random feature model -- that captures this neural scaling phenomenology. Key findings are the manner in which the power laws that occur in the statistics of natural datasets are extended by nonlinear random feature maps.
arXiv Detail & Related papers (2022-10-30T15:13:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.