Leveraging universality of jet taggers through transfer learning
- URL: http://arxiv.org/abs/2203.06210v1
- Date: Fri, 11 Mar 2022 19:05:26 GMT
- Title: Leveraging universality of jet taggers through transfer learning
- Authors: Frédéric A. Dreyer, Radosław Grabarczyk and Pier Francesco Monni
- Abstract summary: In this article, we explore the use of transfer learning techniques to develop fast and data-efficient jet taggers.
We find that one can obtain reliable taggers using an order of magnitude less data with a corresponding speed-up of the training process.
This offers a promising avenue to facilitate the use of such tools in collider physics experiments.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A significant challenge in the tagging of boosted objects via
machine-learning technology is the prohibitive computational cost associated
with training sophisticated models. Nevertheless, the universality of QCD
suggests that a large amount of the information learnt in the training is
common to different physical signals and experimental setups. In this article,
we explore the use of transfer learning techniques to develop fast and
data-efficient jet taggers that leverage such universality. We consider the
graph neural networks LundNet and ParticleNet, and introduce two prescriptions
to transfer an existing tagger into a new signal based either on fine-tuning
all the weights of a model or alternatively on freezing a fraction of them. In
the case of $W$-boson and top-quark tagging, we find that one can obtain
reliable taggers using an order of magnitude less data with a corresponding
speed-up of the training process. Moreover, while keeping the size of the
training data set fixed, we observe a speed-up of the training by up to a
factor of three. This offers a promising avenue to facilitate the use of such
tools in collider physics experiments.
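To make the two prescriptions concrete, below is a minimal PyTorch sketch of adapting a pre-trained tagger to a new signal by either fine-tuning all weights or freezing a fraction of them. The backbone, checkpoint file name, block split and learning rates are hypothetical placeholders, not the LundNet or ParticleNet graph networks used in the paper.

```python
# Minimal sketch of the two transfer-learning prescriptions: (a) fine-tune all
# weights of a pre-trained tagger, or (b) freeze a fraction of them and retrain
# only the remaining layers. The backbone below is a hypothetical stand-in,
# NOT the LundNet/ParticleNet architectures used in the paper.
import torch
import torch.nn as nn

# Hypothetical placeholder backbone: a stack of blocks ending in a classifier head.
backbone = nn.Sequential(
    nn.Sequential(nn.Linear(64, 128), nn.ReLU()),   # block 1 (low-level, QCD-universal features)
    nn.Sequential(nn.Linear(128, 128), nn.ReLU()),  # block 2
    nn.Sequential(nn.Linear(128, 128), nn.ReLU()),  # block 3
    nn.Linear(128, 2),                               # classifier head (signal vs. background)
)
# In practice one would load weights trained on the original signal, e.g.:
# backbone.load_state_dict(torch.load("pretrained_tagger.pt"))

def fine_tune_all(model, lr=1e-4):
    """Prescription (a): keep every weight trainable, typically with a reduced learning rate."""
    for p in model.parameters():
        p.requires_grad = True
    return torch.optim.Adam(model.parameters(), lr=lr)

def freeze_fraction(model, n_frozen_blocks=2, lr=1e-3):
    """Prescription (b): freeze the first n_frozen_blocks and retrain only the rest."""
    for block in list(model.children())[:n_frozen_blocks]:
        for p in block.parameters():
            p.requires_grad = False
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr)

# Example: adapt the pre-trained tagger to a new signal (e.g. top-quark tagging).
optimizer = freeze_fraction(backbone, n_frozen_blocks=2)
x = torch.randn(32, 64)              # stand-in batch of jet features
labels = torch.randint(0, 2, (32,))  # stand-in signal/background labels
loss = nn.functional.cross_entropy(backbone(x), labels)
loss.backward()
optimizer.step()
```

Freezing the earliest blocks reflects the universality argument in the abstract: the low-level QCD radiation patterns learnt on the original signal are expected to carry over, so only the later, signal-specific layers need retraining on the smaller data set.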
Related papers
- LOTUS: Improving Transformer Efficiency with Sparsity Pruning and Data Lottery Tickets [0.0]
Vision transformers have revolutionized computer vision, but their computational demands present challenges for training and deployment.
This paper introduces LOTUS, a novel method that leverages data lottery ticket selection and sparsity pruning to accelerate vision transformer training while maintaining accuracy.
arXiv Detail & Related papers (2024-05-01T23:30:12Z)
- Efficient Asynchronous Federated Learning with Sparsification and Quantization [55.6801207905772]
Federated Learning (FL) is attracting increasing attention as a way to collaboratively train a machine learning model without transferring raw data.
FL typically relies on a parameter server and a large number of edge devices throughout model training.
We propose TEASQ-Fed, which lets edge devices participate asynchronously in training by actively applying for tasks.
arXiv Detail & Related papers (2023-12-23T07:47:07Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone on various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- Fourier neural operator for learning solutions to macroscopic traffic flow models: Application to the forward and inverse problems [7.429546479314462]
We study a neural operator framework for learning solutions to nonlinear hyperbolic partial differential equations.
An operator is trained to map heterogeneous and sparse traffic input data to the complete macroscopic traffic state.
We find superior accuracy in predicting the density dynamics of a ring-road network and an urban signalized road.
arXiv Detail & Related papers (2023-08-14T10:22:51Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation; a minimal sketch of this idea is given after this list.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on the performance of FL has not yet been explored.
In this work, we take the first step to consider the online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z)
- Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains a good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z)
- Physics-enhanced deep surrogates for partial differential equations [30.731686639510517]
We present a "physics-enhanced deep-surrogate" ("PEDS") approach towards developing fast surrogate models for complex physical systems.
Specifically, a low-fidelity, explainable physics simulator is combined with a neural network generator, and the combination is trained end-to-end to globally match the output of an expensive high-fidelity numerical solver.
arXiv Detail & Related papers (2021-11-10T18:43:18Z)
- ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training [65.68511423300812]
We propose ProgFed, a progressive training framework for efficient and effective federated learning.
ProgFed inherently reduces computation and two-way communication costs while maintaining the strong performance of the final models.
Our results show that ProgFed converges at the same rate as standard training on full models.
arXiv Detail & Related papers (2021-10-11T14:45:00Z)
- Fast-Convergent Federated Learning [82.32029953209542]
Federated learning is a promising solution for distributing machine learning tasks through modern networks of mobile devices.
We propose a fast-convergent federated learning algorithm, called FOLB, which performs intelligent sampling of devices in each round of model training.
arXiv Detail & Related papers (2020-07-26T14:37:51Z)
- Ternary Compression for Communication-Efficient Federated Learning [17.97683428517896]
Federated learning provides a potential solution to privacy-preserving and secure machine learning.
We propose a ternary federated averaging protocol (T-FedAvg) to reduce the upstream and downstream communication of federated learning systems.
Our results show that the proposed T-FedAvg is effective in reducing communication costs and can even achieve slightly better performance on non-IID data.
arXiv Detail & Related papers (2020-03-07T11:55:34Z)
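As referenced in the entry on large-scale spatial problems above, here is a minimal sketch of the train-on-small-windows idea: a fully convolutional network has no fixed-size layers, so weights fitted on small windows can be applied to an arbitrarily large signal. The toy architecture and random data are assumptions for illustration, not the paper's code.

```python
# Illustrative sketch: train a fully convolutional network on small windows,
# then evaluate it on a much larger signal without retraining.
import torch
import torch.nn as nn

# Toy fully convolutional model: input and output spatial sizes are not fixed.
fcn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=3, padding=1),
)

# One training step on small 32x32 windows (random toy data for illustration).
windows = torch.randn(8, 1, 32, 32)
targets = torch.randn(8, 1, 32, 32)
optimizer = torch.optim.Adam(fcn.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(fcn(windows), targets)
loss.backward()
optimizer.step()

# Evaluation on a much larger signal: the same weights apply without retraining.
large_signal = torch.randn(1, 1, 256, 256)
with torch.no_grad():
    prediction = fcn(large_signal)  # shape (1, 1, 256, 256)
```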
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.