Introduction to the Usage of Open Data from the Large Hadron Collider for Computer Scientists in the Context of Machine Learning
- URL: http://arxiv.org/abs/2501.06896v1
- Date: Sun, 12 Jan 2025 18:19:28 GMT
- Title: Introduction to the Usage of Open Data from the Large Hadron Collider for Computer Scientists in the Context of Machine Learning
- Authors: Timo Saala, Matthias Schott
- Abstract summary: We convert open data from the Large Hadron Collider to pandas DataFrames, a well-known format in computer science.
This paper aims to serve as a starting point for future interdisciplinary collaborations between computer scientists and physicists.
- Abstract: Deep learning techniques have evolved rapidly in recent years, significantly impacting various scientific fields, including experimental particle physics. To effectively leverage the latest developments in computer science for particle physics, a strengthened collaboration between computer scientists and physicists is essential. As all machine learning techniques depend on the availability and comprehensibility of extensive data, clear data descriptions and commonly used data formats are prerequisites for successful collaboration. In this study, we converted open data from the Large Hadron Collider, recorded in the ROOT data format commonly used in high-energy physics, to pandas DataFrames, a well-known format in computer science. Additionally, we provide a brief introduction to the data's content and interpretation. This paper aims to serve as a starting point for future interdisciplinary collaborations between computer scientists and physicists, fostering closer ties and facilitating efficient knowledge exchange.
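The ROOT-to-pandas conversion described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' actual conversion code: it assumes the third-party `uproot` package is installed, and the function names (`root_to_dataframe`, `select_above`) as well as any file path, tree name, or column name such as `pt` are hypothetical placeholders.

```python
import pandas as pd


def root_to_dataframe(path, tree_name, branches=None):
    """Read flat branches of a ROOT TTree into a pandas DataFrame.

    Assumption: the third-party `uproot` package is available. `path`,
    `tree_name`, and `branches` are supplied by the caller; none of them
    are taken from the paper.
    """
    import uproot  # imported lazily so the pandas-only helper works without it

    with uproot.open(path) as root_file:
        tree = root_file[tree_name]
        # library="pd" asks uproot to return a pandas DataFrame.
        return tree.arrays(branches, library="pd")


def select_above(df, column, threshold):
    """Keep only rows (events) whose `column` value exceeds `threshold`."""
    return df[df[column] > threshold].reset_index(drop=True)


if __name__ == "__main__":
    # Stand-in data with hypothetical column names, used in place of a real file.
    events = pd.DataFrame({"pt": [10.0, 25.0, 40.0], "eta": [0.1, -1.2, 2.0]})
    print(select_above(events, "pt", 20.0))
```

Once the data is in a DataFrame, ordinary pandas operations such as the boolean-mask selection in `select_above` apply directly, which is the practical benefit of the format change for readers from computer science.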
Related papers
- Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research [90.91438597133211]
We introduce WarpSci, a framework designed to overcome crucial system bottlenecks in the application of reinforcement learning.
We eliminate the need for data transfer between the CPU and GPU, enabling the concurrent execution of thousands of simulations.
arXiv Detail & Related papers (2024-08-01T21:38:09Z) - The Future of Data Science Education [0.11566458078238004]
The School of Data Science at the University of Virginia has developed a novel model for the definition of Data Science.
This paper will present the core features of the model and explain how it unifies various concepts going far beyond the analytics component of AI.
arXiv Detail & Related papers (2024-07-16T15:11:54Z) - Privacy-Preserving Graph Machine Learning from Data to Computation: A
Survey [67.7834898542701]
We focus on reviewing privacy-preserving techniques of graph machine learning.
We first review methods for generating privacy-preserving graph data.
Then we describe methods for transmitting privacy-preserved information.
arXiv Detail & Related papers (2023-07-10T04:30:23Z) - Approach to Data Science with Multiscale Information Theory [0.0]
Data Science is a multidisciplinary field that plays a crucial role in extracting valuable insights from large and intricate datasets.
Within the realm of Data Science, two fundamental components are Information Theory (IT) and Statistical Mechanics (SM).
In this paper, we apply this data science framework to a large and intricate mechanical system composed of particles.
arXiv Detail & Related papers (2023-05-23T01:08:50Z) - Modeling Information Change in Science Communication with Semantically Matched Paraphrases [50.67030449927206]
SPICED is the first paraphrase dataset of scientific findings annotated for degree of information change.
SPICED contains 6,000 scientific finding pairs extracted from news stories, social media discussions, and full texts of original papers.
Models trained on SPICED improve downstream performance on evidence retrieval for fact checking of real-world scientific claims.
arXiv Detail & Related papers (2022-10-24T07:44:38Z) - Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models.
The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z) - A Computational Inflection for Scientific Discovery [48.176406062568674]
We stand at the foot of a significant inflection in the trajectory of scientific discovery.
As society continues on its fast-paced digital transformation, so does humankind's collective scientific knowledge.
Computer science is poised to ignite a revolution in the scientific process itself.
arXiv Detail & Related papers (2022-05-04T11:36:54Z) - Shared Data and Algorithms for Deep Learning in Fundamental Physics [4.914920952758052]
We introduce a collection of datasets from fundamental physics research -- including particle physics, astroparticle physics, and hadron- and nuclear physics.
These datasets, containing hadronic top quarks, cosmic-ray induced air showers, phase transitions in hadronic matter, and generator-level histories, are made public.
We present a simple yet flexible graph-based neural network architecture that can easily be applied to a wide range of supervised learning tasks.
arXiv Detail & Related papers (2021-07-01T18:00:00Z) - Computational Skills by Stealth in Secondary School Data Science [16.960800464621993]
We discuss a proposal for the stealth development of computational skills in students' first exposure to data science.
The intent of this approach is to support students, regardless of interest and self-efficacy in coding, in becoming data-driven learners.
arXiv Detail & Related papers (2020-10-08T09:11:51Z) - A Data Scientist's Guide to Streamflow Prediction [55.22219308265945]
We focus on the element of hydrologic rainfall--runoff models and their application to forecast floods and predict streamflow.
This guide aims to help interested data scientists gain an understanding of the problem, the hydrologic concepts involved, and the details that come up along the way.
arXiv Detail & Related papers (2020-06-05T08:04:37Z) - How do Data Science Workers Collaborate? Roles, Workflows, and Tools [30.725728321928823]
We conducted an online survey with 183 participants who work in various aspects of data science.
We found that data science teams are extremely collaborative and work with a variety of stakeholders and tools.
arXiv Detail & Related papers (2020-01-18T15:11:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.