TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models
- URL: http://arxiv.org/abs/2602.10132v2
- Date: Thu, 12 Feb 2026 09:55:27 GMT
- Title: TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models
- Authors: Cécile Rousseau, Samuel Jackson, Rodrigo H. Ordonez-Hurtado, Nicola C. Amorisco, Tobia Boschi, George K. Holt, Andrea Loreti, Eszter Székely, Alexander Whittle, Adriano Agnello, Stanislas Pamela, Alessandra Pascale, Robert Akers, Juan Bernabe Moreno, Sue Thorne, Mykhaylo Zayats,
- Abstract summary: TokaMark is a structured benchmark to evaluate AI models on real experimental data collected from the Mega Ampere Spherical Tokamak (MAST)<n>TokaMark aims to accelerate progress in data-driven AI-based plasma modeling, contributing to the broader goal of achieving sustainable and stable fusion energy.
- Score: 56.94569090844015
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Development and operation of commercially viable fusion energy reactors such as tokamaks require accurate predictions of plasma dynamics from sparse, noisy, and incomplete sensors readings. The complexity of the underlying physics and the heterogeneity of experimental data pose formidable challenges for conventional numerical methods, while simultaneously highlight the promise of modern data-native AI approaches. A major obstacle in realizing this potential is, however, the lack of curated, openly available datasets and standardized benchmarks. Existing fusion datasets are scarce, fragmented across institutions, facility-specific, and inconsistently annotated, which limits reproducibility and prevents a fair and scalable comparison of AI approaches. In this paper, we introduce TokaMark, a structured benchmark to evaluate AI models on real experimental data collected from the Mega Ampere Spherical Tokamak (MAST). TokaMark provides a comprehensive suite of tools designed to (i) unify access to multi-modal heterogeneous fusion data, and (ii) harmonize formats, metadata, temporal alignment and evaluation protocols to enable consistent cross-model and cross-task comparisons. The benchmark includes a curated list of 14 tasks spanning a range of physical mechanisms, exploiting a variety of diagnostics and covering multiple operational use cases. A baseline model is provided to facilitate transparent comparison and validation within a unified framework. By establishing a unified benchmark for both the fusion and AI-for-science communities, TokaMark aims to accelerate progress in data-driven AI-based plasma modeling, contributing to the broader goal of achieving sustainable and stable fusion energy. The benchmark, documentation, and tooling will be fully open sourced upon acceptance to encourage community adoption and contribution.
Related papers
- Beyond Raw Detection Scores: Markov-Informed Calibration for Boosting Machine-Generated Text Detection [105.14032334647932]
Machine-generated texts (MGTs) pose risks such as disinformation and phishing, highlighting the need for reliable detection.<n> Metric-based methods, which extract statistically distinguishable features of MGTs, are often more practical than complex model-based methods that are prone to overfitting.<n>We propose a Markov-informed score calibration strategy that models two relationships of context detection scores that may aid calibration.
arXiv Detail & Related papers (2026-02-08T16:06:12Z) - SERM: Self-Evolving Relevance Model with Agent-Driven Learning from Massive Query Streams [53.78257200138774]
We propose a Self-Evolving Relevance Model approach (SERM), which comprises two complementary multi-agent modules.<n>We evaluate SERM in a large-scale industrial setting, which serves billions of user requests daily.
arXiv Detail & Related papers (2026-01-14T14:31:16Z) - The Data Fusion Labeler (dFL): Challenges and Solutions to Data Harmonization, Labeling, and Provenance in Fusion Energy [3.476400092654536]
Fusion energy research increasingly depends on the ability to integrate heterogeneous, multimodal datasets.<n>The Data Fusion Labeler (dFL) performs uncertainty-aware data harmonization, schema-compliant data fusion, and provenance-rich manual and automated labeling at scale.
arXiv Detail & Related papers (2025-11-12T20:33:40Z) - RoHOI: Robustness Benchmark for Human-Object Interaction Detection [84.78366452133514]
Human-Object Interaction (HOI) detection is crucial for robot-human assistance, enabling context-aware support.<n>We introduce the first benchmark for HOI detection, evaluating model resilience under diverse challenges.<n>Our benchmark, RoHOI, includes 20 corruption types based on the HICO-DET and V-COCO datasets and a new robustness-focused metric.
arXiv Detail & Related papers (2025-07-12T01:58:04Z) - MoCA: Multi-modal Cross-masked Autoencoder for Digital Health Measurements [2.8493802389913694]
We propose the Multi-modal Cross-masked Autoencoder (MoCA), a self-supervised learning framework that combines transformer architecture with masked autoencoder (MAE) methodology.<n>MoCA demonstrates strong performance boosts across reconstruction and downstream classification tasks on diverse benchmark datasets.<n>Our approach offers a novel solution for leveraging unlabeled multi-modal wearable data while handling missing modalities, with broad applications across digital health domains.
arXiv Detail & Related papers (2025-06-02T21:07:25Z) - EVA-MILP: Towards Standardized Evaluation of MILP Instance Generation [13.49043811341421]
Mixed-Integer Linear Programming (MILP) is fundamental to solving complex decision-making problems.<n>The proliferation of MILP instance generation methods, driven by machine learning's demand for diverse datasets, has significantly outpaced standardized evaluation techniques.<n>This paper introduces a comprehensive benchmark framework designed for the systematic and objective evaluation of MILP instance generation methods.
arXiv Detail & Related papers (2025-05-30T16:42:15Z) - Single Domain Generalization with Model-aware Parametric Batch-wise Mixup [22.709796153794507]
Single Domain Generalization remains a formidable challenge in the field of machine learning.<n>We propose a novel data augmentation approach, named as Model-aware Parametric Batch-wise Mixup.<n>By exploiting inter-feature correlations, the parameterized mixup generator introduces additional versatility in combining features across a batch of instances.
arXiv Detail & Related papers (2025-02-22T03:45:18Z) - Energy-based Automated Model Evaluation [19.90797626200033]
We propose a novel measure -- Meta-Distribution Energy (MDE) -- that allows the AutoEval framework to be both more efficient and effective.
MDE establishes a meta-distribution statistic, on the information (energy) associated with individual samples, then offer a smoother representation enabled by energy-based learning.
We provide extensive experiments across modalities, datasets and different architectural backbones to validate MDE's validity.
arXiv Detail & Related papers (2024-01-23T11:54:09Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal
Sentiment Analysis [96.46952672172021]
Bi-Bimodal Fusion Network (BBFN) is a novel end-to-end network that performs fusion on pairwise modality representations.
Model takes two bimodal pairs as input due to known information imbalance among modalities.
arXiv Detail & Related papers (2021-07-28T23:33:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.