Related papers: Interpolation-Driven Machine Learning Approaches for Plume Shine Dose Estimation: A Comparison of XGBoost, Random Forest, and TabNet

Interpolation-Driven Machine Learning Approaches for Plume Shine Dose Estimation: A Comparison of XGBoost, Random Forest, and TabNet

URL: http://arxiv.org/abs/2602.19584v1
Date: Mon, 23 Feb 2026 08:12:49 GMT
Title: Interpolation-Driven Machine Learning Approaches for Plume Shine Dose Estimation: A Comparison of XGBoost, Random Forest, and TabNet
Authors: Biswajit Sadhu, Kalpak Gupte, Trijit Sadhu, S. Anand,
Abstract summary: An assisted machine learning framework was developed for plume shine dose estimation.<n>The framework was developed using discrete dose datasets generated with the pyEIADOS suite for 17 gamma-emitting radionuclides.<n>Interpretability analysis using permutation importance and attention-based feature attribution revealed that performance differences stem from how the models utilize input features.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the success of machine learning (ML) in surrogate modeling, its use in radiation dose assessment is limited by safety-critical constraints, scarce training-ready data, and challenges in selecting suitable architectures for physics-dominated systems. Within this context, rapid and accurate plume shine dose estimation serves as a practical test case, as it is critical for nuclear facility safety assessment and radiological emergency response, while conventional photon-transport-based calculations remain computationally expensive. In this work, an interpolation-assisted ML framework was developed using discrete dose datasets generated with the pyDOSEIA suite for 17 gamma-emitting radionuclides across varying downwind distances, release heights, and atmospheric stability categories. The datasets were augmented using shape-preserving interpolation to construct dense, high-resolution training data. Two tree-based ML models (Random Forest and XGBoost) and one deep learning (DL) model (TabNet) were evaluated to examine predictive performance and sensitivity to dataset resolution. All models showed higher prediction accuracy with the interpolated high-resolution dataset than with the discrete data; however, XGBoost consistently achieved the highest accuracy. Interpretability analysis using permutation importance (tree-based models) and attention-based feature attribution (TabNet) revealed that performance differences stem from how the models utilize input features. Tree-based models focus mainly on dominant geometry-dispersion features (release height, stability category, and downwind distance), treating radionuclide identity as a secondary input, whereas TabNet distributes attention more broadly across multiple variables. For practical deployment, a web-based GUI was developed for interactive scenario evaluation and transparent comparison with photon-transport reference calculations.

Related papers

TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models [56.94569090844015]
TokaMark is a structured benchmark to evaluate AI models on real experimental data collected from the Mega Ampere Spherical Tokamak (MAST)<n>TokaMark aims to accelerate progress in data-driven AI-based plasma modeling, contributing to the broader goal of achieving sustainable and stable fusion energy.
arXiv Detail & Related papers (2026-02-05T16:49:44Z)
Evaluating Ensemble and Deep Learning Models for Static Malware Detection with Dimensionality Reduction Using the EMBER Dataset [0.0]
This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset.<n>We evaluate eight classification models: LightGBM, XGBoost, CatBoost, Random Forest, Extra Trees, HistGradientBoosting, k-Nearest Neighbors (KNN), and TabNet.<n>The models are assessed on accuracy, precision, recall, F1 score, and AUC to examine both predictive performance and robustness.
arXiv Detail & Related papers (2025-07-22T18:45:10Z)
NeuralSurv: Deep Survival Analysis with Bayesian Uncertainty Quantification [42.418429168532406]
We introduce NeuralSurv, the first deep survival model to incorporate Bayesian uncertainty quantification.<n>For efficient posterior inference, we introduce a mean-field variational algorithm with coordinate-ascent updates that scale linearly in model size.<n>In experiments, NeuralSurv delivers superior calibration compared to state-of-the-art deep survival models.
arXiv Detail & Related papers (2025-05-16T09:53:21Z)
Context Matters: Query-aware Dynamic Long Sequence Modeling of Gigapixel Images [4.3565203412433195]
Whole slide image (WSI) analysis presents significant computational challenges due to the massive number of patches in gigapixel images.<n>We propose Querent, i.e., the query-aware long contextual dynamic modeling framework.<n>Our approach dramatically reduces computational overhead while preserving global perception to model fine-grained patch correlations.
arXiv Detail & Related papers (2025-01-31T09:29:21Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
A Causal Graph-Enhanced Gaussian Process Regression for Modeling Engine-out NOx [0.0]
We develop and validate a probabilistic model to predict engine-out NOx emissions using Gaussian process regression.<n>We conclude that our model provides an improvement in predictive performance when using an input window and a deep kernel structure.
arXiv Detail & Related papers (2024-10-24T04:23:57Z)
OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries. OPUS incorporates a suite of non-trivial strategies to enhance model performance. Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z)
From Data to Insights: A Covariate Analysis of the IARPA BRIAR Dataset for Multimodal Biometric Recognition Algorithms at Altitude and Range [13.42292577384284]
This paper focuses on fused whole body biometrics performance in the IARPA BRIAR dataset, specifically focusing on UAV platforms, elevated positions, and up to 1000 meters. The dataset includes outdoor videos compared with indoor images and controlled gait recordings.
arXiv Detail & Related papers (2024-09-03T00:58:50Z)
Peaking into the Black-box: Prediction Intervals Give Insight into Data-driven Quadrotor Model Reliability [0.6215404942415159]
Prediction intervals (PIs) may be employed to provide insight into consistency and accuracy of model predictions. This paper estimates such PIs for bootstrap and Artificial Neural Network (ANN) quadrotor aerodynamic models. It is found that the ANN-based PIs widen considerably when extrapolating and remain constant, or shrink, when interpolating.
arXiv Detail & Related papers (2024-08-12T09:57:00Z)
Machine Learning for ALSFRS-R Score Prediction: Making Sense of the Sensor Data [44.99833362998488]
Amyotrophic Lateral Sclerosis (ALS) is a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options. The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app.
arXiv Detail & Related papers (2024-07-10T19:17:23Z)
DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection. DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z)
DepthFormer: Exploiting Long-Range Correlation and Local Information for Accurate Monocular Depth Estimation [50.08080424613603]
Long-range correlation is essential for accurate monocular depth estimation. We propose to leverage the Transformer to model this global context with an effective attention mechanism. Our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins.
arXiv Detail & Related papers (2022-03-27T05:03:56Z)
Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC) We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer. Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.