Related papers: FluenceFormer: Transformer-Driven Multi-Beam Fluence Map Regression for Radiotherapy Planning

URL: http://arxiv.org/abs/2512.22425v1
Date: Sat, 27 Dec 2025 01:12:15 GMT
Title: FluenceFormer: Transformer-Driven Multi-Beam Fluence Map Regression for Radiotherapy Planning
Authors: Ujunwa Mgboh, Rafi Ibn Sultan, Joshua Kim, Kundan Thind, Dongxiao Zhu
Abstract summary: We introduce FluenceFormer, a backbone-agnostic transformer framework for direct, geometry-aware fluence regression. FluenceFormer with Swin UNETR achieves the strongest performance among the evaluated models.
Score: 4.066732323672965
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Fluence map prediction is central to automated radiotherapy planning but remains an ill-posed inverse problem due to the complex relationship between volumetric anatomy and beam-intensity modulation. Convolutional methods in prior work often struggle to capture long-range dependencies, which can lead to structurally inconsistent or physically unrealizable plans. We introduce FluenceFormer, a backbone-agnostic transformer framework for direct, geometry-aware fluence regression. The model uses a unified two-stage design: Stage 1 predicts a global dose prior from anatomical inputs, and Stage 2 conditions this prior on explicit beam geometry to regress physically calibrated fluence maps. Central to the approach is the Fluence-Aware Regression (FAR) loss, a physics-informed objective that integrates voxel-level fidelity, gradient smoothness, structural consistency, and beam-wise energy conservation. We evaluate the generality of the framework across multiple transformer backbones, including Swin UNETR, UNETR, nnFormer, and MedFormer, using a prostate IMRT dataset. FluenceFormer with Swin UNETR achieves the strongest performance among the evaluated models and improves over existing benchmark CNN and single-stage methods, reducing Energy Error to 4.5% and yielding statistically significant gains in structural fidelity (p < 0.05).
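
Illustrative sketch (not from the paper): the abstract names four components of the FAR loss but gives no formulas, so the NumPy code below is only one plausible way such a composite objective could be assembled. The function name `far_loss_sketch`, the specific form of each term (mean absolute error, finite-difference gradients, a per-beam correlation term standing in for structural consistency, relative per-beam total-fluence error), and the weights are all assumptions made for illustration.

```python
import numpy as np


def far_loss_sketch(pred, target, weights=(1.0, 0.1, 0.1, 0.1), eps=1e-8):
    """Hypothetical four-term fluence loss; NOT the paper's FAR definition.

    pred, target: arrays of shape (beams, H, W) holding fluence maps.
    weights: assumed weights for (fidelity, gradient, structure, energy).
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    w_fid, w_grad, w_struct, w_energy = weights

    # 1) Element-wise fidelity: mean absolute error.
    fidelity = np.mean(np.abs(pred - target))

    # 2) Gradient term: mismatch of finite differences along both map axes.
    grad = (
        np.mean(np.abs(np.diff(pred, axis=-1) - np.diff(target, axis=-1)))
        + np.mean(np.abs(np.diff(pred, axis=-2) - np.diff(target, axis=-2)))
    )

    # 3) Structural term: 1 - Pearson correlation per beam
    #    (an assumed stand-in for a structural-similarity measure).
    p = pred.reshape(pred.shape[0], -1)
    t = target.reshape(target.shape[0], -1)
    pc = p - p.mean(axis=1, keepdims=True)
    tc = t - t.mean(axis=1, keepdims=True)
    denom = np.sqrt((pc ** 2).sum(axis=1) * (tc ** 2).sum(axis=1)) + eps
    structure = np.mean(1.0 - (pc * tc).sum(axis=1) / denom)

    # 4) Beam-wise energy term: relative error in total fluence per beam.
    e_pred, e_true = p.sum(axis=1), t.sum(axis=1)
    energy = np.mean(np.abs(e_pred - e_true) / (np.abs(e_true) + eps))

    terms = {"fidelity": fidelity, "gradient": grad,
             "structure": structure, "energy": energy}
    total = (w_fid * fidelity + w_grad * grad
             + w_struct * structure + w_energy * energy)
    return total, terms
```

Under these assumptions, a prediction that equals the target scaled by 1.1 gives an energy term of about 0.1 while the correlation-based structure term stays near zero, which shows why a separate energy term can catch calibration errors that a purely structural term would miss.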

Related papers

Plug-and-Play Diffusion Meets ADMM: Dual-Variable Coupling for Robust Medical Image Reconstruction [45.25461515976432]
Plug-and-Play diffusion prior (DP) frameworks have emerged as a powerful paradigm for imaging reconstruction. We present a novel approach to resolving the bias-hallucination trade-off, achieving state-of-the-art gradients with significantly accelerated convergence.
arXiv Detail & Related papers (2026-02-26T16:58:43Z)
SpanNorm: Reconciling Training Stability and Performance in Deep Transformers [55.100133502295996]
We propose SpanNorm, a novel technique designed to resolve the dilemma by integrating the strengths of both paradigms. We provide a theoretical analysis demonstrating that SpanNorm, combined with a principled scaling strategy, maintains bounded signal variance throughout the network. Empirically, SpanNorm consistently outperforms standard normalization schemes in both dense and Mixture-of-Experts (MoE) scenarios.
arXiv Detail & Related papers (2026-01-30T05:21:57Z)
Simba: Towards High-Fidelity and Geometrically-Consistent Point Cloud Completion via Transformation Diffusion [31.34032485865941]
We introduce Simba, a novel framework that reformulates point-wise transformation regression as a distribution learning problem. Our approach integrates symmetry priors with the powerful generative capabilities of diffusion models, avoiding instance-specific memorization.
arXiv Detail & Related papers (2025-11-20T09:02:42Z)
Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction for Sparse-View CT [19.18392754322368]
Sparse-View CT (SVCT) reconstruction enhances temporal resolution and reduces radiation dose, yet its clinical use is hindered by artifacts. We propose a Cross-Distribution Diffusion Priors-Driven Iterative Reconstruction framework to tackle the OOD problem in SVCT.
arXiv Detail & Related papers (2025-09-16T22:35:13Z)
ResPF: Residual Poisson Flow for Efficient and Physically Consistent Sparse-View CT Reconstruction [7.644299873269135]
Sparse-view computed tomography (CT) is a practical solution to reduce radiation dose, but the resulting inverse problem poses significant challenges for accurate image reconstruction. Recent advances in generative modeling, particularly Poisson Flow Generative Models (PFGM), enable high-fidelity image synthesis. We propose Residual Poisson Flow (ResPF) Generative Models for efficient and accurate sparse-view CT reconstruction.
arXiv Detail & Related papers (2025-06-06T01:43:35Z)
Global-to-Local Modeling for Video-based 3D Human Pose and Shape Estimation [53.04781510348416]
Video-based 3D human pose and shape estimations are evaluated by intra-frame accuracy and inter-frame smoothness. We propose to structurally decouple the modeling of long-term and short-term correlations in an end-to-end framework, Global-to-Local Transformer (GLoT). Our GLoT surpasses previous state-of-the-art methods with the lowest model parameters on popular benchmarks, i.e., 3DPW, MPI-INF-3DHP, and Human3.6M.
arXiv Detail & Related papers (2023-03-26T14:57:49Z)
PyMAF-X: Towards Well-aligned Full-body Model Regression from Monocular Images [60.33197938330409]
PyMAF-X is a regression-based approach to recovering parametric full-body models from monocular images. PyMAF and PyMAF-X effectively improve the mesh-image alignment and achieve new state-of-the-art results.
arXiv Detail & Related papers (2022-07-13T17:58:33Z)
Poseur: Direct Human Pose Regression with Transformers [119.79232258661995]
We propose a direct, regression-based approach to 2D human pose estimation from single images. Our framework is end-to-end differentiable, and naturally learns to exploit the dependencies between keypoints. Ours is the first regression-based approach to perform favorably compared to the best heatmap-based pose estimation methods.
arXiv Detail & Related papers (2022-01-19T04:31:57Z)
Homography Decomposition Networks for Planar Object Tracking [11.558401177707312]
Planar object tracking plays an important role in AI applications, such as robotics, visual servoing, and visual SLAM. We propose a novel Homography Decomposition Networks (HDN) approach that drastically reduces and stabilizes the condition number by decomposing the homography transformation into two groups.
arXiv Detail & Related papers (2021-12-15T06:13:32Z)
TFPose: Direct Human Pose Estimation with Transformers [83.03424247905869]
We formulate the pose estimation task into a sequence prediction problem that can effectively be solved by transformers. Our framework is simple and direct, bypassing the drawbacks of the heatmap-based pose estimation. Experiments on the MS-COCO and MPII datasets demonstrate that our method can significantly improve the state-of-the-art of regression-based pose estimation.
arXiv Detail & Related papers (2021-03-29T04:18:54Z)
FPCR-Net: Feature Pyramidal Correlation and Residual Reconstruction for Optical Flow Estimation [72.41370576242116]
We propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs. It consists of two main modules: pyramid correlation mapping and residual reconstruction. Experimental results show that the proposed scheme achieves state-of-the-art performance, with improvements of 0.80, 1.15 and 0.10 in average end-point error (AEE) over competing baseline methods.
arXiv Detail & Related papers (2020-01-17T07:13:51Z)

This list is automatically generated from the titles and abstracts of the papers on this site.