MIP Candy: A Modular PyTorch Framework for Medical Image Processing
- URL: http://arxiv.org/abs/2602.21033v1
- Date: Tue, 24 Feb 2026 15:55:04 GMT
- Title: MIP Candy: A Modular PyTorch Framework for Medical Image Processing
- Authors: Tianhao Fu, Yucheng Chen
- Abstract summary: MIPCandy is a PyTorch-based framework designed specifically for medical image processing. Central to the design is $\texttt{LayerT}$, a deferred configuration mechanism that enables runtime substitution of convolution, normalization, and activation modules. MIPCandy is open-source under the Apache-2.0 license and requires Python 3.12 or later.
- Score: 4.024630879799288
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Medical image processing demands specialized software that handles high-dimensional volumetric data, heterogeneous file formats, and domain-specific training procedures. Existing frameworks either provide low-level components that require substantial integration effort or impose rigid, monolithic pipelines that resist modification. We present MIP Candy (MIPCandy), a freely available, PyTorch-based framework designed specifically for medical image processing. MIPCandy provides a complete, modular pipeline spanning data loading, training, inference, and evaluation, allowing researchers to obtain a fully functional processing workflow by implementing a single method, $\texttt{build_network}$, while retaining fine-grained control over every component. Central to the design is $\texttt{LayerT}$, a deferred configuration mechanism that enables runtime substitution of convolution, normalization, and activation modules without subclassing. The framework further offers built-in $k$-fold cross-validation, dataset inspection with automatic region-of-interest detection, deep supervision, exponential moving average, multi-frontend experiment tracking (Weights & Biases, Notion, MLflow), training state recovery, and validation score prediction via quotient regression. An extensible bundle ecosystem provides pre-built model implementations that follow a consistent trainer--predictor pattern and integrate with the core framework without modification. MIPCandy is open-source under the Apache-2.0 license and requires Python 3.12 or later. Source code and documentation are available at https://github.com/ProjectNeura/MIPCandy.
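The deferred configuration idea behind $\texttt{LayerT}$ can be illustrated with a minimal sketch: a small spec object stores a layer class plus default keyword arguments, and instantiation is postponed until build time, so convolution, normalization, and activation modules can be swapped at runtime without subclassing. All names below (`LayerSpec`, `conv_block`) are hypothetical and illustrative; they are not the actual MIPCandy API.

```python
# Hypothetical sketch of deferred layer configuration in the spirit of
# MIPCandy's LayerT (names and API are illustrative, not the real library).
from dataclasses import dataclass, field
from typing import Any

import torch
from torch import nn


@dataclass
class LayerSpec:
    """Holds a layer class plus keyword defaults; instantiation is deferred."""
    cls: type[nn.Module]
    kwargs: dict[str, Any] = field(default_factory=dict)

    def build(self, *args: Any, **overrides: Any) -> nn.Module:
        # Call-site arguments (e.g. channel counts) are merged with the
        # stored defaults at build time, not at configuration time.
        return self.cls(*args, **{**self.kwargs, **overrides})


def conv_block(in_ch: int, out_ch: int, conv: LayerSpec,
               norm: LayerSpec, act: LayerSpec) -> nn.Sequential:
    """A block whose conv/norm/activation are swappable without subclassing."""
    return nn.Sequential(
        conv.build(in_ch, out_ch, kernel_size=3, padding=1),
        norm.build(out_ch),
        act.build(),
    )


# 2D and 3D variants of the same block, obtained purely by re-configuration.
block2d = conv_block(1, 16, LayerSpec(nn.Conv2d), LayerSpec(nn.BatchNorm2d),
                     LayerSpec(nn.ReLU, {"inplace": True}))
block3d = conv_block(1, 16, LayerSpec(nn.Conv3d), LayerSpec(nn.InstanceNorm3d),
                     LayerSpec(nn.LeakyReLU, {"negative_slope": 0.01}))
```

The design choice this illustrates is that the block itself never hard-codes a dimensionality or normalization scheme, which is what makes a single `build_network`-style entry point sufficient for both 2D and volumetric pipelines.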
Related papers
- TokaMind: A Multi-Modal Transformer Foundation Model for Tokamak Plasma Dynamics [56.073642366268764]
TokaMind is an open-source foundation model framework for fusion plasma modeling. It is trained on heterogeneous tokamak diagnostics from the publicly available MAST dataset. We evaluate TokaMind on the recently introduced MAST benchmark TokaMark.
arXiv Detail & Related papers (2026-02-16T12:26:07Z) - VibeTensor: System Software for Deep Learning, Fully Generated by AI Agents [42.56489784841984]
"Fully generated" refers to code provenance: implementation changes were produced and applied as agent-proposed diffs. We describe the architecture, summarize the workflow used to produce and validate the system, and evaluate the artifact.
arXiv Detail & Related papers (2026-01-21T19:29:00Z) - Plug-and-Play Benchmarking of Reinforcement Learning Algorithms for Large-Scale Flow Control [61.155940786140455]
Reinforcement learning (RL) has shown promising results in active flow control (AFC). Current AFC benchmarks rely on external computational fluid dynamics (CFD) solvers, are not fully differentiable, and provide limited 3D and multi-agent support. We introduce FluidGym, the first standalone, fully differentiable benchmark suite for RL in AFC.
arXiv Detail & Related papers (2026-01-21T14:13:44Z) - Stroke Lesion Segmentation in Clinical Workflows: A Modular, Lightweight, and Deployment-Ready Tool [0.08699280339422537]
Deep learning frameworks such as nnU-Net achieve state-of-the-art performance in brain lesion segmentation but remain difficult to deploy clinically. We introduce StrokeSeg, a modular and lightweight framework that translates research-grade stroke lesion segmentation models into deployable applications.
arXiv Detail & Related papers (2025-10-28T12:56:48Z) - MMORE: Massive Multimodal Open RAG & Extraction [35.45122798365231]
MMORE is a pipeline to ingest, transform, and retrieve knowledge from heterogeneous document formats at scale. MMORE supports more than fifteen file types, including text, tables, images, emails, audio, and video, and processes them into a unified format. On processing benchmarks, MMORE demonstrates a 3.8-fold speedup over single-node baselines and 40% higher accuracy than Docling on scanned PDFs.
arXiv Detail & Related papers (2025-09-15T13:56:06Z) - pyFAST: A Modular PyTorch Framework for Time Series Modeling with Multi-source and Sparse Data [10.949140998070732]
pyFAST is a research-oriented PyTorch framework for time series analysis. Its data engine is engineered for complex scenarios, supporting multi-source loading, protein sequence handling, efficient sequence- and patch-level padding, dynamic normalization, and mask-based modeling. Released under the MIT license on GitHub, pyFAST provides a compact yet powerful platform for advancing time series research and applications.
arXiv Detail & Related papers (2025-08-26T10:05:47Z) - pyvene: A Library for Understanding and Improving PyTorch Models via Interventions [79.72930339711478]
$\textbf{pyvene}$ is an open-source library that supports customizable interventions on a range of different PyTorch modules.
We show how $\textbf{pyvene}$ provides a unified framework for performing interventions on neural models and sharing the intervened-upon models with others.
arXiv Detail & Related papers (2024-03-12T16:46:54Z) - CMFDFormer: Transformer-based Copy-Move Forgery Detection with Continual Learning [52.72888626663642]
Copy-move forgery detection aims at detecting duplicated regions in a suspected forged image.
Deep learning based copy-move forgery detection methods are in the ascendant.
We propose a Transformer-style copy-move forgery network named CMFDFormer.
We also provide a novel PCSD continual learning framework to help CMFDFormer handle new tasks.
arXiv Detail & Related papers (2023-11-22T09:27:46Z) - MatFormer: Nested Transformer for Elastic Inference [91.45687988953435]
MatFormer is a novel Transformer architecture designed to provide elastic inference across diverse deployment constraints. MatFormer achieves this by incorporating a nested Feed Forward Network (FFN) block structure within a standard Transformer model. We show that an 850M decoder-only MatFormer language model (MatLM) allows us to extract multiple smaller models spanning from 582M to 850M parameters.
arXiv Detail & Related papers (2023-10-11T17:57:14Z) - BatchFormerV2: Exploring Sample Relationships for Dense Representation Learning [88.82371069668147]
BatchFormerV2 is a more general batch Transformer module, which enables exploring sample relationships for dense representation learning.
BatchFormerV2 consistently improves current DETR-based detection methods by over 1.3%.
arXiv Detail & Related papers (2022-04-04T05:53:42Z) - Scanflow: A multi-graph framework for Machine Learning workflow management, supervision, and debugging [0.0]
We propose a novel containerized directed graph framework to support end-to-end Machine Learning workflow management.
The framework allows defining and deploying ML in containers, tracking their metadata, checking their behavior in production, and improving the models by using both learned and human-provided knowledge.
arXiv Detail & Related papers (2021-11-04T17:01:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.