Related papers: Diffusion Model-Based Data Augmentation for Enhanced Neuron Segmentation

Diffusion Model-Based Data Augmentation for Enhanced Neuron Segmentation

URL: http://arxiv.org/abs/2601.15779v1
Date: Thu, 22 Jan 2026 09:12:05 GMT
Title: Diffusion Model-Based Data Augmentation for Enhanced Neuron Segmentation
Authors: Liuyun Jiang, Yanchao Zhang, Jinyue Guo, Yizhuo Lu, Ruining Zhou, Hua Han,
Abstract summary: Current deep learning-based methods are limited by their reliance on large-scale training data and extensive, time-consuming manual annotations.<n>We propose a diffusion-based data augmentation framework capable of generating diverse and structurally plausible image-label pairs.<n>Our method improves the ARAND metric by 32.1% and 30.7%, respectively, when combined with two different post-processing methods.
Score: 7.83854380301501
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neuron segmentation in electron microscopy (EM) aims to reconstruct the complete neuronal connectome; however, current deep learning-based methods are limited by their reliance on large-scale training data and extensive, time-consuming manual annotations. Traditional methods augment the training set through geometric and photometric transformations; however, the generated samples remain highly correlated with the original images and lack structural diversity. To address this limitation, we propose a diffusion-based data augmentation framework capable of generating diverse and structurally plausible image-label pairs for neuron segmentation. Specifically, the framework employs a resolution-aware conditional diffusion model with multi-scale conditioning and EM resolution priors to enable voxel-level image synthesis from 3D masks. It further incorporates a biology-guided mask remodeling module that produces augmented masks with enhanced structural realism. Together, these components effectively enrich the training set and improve segmentation performance. On the AC3 and AC4 datasets under low-annotation regimes, our method improves the ARAND metric by 32.1% and 30.7%, respectively, when combined with two different post-processing methods. Our code is available at https://github.com/HeadLiuYun/NeuroDiff.

Related papers

Advanced Geometric Correction Algorithms for 3D Medical Reconstruction: Comparison of Computed Tomography and Macroscopic Imaging [0.9395222766576343]
This paper introduces a hybrid two-stage registration framework for reconstructing 3D kidney anatomy from macroscopic slices.<n>It addresses the data-scarcity and high-distortion challenges typical of macroscopic imaging.<n>The proposed framework generalizes to other soft-tissue organs reconstructed from optical or photographic cross-sections.
arXiv Detail & Related papers (2026-01-30T17:16:17Z)
Multimodal Visual Surrogate Compression for Alzheimer's Disease Classification [69.87877580725768]
Multimodal Visual Surrogate Compression (MVSC) learns to compress and adapt large 3D sMRI volumes into compact 2D features.<n>MVSC has two key components: a Volume Context that captures global cross-slice context under textual guidance, and an Adaptive Slice Fusion module that aggregates slice-level information in a text-enhanced, patch-wise manner.
arXiv Detail & Related papers (2026-01-29T13:05:46Z)
A New One-Shot Federated Learning Framework for Medical Imaging Classification with Feature-Guided Rectified Flow and Knowledge Distillation [13.353672721534627]
One-Shot Federated Learning (OSFL) has attracted increasing attention due to its low communication overhead.<n>Existing generative model-based OSFL methods suffer from low training efficiency and potential privacy leakage in the healthcare domain.<n>In this paper a modified OSFL framework is proposed, in which a new Feature-Guided Rectified Flow Model (FG-RF) and Dual-Layer Knowledge Distillation (DLKD) aggregation method are developed.
arXiv Detail & Related papers (2025-07-25T08:05:47Z)
Improving Progressive Generation with Decomposable Flow Matching [50.63174319509629]
Decomposable Flow Matching (DFM) is a simple and effective framework for the progressive generation of visual media.<n>On Imagenet-1k 512px, DFM achieves 35.2% improvements in FDD scores over the base architecture and 26.4% over the best-performing baseline.
arXiv Detail & Related papers (2025-06-24T17:58:02Z)
ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation [49.42525661521625]
This paper presents ShapeMamba-EM, a specialized fine-tuning method for 3D EM segmentation. It is tested over a wide range of EM images, covering five segmentation tasks and 10 datasets.
arXiv Detail & Related papers (2024-08-26T08:59:22Z)
Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes [2.8498944632323755]
We propose an end-to-end hybrid architecture for medical image segmentation. We use Hamiltonian Variational Autoencoders (HVAE) and a discriminative regularization to improve the quality of generated images. Our architecture operates on a slice-by-slice basis to segment 3D volumes, capitilizing on the richly augmented dataset.
arXiv Detail & Related papers (2024-06-17T15:42:08Z)
3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes [2.8498944632323755]
We propose a novel slice-based latent diffusion architecture to address the complexities of volumetric data generation. This approach extends the joint distribution modeling of medical images and their associated masks, allowing a simultaneous generation of both under data-scarce regimes. Our architecture can be conditioned by tumor characteristics, including size, shape, and relative position, thereby providing a diverse range of tumor variations.
arXiv Detail & Related papers (2024-06-08T09:53:45Z)
NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation [55.51412454263856]
This paper proposes to directly modulate the generation process of diffusion models using fMRI signals. By training with about 67,000 fMRI-image pairs from various individuals, our model enjoys superior fMRI-to-image decoding capacity.
arXiv Detail & Related papers (2024-03-27T02:42:52Z)
HybridMIM: A Hybrid Masked Image Modeling Framework for 3D Medical Image Segmentation [29.15746532186427]
HybridMIM is a novel hybrid self-supervised learning method based on masked image modeling for 3D medical image segmentation. We learn the semantic information of medical images at three levels, including:1) partial region prediction to reconstruct key contents of the 3D image, which largely reduces the pre-training time burden. The proposed framework is versatile to support both CNN and transformer as encoder backbones, and also enables to pre-train decoders for image segmentation.
arXiv Detail & Related papers (2023-03-18T04:43:12Z)
Automatic size and pose homogenization with spatial transformer network to improve and accelerate pediatric segmentation [51.916106055115755]
We propose a new CNN architecture that is pose and scale invariant thanks to the use of Spatial Transformer Network (STN) Our architecture is composed of three sequential modules that are estimated together during training. We test the proposed method in kidney and renal tumor segmentation on abdominal pediatric CT scanners.
arXiv Detail & Related papers (2021-07-06T14:50:03Z)
Learning Deformable Image Registration from Optimization: Perspective, Modules, Bilevel Training and Beyond [62.730497582218284]
We develop a new deep learning based framework to optimize a diffeomorphic model via multi-scale propagation. We conduct two groups of image registration experiments on 3D volume datasets including image-to-atlas registration on brain MRI data and image-to-image registration on liver CT data.
arXiv Detail & Related papers (2020-04-30T03:23:45Z)
Brain segmentation based on multi-atlas guided 3D fully convolutional network ensembles [1.52292571922932]
We propose and validated a multi-atlas guided 3D fully convolutional network (FCN) ensemble model (M-FCN) for segmenting brain regions of interest (ROIs) from structural magnetic resonance images (MRIs) We trained a 3D FCN model for each ROI using patches of adaptive size and embedded outputs of the convolutional layers in the deconvolutional layers to further capture the local and global context patterns. Our results suggested that the proposed method had a superior segmentation performance.
arXiv Detail & Related papers (2019-01-05T08:23:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.