Unsupervised Discovery of 3D Hierarchical Structure with Generative
Diffusion Features
- URL: http://arxiv.org/abs/2305.00067v2
- Date: Tue, 10 Oct 2023 04:01:38 GMT
- Title: Unsupervised Discovery of 3D Hierarchical Structure with Generative
Diffusion Features
- Authors: Nurislam Tursynbek, Marc Niethammer
- Abstract summary: We show that features of diffusion models capture different hierarchy levels in 3D biomedical images.
We train a predictive unsupervised segmentation network that encourages the decomposition of 3D volumes into meaningful nested subvolumes.
Our models achieve better performance than prior unsupervised structure discovery approaches on challenging synthetic datasets and on a real-world brain tumor MRI dataset.
- Score: 22.657405088126012
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inspired by recent findings that generative diffusion models learn
semantically meaningful representations, we use them to discover the intrinsic
hierarchical structure in biomedical 3D images using unsupervised segmentation.
We show that features of diffusion models from different stages of a
U-Net-based ladder-like architecture capture different hierarchy levels in 3D
biomedical images. We design three losses to train a predictive unsupervised
segmentation network that encourages the decomposition of 3D volumes into
meaningful nested subvolumes that represent a hierarchy. First, we pretrain 3D
diffusion models and use the consistency of their features across subvolumes.
Second, we use the visual consistency between subvolumes. Third, we use the
invariance to photometric augmentations as a regularizer. Our models achieve
better performance than prior unsupervised structure discovery approaches on
challenging biologically-inspired synthetic datasets and on a real-world brain
tumor MRI dataset.
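For concreteness, below is a minimal PyTorch-style sketch of how the three training signals described in the abstract could be combined. All names (`seg_net`, `diffusion_feats`, `photometric_aug`), the loss formulations, and the weights are illustrative assumptions; the abstract does not specify how subvolumes are sampled, how hierarchy levels map to features, or the exact loss forms used in the paper.

```python
# Hypothetical sketch (PyTorch) of the three training signals named in the
# abstract. Everything below is an illustrative assumption, not the paper's code.
import torch
import torch.nn.functional as F


def region_consistency(features, soft_masks, eps=1e-6):
    """Penalize within-region variance of `features` under soft assignments.

    features:   (B, C, D, H, W) volumetric features (diffusion features or intensities).
    soft_masks: (B, K, D, H, W) soft assignment of each voxel to K regions.
    """
    B, C = features.shape[:2]
    K = soft_masks.shape[1]
    f = features.reshape(B, C, -1)               # (B, C, N)
    m = soft_masks.reshape(B, K, -1)             # (B, K, N)
    w = m / (m.sum(dim=-1, keepdim=True) + eps)  # normalized weights per region
    means = torch.einsum("bkn,bcn->bkc", w, f)   # per-region feature means
    diff = f.unsqueeze(1) - means.unsqueeze(-1)  # (B, K, C, N)
    var = (m.unsqueeze(2) * diff.pow(2)).sum(dim=(2, 3))
    return (var / (m.sum(dim=-1) + eps)).mean()


def training_step(volume, seg_net, diffusion_feats, photometric_aug,
                  w_feat=1.0, w_vis=1.0, w_inv=1.0):
    """One hypothetical training step combining the three signals."""
    masks = seg_net(volume).softmax(dim=1)        # (B, K, D, H, W) soft regions

    # 1) consistency of frozen pretrained diffusion features within each region
    with torch.no_grad():
        feats = diffusion_feats(volume)           # (B, C, D, H, W)
    loss_feat = region_consistency(feats, masks)

    # 2) visual (intensity) consistency within each predicted region
    loss_vis = region_consistency(volume, masks)

    # 3) invariance of predictions to photometric augmentations
    masks_aug = seg_net(photometric_aug(volume)).softmax(dim=1)
    loss_inv = F.kl_div(masks_aug.clamp_min(1e-8).log(), masks,
                        reduction="batchmean")

    return w_feat * loss_feat + w_vis * loss_vis + w_inv * loss_inv
```

Here "consistency" is read as low within-region variance under soft assignments, which is only one plausible reading of the abstract.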
Related papers
- Large Spatial Model: End-to-end Unposed Images to Semantic 3D [79.94479633598102]
Large Spatial Model (LSM) processes unposed RGB images directly into semantic radiance fields.
LSM simultaneously estimates geometry, appearance, and semantics in a single feed-forward operation.
It can generate versatile label maps by interacting with language at novel viewpoints.
arXiv Detail & Related papers (2024-10-24T17:54:42Z)
- GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
- Large Generative Model Assisted 3D Semantic Communication [51.17527319441436]
We propose a Generative AI Model assisted 3D SC (GAM-3DSC) system.
First, we introduce a 3D Semantic Extractor (3DSE) to extract key semantics from a 3D scenario based on user requirements.
We then present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images.
Finally, we design a conditional Generative Adversarial Network and Diffusion model aided Channel Estimation (GDCE) scheme to estimate and refine the Channel State Information (CSI) of physical channels.
arXiv Detail & Related papers (2024-03-09T03:33:07Z)
- A Generative Machine Learning Model for Material Microstructure 3D Reconstruction and Performance Evaluation [4.169915659794567]
The dimensional extension from 2D to 3D is viewed as a highly challenging inverse problem from the current technological perspective.
A novel generative model that integrates the multiscale properties of U-Net with the generative capabilities of GANs has been proposed.
The model's accuracy is further improved by combining the image regularization loss with the Wasserstein distance loss.
arXiv Detail & Related papers (2024-02-24T13:42:34Z)
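The microstructure-reconstruction entry above mentions improving accuracy by combining an image regularization loss with the Wasserstein distance loss. Below is a generic sketch of one such combination, a WGAN-style generator loss plus a 3D total-variation regularizer; the choice of regularizer, the critic interface, and the weight `lambda_reg` are assumptions, not details taken from that paper.

```python
# Hypothetical sketch of a Wasserstein (critic) loss combined with an image
# regularization term, in the spirit of the summary above. Total variation is
# only one standard choice of regularizer and is an assumption here.
import torch


def total_variation_3d(vol):
    """Simple 3D total-variation regularizer on a (B, C, D, H, W) volume."""
    dz = (vol[:, :, 1:, :, :] - vol[:, :, :-1, :, :]).abs().mean()
    dy = (vol[:, :, :, 1:, :] - vol[:, :, :, :-1, :]).abs().mean()
    dx = (vol[:, :, :, :, 1:] - vol[:, :, :, :, :-1]).abs().mean()
    return dz + dy + dx


def generator_loss(fake_vol, critic, lambda_reg=0.1):
    """Wasserstein generator loss plus an image regularization term."""
    loss_w = -critic(fake_vol).mean()          # maximize critic score on fakes
    loss_reg = total_variation_3d(fake_vol)    # encourage piecewise-smooth output
    return loss_w + lambda_reg * loss_reg
```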
- Self-Supervised Representation Learning for Nerve Fiber Distribution Patterns in 3D-PLI [36.136619420474766]
3D-PLI is a microscopic imaging technique that enables insights into the fine-grained organization of myelinated nerve fibers with high resolution.
Best practices for observer-independent characterization of fiber architecture in 3D-PLI are not yet available.
We propose the application of a fully data-driven approach to characterize nerve fiber architecture in 3D-PLI images using self-supervised representation learning.
arXiv Detail & Related papers (2024-01-30T17:49:53Z)
- StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss.
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z)
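StableDreamer's first point above, the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss, can be illustrated with the standard SDS identity sketched below in DDPM notation (x = g(theta) is the rendered image, eps-hat_phi the frozen denoiser, sg a stop-gradient); the exact formalization used in the paper may differ.

```latex
% Sketch of the usual argument, not necessarily the paper's formalization.
\[
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[\, w(t)\,
      \big(\hat\epsilon_\phi(x_t; y, t) - \epsilon\big)\,
      \frac{\partial x}{\partial \theta} \right],
\qquad x_t = \alpha_t\, x + \sigma_t\, \epsilon,\quad x = g(\theta).
\]
% Using the one-step denoised estimate \hat{x}_0 = (x_t - \sigma_t\,\hat\epsilon_\phi)/\alpha_t,
% we have \hat\epsilon_\phi - \epsilon = (\alpha_t/\sigma_t)\,(x - \hat{x}_0), hence
\[
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}
  = \nabla_\theta\, \mathbb{E}_{t,\epsilon}\!\left[
      \frac{w(t)\,\alpha_t}{2\,\sigma_t}\,
      \big\| x - \mathrm{sg}(\hat{x}_0) \big\|_2^2 \right].
\]
```

In words: the SDS gradient matches the gradient of a timestep-weighted L2 loss that pulls the rendering toward its one-step denoised estimate, treated as a constant target via stop-gradient.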
- Lymphoma segmentation from 3D PET-CT images using a deep evidential network [20.65641432056608]
An automatic evidential segmentation method is proposed to segment lymphomas from 3D Positron Emission Tomography (PET) and Computed Tomography (CT) images.
The architecture is composed of a deep feature-extraction module and an evidential layer.
The proposed combination of deep feature extraction and evidential segmentation is shown to outperform the baseline UNet model.
arXiv Detail & Related papers (2022-01-31T09:34:38Z)
- Hierarchical Graph Networks for 3D Human Pose Estimation [50.600944798627786]
Recent 2D-to-3D human pose estimation works tend to utilize the graph structure formed by the topology of the human skeleton.
We argue that this skeletal topology is too sparse to reflect the body structure and suffers from a serious 2D-to-3D ambiguity problem.
We propose a novel graph convolution network architecture, Hierarchical Graph Networks, to overcome these weaknesses.
arXiv Detail & Related papers (2021-11-23T15:09:03Z)
- Learning Hyperbolic Representations for Unsupervised 3D Segmentation [3.516233423854171]
We propose learning effective representations of 3D patches for unsupervised segmentation through a variational autoencoder (VAE) with a hyperbolic latent space and a proposed gyroplane convolutional layer.
We demonstrate the effectiveness of our hyperbolic representations for unsupervised 3D segmentation on a hierarchical toy dataset, BraTS whole tumor dataset, and cryogenic electron microscopy data.
arXiv Detail & Related papers (2020-12-03T02:15:31Z)
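The hyperbolic-representation entry above relies on latent codes living in hyperbolic space. A minimal sketch of the standard Poincaré-ball exponential map at the origin, one common way to obtain such a latent, is shown below; how that paper actually parameterizes its VAE, and its proposed gyroplane convolutional layer, are not reproduced here.

```python
# Hypothetical sketch: mapping a Euclidean latent vector onto the Poincare ball
# via the exponential map at the origin (standard formula for curvature c > 0).
# The cited paper's actual hyperbolic-VAE parameterization may differ.
import torch


def expmap0(v, c=1.0, eps=1e-6):
    """Exponential map at the origin of the Poincare ball with curvature c."""
    sqrt_c = c ** 0.5
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)


# Example: push a VAE latent sample into hyperbolic space before decoding.
z_euclidean = torch.randn(8, 16)        # e.g. a reparameterized Gaussian sample
z_hyperbolic = expmap0(z_euclidean)     # lies strictly inside the unit ball
assert z_hyperbolic.norm(dim=-1).max() < 1.0
```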
- Hierarchical Amortized Training for Memory-efficient High Resolution 3D GAN [52.851990439671475]
We propose a novel end-to-end GAN architecture that can generate high-resolution 3D images.
We achieve this goal by using different configurations between training and inference.
Experiments on 3D thorax CT and brain MRI demonstrate that our approach outperforms the state of the art in image generation.
arXiv Detail & Related papers (2020-08-05T02:33:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.