Related papers: Enriching Information and Preserving Semantic Consistency in Expanding Curvilinear Object Segmentation Datasets

URL: http://arxiv.org/abs/2407.08209v1
Date: Thu, 11 Jul 2024 06:25:26 GMT
Title: Enriching Information and Preserving Semantic Consistency in Expanding Curvilinear Object Segmentation Datasets
Authors: Qin Lei, Jiang Zhong, Qizhu Dai
Abstract summary: Curvilinear object segmentation plays a crucial role across various applications, yet datasets in this domain often suffer from small scale. This paper introduces a novel approach for expanding curvilinear object segmentation datasets. Our method enriches synthetic data informativeness by generating curvilinear objects through their multiple textual features.
Score: 1.0104586293349587
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Curvilinear object segmentation plays a crucial role across various applications, yet datasets in this domain often suffer from small scale due to the high costs associated with data acquisition and annotation. To address these challenges, this paper introduces a novel approach for expanding curvilinear object segmentation datasets, focusing on enhancing the informativeness of generated data and the consistency between semantic maps and generated images. Our method enriches synthetic data informativeness by generating curvilinear objects through their multiple textual features. By combining textual features from each sample in the original dataset, we obtain synthetic images that go beyond the original dataset's distribution. This initiative necessitated the creation of the Curvilinear Object Segmentation based on Text Generation (COSTG) dataset. Designed to surpass the limitations of conventional datasets, COSTG incorporates not only standard semantic maps but also some textual descriptions of curvilinear object features. To ensure consistency between synthetic semantic maps and images, we introduce the Semantic Consistency Preserving ControlNet (SCP ControlNet). This involves an adaptation of ControlNet with Spatially-Adaptive Normalization (SPADE), allowing it to preserve semantic information that would typically be washed away in normalization layers. This modification facilitates more accurate semantic image synthesis. Experimental results demonstrate the efficacy of our approach across three types of curvilinear objects (angiography, crack and retina) and six public datasets (CHUAC, XCAD, DCA1, DRIVE, CHASEDB1 and Crack500). The synthetic data generated by our method not only expands the dataset but also effectively improves the performance of other curvilinear object segmentation models. Source code and dataset are available at https://github.com/tanlei0/COSTG.
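Illustrative note: the abstract's point that normalization layers can wash away semantic information, and that SPADE restores it, can be sketched as follows. This is a toy sketch, not the authors' SCP ControlNet code. The function names `normalize` and `spade`, the single-channel layout, and the per-class lookup tables for the scale and shift are all assumptions made for illustration; SPADE as originally described predicts the scale and shift with small convolutions over the segmentation map.

```python
import math
from statistics import fmean, pvariance


def normalize(feature, eps=1e-5):
    """Zero-mean, unit-variance normalization over one single-channel 2D map."""
    flat = [v for row in feature for v in row]
    mean = fmean(flat)
    var = pvariance(flat)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in feature]


def spade(feature, seg_map, gamma_by_class, beta_by_class):
    """Normalize, then modulate each pixel with a label-dependent scale and shift.

    ASSUMPTION: gamma/beta come from hypothetical per-class lookup tables so the
    sketch stays dependency-free; real SPADE learns them with convolutions.
    """
    normed = normalize(feature)
    return [
        [gamma_by_class[c] * x + beta_by_class[c] for x, c in zip(n_row, s_row)]
        for n_row, s_row in zip(normed, seg_map)
    ]


# A spatially uniform activation map: plain normalization maps it to all zeros,
# so nothing about the layout survives.
uniform = [[3.0, 3.0], [3.0, 3.0]]
labels = [[0, 1], [1, 0]]  # toy semantic map: 0 = background, 1 = curvilinear object

print(normalize(uniform))  # [[0.0, 0.0], [0.0, 0.0]]
print(spade(uniform, labels, {0: 1.0, 1: 2.0}, {0: -0.5, 1: 0.5}))
# [[-0.5, 0.5], [0.5, -0.5]]
```

In this toy case the plain normalized map is identically zero, while the modulated output still separates object pixels from background because the shift depends on the label at each position.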

Related papers

GeoGNN: Quantifying and Mitigating Semantic Drift in Text-Attributed Graphs [59.61242815508687]
Graph neural networks (GNNs) on text-attributed graphs (TAGs) encode node texts using pretrained language models (PLMs) and propagate these embeddings through linear neighborhood aggregation. This work introduces a local PCA-based metric that measures the degree of semantic drift and provides the first quantitative framework to analyze how different aggregation mechanisms affect manifold structure.
arXiv Detail & Related papers (2025-11-12T06:48:43Z)
UrbanTwin: Synthetic LiDAR Datasets (LUMPI, V2X-Real-IC, and TUMTraf-I) [3.1508266388327324]
UrbanTwin datasets are high-fidelity, realistic replicas of three public roadside lidar datasets. Each UrbanTwin dataset contains 10K frames corresponding to one of the public datasets.
arXiv Detail & Related papers (2025-09-08T15:06:02Z)
SPIRAL: Semantic-Aware Progressive LiDAR Scene Generation [10.77777607732642]
Spiral is a novel range-view LiDAR diffusion model that simultaneously generates depth, reflectance images, and semantic maps. Experiments on the Semantic KITTI and nuScenes datasets demonstrate that Spiral achieves state-of-the-art performance with the smallest parameter size.
arXiv Detail & Related papers (2025-05-28T17:55:35Z)
Towards Generating Realistic 3D Semantic Training Data for Autonomous Driving [27.088907562842902]
In autonomous driving, 3D semantic segmentation plays an important role in enabling safe navigation. The complexity of collecting and annotating 3D data is a bottleneck in these developments. We propose a novel approach able to generate 3D semantic scene-scale data without relying on any projection or decoupled trained multi-resolution models.
arXiv Detail & Related papers (2025-03-27T12:41:42Z)
Towards Natural Image Matting in the Wild via Real-Scenario Prior [69.96414467916863]
We propose a new matting dataset based on the COCO dataset, namely COCO-Matting. The built COCO-Matting comprises an extensive collection of 38,251 human instance-level alpha mattes in complex natural scenarios. For network architecture, the proposed feature-aligned transformer learns to extract fine-grained edge and transparency features. The proposed matte-aligned decoder aims to segment matting-specific objects and convert coarse masks into high-precision mattes.
arXiv Detail & Related papers (2024-10-09T06:43:19Z)
Optimizing against Infeasible Inclusions from Data for Semantic Segmentation through Morphology [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion. InSeIn extracts explicit inclusion constraints that govern spatial class relations from the semantic segmentation training set at hand. It then enforces a morphological yet differentiable loss that penalizes violations of these constraints during training to promote prediction feasibility.
arXiv Detail & Related papers (2024-08-26T22:39:08Z)
Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets [11.105392318582677]
We propose a principled approach for aligning and jointly embedding a pair of datasets with theoretical guarantees. Our approach leverages the leading singular vectors of the EOT plan matrix between two datasets to extract their shared underlying structure. We show that in a high-dimensional regime, the EOT plan recovers the shared manifold structure by approximating a kernel function evaluated at the locations of the latent variables.
arXiv Detail & Related papers (2024-07-01T18:48:55Z)
Hierarchical Features Matter: A Deep Exploration of GAN Priors for Improved Dataset Distillation [51.44054828384487]
We propose a novel parameterization method dubbed Hierarchical Generative Latent Distillation (H-GLaD). This method systematically explores hierarchical layers within the generative adversarial networks (GANs). In addition, we introduce a novel class-relevant feature distance metric to alleviate the computational burden associated with synthetic dataset evaluation.
arXiv Detail & Related papers (2024-06-09T09:15:54Z)
Modified CycleGAN for the synthesization of samples for wheat head segmentation [0.09999629695552192]
In the absence of an annotated dataset, synthetic data can be used for model development. We develop a realistic annotated synthetic dataset for wheat head segmentation. The resulting model achieved a Dice score of 83.4% on an internal dataset and 83.6% on two external Global Wheat Head Detection datasets.
arXiv Detail & Related papers (2024-02-23T06:42:58Z)
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations. Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation. We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
FreeCOS: Self-Supervised Learning from Fractals and Unlabeled Images for Curvilinear Object Segmentation [7.078356641689271]
This paper proposes a self-supervised curvilinear object segmentation method that learns robust and distinctive features from fractals and unlabeled images. The key contributions include a novel Fractal-FDA synthesis (FFS) module and a geometric information alignment (GIA) approach. GIA reduces the intensity differences between the synthetic and unlabeled images by comparing the intensity order of a given pixel to the values of its nearby neighbors.
arXiv Detail & Related papers (2023-07-14T09:38:08Z)
Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds [69.64240235315864]
This paper introduces the synthetic-to-real domain generalization setting to this task. The domain gap between synthetic and real-world point cloud data mainly lies in the different layouts and point patterns. Experiments on the synthetic-to-real benchmark demonstrate that both CINMix and multi-prototypes can narrow the distribution gap.
arXiv Detail & Related papers (2022-12-09T05:07:43Z)
GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation [60.07812405063708]
3D point cloud semantic segmentation is fundamental for autonomous driving. Most approaches in the literature neglect an important aspect, i.e., how to deal with domain shift when handling dynamic scenes. This paper advances the state of the art in this research field.
arXiv Detail & Related papers (2022-07-20T09:06:07Z)
BGT-Net: Bidirectional GRU Transformer Network for Scene Graph Generation [0.15469452301122172]
Scene graph generation (SGG) aims to identify the objects and their relationships. We propose a bidirectional GRU (BiGRU) transformer network (BGT-Net) for the scene graph generation for images. This model implements novel object-object communication to enhance the object information using a BiGRU layer.
arXiv Detail & Related papers (2021-09-11T19:14:40Z)
SynLiDAR: Learning From Synthetic LiDAR Sequential Point Cloud for Semantic Segmentation [37.00112978096702]
SynLiDAR is a synthetic LiDAR point cloud dataset with accurate geometric shapes and comprehensive semantic classes. PCT-Net is a point cloud translation network that aims to narrow down the gap with real-world point cloud data. Experiments over multiple data augmentation and semi-supervised semantic segmentation tasks show very positive outcomes.
arXiv Detail & Related papers (2021-07-12T12:51:08Z)
Polygonal Point Set Tracking [50.445151155209246]
We propose a novel learning-based polygonal point set tracking method. Our goal is to track corresponding points on the target contour. We present visual-effects applications of our method on part distortion and text mapping.
arXiv Detail & Related papers (2021-05-30T17:12:36Z)
Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets. This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets. We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)

This list is automatically generated from the titles and abstracts of the papers on this site.