Related papers: DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection

DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection

URL: http://arxiv.org/abs/2401.02032v2
Date: Tue, 9 Jan 2024 12:00:35 GMT
Title: DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection
Authors: Yunfan Ye, Kai Xu, Yuhang Huang, Renjiao Yi, Zhiping Cai
Abstract summary: We propose the first diffusion model for the task of general edge detection, which we call DiffusionEdge. To avoid expensive computational resources while retaining the final performance, we apply DPM in the latent space and enable the classic cross-entropy loss. With all the technical designs, DiffusionEdge can be stably trained with limited resources, predicting crisp and accurate edge maps with much fewer augmentation strategies.
Score: 20.278655159290302
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Limited by the encoder-decoder architecture, learning-based edge detectors usually have difficulty predicting edge maps that satisfy both correctness and crispness. With the recent success of the diffusion probabilistic model (DPM), we found it is especially suitable for accurate and crisp edge detection since the denoising process is directly applied to the original image size. Therefore, we propose the first diffusion model for the task of general edge detection, which we call DiffusionEdge. To avoid expensive computational resources while retaining the final performance, we apply DPM in the latent space and enable the classic cross-entropy loss which is uncertainty-aware in pixel level to directly optimize the parameters in latent space in a distillation manner. We also adopt a decoupled architecture to speed up the denoising process and propose a corresponding adaptive Fourier filter to adjust the latent features of specific frequencies. With all the technical designs, DiffusionEdge can be stably trained with limited resources, predicting crisp and accurate edge maps with much fewer augmentation strategies. Extensive experiments on four edge detection benchmarks demonstrate the superiority of DiffusionEdge both in correctness and crispness. On the NYUDv2 dataset, compared to the second best, we increase the ODS, OIS (without post-processing) and AC by 30.2%, 28.1% and 65.1%, respectively. Code: https://github.com/GuHuangAI/DiffusionEdge.

Related papers

Robust Representation Consistency Model via Contrastive Denoising [83.47584074390842]
randomized smoothing provides theoretical guarantees for certifying robustness against adversarial perturbations. diffusion models have been successfully employed for randomized smoothing to purify noise-perturbed samples. We reformulate the generative modeling task along the diffusion trajectories in pixel space as a discriminative task in the latent space.
arXiv Detail & Related papers (2025-01-22T18:52:06Z)
DPBridge: Latent Diffusion Bridge for Dense Prediction [49.1574468325115]
We introduce DPBridge, the first latent diffusion bridge framework for dense prediction tasks.<n>Our method consistently achieves superior performance, demonstrating its effectiveness and capability generalization under different scenarios.
arXiv Detail & Related papers (2024-12-29T15:50:34Z)
Generative Edge Detection with Stable Diffusion [52.870631376660924]
Edge detection is typically viewed as a pixel-level classification problem mainly addressed by discriminative methods. We propose a novel approach, named Generative Edge Detector (GED), by fully utilizing the potential of the pre-trained stable diffusion model. We conduct extensive experiments on multiple datasets and achieve competitive performance.
arXiv Detail & Related papers (2024-10-04T01:52:23Z)
UDHF2-Net: Uncertainty-diffusion-model-based High-Frequency TransFormer Network for Remotely Sensed Imagery Interpretation [12.24506241611653]
Uncertainty-diffusion-model-based high-Frequency TransFormer network (UDHF2-Net) is the first to be proposed. UDHF2-Net is a spatially-stationary-and-non-stationary high-frequency connection paradigm (SHCP) Mask-and-geo-knowledge-based uncertainty diffusion module (MUDM) is a self-supervised learning strategy. A frequency-wise semi-pseudo-Siamese UDHF2-Net is the first to be proposed to balance accuracy and complexity for change detection.
arXiv Detail & Related papers (2024-06-23T15:03:35Z)
Learning to utilize image second-order derivative information for crisp edge detection [13.848361661516595]
Edge detection is a fundamental task in computer vision. Recent top-performing edge detection methods tend to generate thick and noisy edge lines. We propose a second-order derivative-based multi-scale contextual enhancement module (SDMCM) to help the model locate true edge pixels accurately. We also construct a hybrid focal loss function (HFL) to alleviate the imbalanced distribution issue. In the end, we propose a U-shape network named LUS-Net which is based on the SDMCM and BRM for edge detection.
arXiv Detail & Related papers (2024-06-09T13:25:02Z)
DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration [73.37538551605712]
Point Cloud Registration (PCR) estimates the relative rigid transformation between two point clouds. We propose formulating PCR as a denoising diffusion probabilistic process, mapping noisy transformations to the ground truth. Our experiments showcase the effectiveness of our DiffusionPCR, yielding state-of-the-art registration recall rates (95.3%/81.6%) on 3D and 3DLoMatch.
arXiv Detail & Related papers (2023-12-05T18:59:41Z)
Detecting Rotated Objects as Gaussian Distributions and Its 3-D Generalization [81.29406957201458]
Existing detection methods commonly use a parameterized bounding box (BBox) to model and detect (horizontal) objects. We argue that such a mechanism has fundamental limitations in building an effective regression loss for rotation detection. We propose to model the rotated objects as Gaussian distributions. We extend our approach from 2-D to 3-D with a tailored algorithm design to handle the heading estimation.
arXiv Detail & Related papers (2022-09-22T07:50:48Z)
EResFD: Rediscovery of the Effectiveness of Standard Convolution for Lightweight Face Detection [13.357235715178584]
We re-examine the effectiveness of the standard convolutional block as a lightweight backbone architecture for face detection. We show that heavily channel-pruned standard convolution layers can achieve better accuracy and inference speed. Our proposed detector EResFD obtained 80.4% mAP on WIDER FACE Hard subset which only takes 37.7 ms for VGA image inference on CPU.
arXiv Detail & Related papers (2022-04-04T02:30:43Z)
The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss who can achieve trend-level alignment with SkewIoU loss. Specifically, we model the objects as Gaussian distribution and adopt Kalman filter to inherently mimic the mechanism of SkewIoU. The resulting new loss called KFIoU is easier to implement and works better compared with exact SkewIoU.
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
FOVEA: Foveated Image Magnification for Autonomous Navigation [53.69803081925454]
We propose an attentional approach that elastically magnifies certain regions while maintaining a small input canvas. Our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning. On the autonomous driving datasets Argoverse-HD and BDD100K, we show our proposed method boosts the detection AP over standard Faster R-CNN, with and without finetuning.
arXiv Detail & Related papers (2021-08-27T03:07:55Z)
Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification. We propose new techniques to push its frontier in two aspects. Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.