Compact Twice Fusion Network for Edge Detection
- URL: http://arxiv.org/abs/2307.04952v1
- Date: Tue, 11 Jul 2023 00:46:59 GMT
- Title: Compact Twice Fusion Network for Edge Detection
- Authors: Yachuan Li, Zongmin Li, Xavier Soria P., Chaozhi Yang, Qian Xiao, Yun Bai, Hua Li, Xiangdong Wang
- Abstract summary: The significance of multi-scale features has been gradually recognized by the edge detection community.
We propose a Compact Twice Fusion Network (CTFN) to fully integrate multi-scale features.
CTFN includes two lightweight multi-scale feature fusion modules.
- Score: 5.379716918698048
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The significance of multi-scale features has been gradually recognized by the
edge detection community. However, fusing multi-scale features increases the
complexity of a model, which hinders practical application. In
this work, we propose a Compact Twice Fusion Network (CTFN) to fully integrate
multi-scale features while maintaining the compactness of the model. CTFN
includes two lightweight multi-scale feature fusion modules: a Semantic
Enhancement Module (SEM) that can utilize the semantic information contained in
coarse-scale features to guide the learning of fine-scale features, and a
Pseudo Pixel-level Weighting (PPW) module that aggregates the complementary
merits of multi-scale features by assigning weights to all features.
Nevertheless, the interference of texture noise still makes it challenging to
classify some pixels correctly. For these hard samples, we
propose a novel loss function, coined Dynamic Focal Loss, which reshapes the
standard cross-entropy loss and dynamically adjusts the weights to correct the
distribution of hard samples. We evaluate our method on three datasets, i.e.,
BSDS500, NYUDv2, and BIPEDv2. Compared with state-of-the-art methods, CTFN
achieves competitive accuracy with fewer parameters and lower computational cost.
Apart from the backbone, CTFN requires only 0.1M additional parameters, and its
computational cost is only 60% of that of other state-of-the-art methods. The
code is available at https://github.com/Li-yachuan/CTFN-pytorch-master.
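To make the two fusion steps concrete, below is a minimal PyTorch sketch of what SEM-style guidance and PPW-style weighting could look like. The module internals here (1x1 projections, sigmoid gating, a per-pixel softmax over single-channel scale maps) and all names are illustrative assumptions, not the authors' implementation; the actual architecture is in the linked repository.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticEnhancement(nn.Module):
    """SEM-style guidance: coarse-scale semantics steer fine-scale features.

    Hypothetical sketch; the real SEM design is in the CTFN repository.
    """

    def __init__(self, coarse_ch: int, fine_ch: int):
        super().__init__()
        self.project = nn.Conv2d(coarse_ch, fine_ch, kernel_size=1)

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor) -> torch.Tensor:
        # Upsample projected coarse features to the fine resolution and use
        # them as a multiplicative gate on the fine-scale features.
        guide = F.interpolate(self.project(coarse), size=fine.shape[-2:],
                              mode="bilinear", align_corners=False)
        return fine + fine * torch.sigmoid(guide)


class PseudoPixelWeighting(nn.Module):
    """PPW-style fusion: per-pixel weights over single-channel scale maps."""

    def __init__(self, num_scales: int):
        super().__init__()
        self.to_weights = nn.Conv2d(num_scales, num_scales, kernel_size=1)

    def forward(self, scale_maps: list) -> torch.Tensor:
        stacked = torch.cat(scale_maps, dim=1)                # (B, S, H, W)
        weights = torch.softmax(self.to_weights(stacked), dim=1)
        return (stacked * weights).sum(dim=1, keepdim=True)   # fused edge map


if __name__ == "__main__":
    sem = SemanticEnhancement(coarse_ch=64, fine_ch=32)
    ppw = PseudoPixelWeighting(num_scales=3)
    fine = torch.randn(1, 32, 160, 160)
    coarse = torch.randn(1, 64, 40, 40)
    guided = sem(fine, coarse)                                # (1, 32, 160, 160)
    maps = [torch.randn(1, 1, 160, 160) for _ in range(3)]
    fused = ppw(maps)                                         # (1, 1, 160, 160)
    print(guided.shape, fused.shape)
```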
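The abstract characterizes Dynamic Focal Loss only as a reshaped cross-entropy with dynamically adjusted weights. A generic focal-style sketch for binary edge maps is shown below; the function name and the `gamma` hyperparameter are assumptions, and the paper's exact dynamic weighting should be taken from the repository.

```python
import torch
import torch.nn.functional as F


def focal_style_edge_loss(logits: torch.Tensor, targets: torch.Tensor,
                          gamma: float = 2.0) -> torch.Tensor:
    """Focal-style reweighting of binary cross-entropy for edge pixels.

    A generic sketch, not the paper's Dynamic Focal Loss: hard pixels
    (low probability of the true class) keep large weights, while easy
    pixels are down-weighted by the (1 - p_t)^gamma factor. `targets`
    is a float edge map in [0, 1] with the same shape as `logits`.
    """
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1.0 - p) * (1.0 - targets)  # prob. of the true class
    return ((1.0 - p_t) ** gamma * bce).mean()
```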
Related papers
- FuseFormer: A Transformer for Visual and Thermal Image Fusion [3.6064695344878093]
We propose a novel methodology for the image fusion problem that mitigates the limitations associated with using classical evaluation metrics as loss functions.
Our approach integrates a transformer-based multi-scale fusion strategy that adeptly addresses local and global context information.
Our proposed method, along with the novel loss function definition, demonstrates superior performance compared to other competitive fusion algorithms.
arXiv Detail & Related papers (2024-02-01T19:40:39Z)
- Dual-scale Enhanced and Cross-generative Consistency Learning for Semi-supervised Polyp Segmentation [52.06525450636897]
Automatic polyp segmentation plays a crucial role in the early diagnosis and treatment of colorectal cancer.
Existing methods rely heavily on fully supervised training, which requires a large amount of labeled data with time-consuming pixel-wise annotations.
We propose a novel Dual-scale Enhanced and Cross-generative consistency learning framework (DEC-Seg) for semi-supervised polyp segmentation from colonoscopy images.
arXiv Detail & Related papers (2023-12-26T12:56:31Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion [138.40422469153145]
We propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network.
We show that CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion.
arXiv Detail & Related papers (2022-11-26T02:40:28Z)
- MODNet: Multi-offset Point Cloud Denoising Network Customized for Multi-scale Patches [14.078359217301973]
We propose a Multi-offset Denoising Network (MODNet) customized for multi-scale patches.
A multi-scale perception module is designed to embed multi-scale geometric information for each scale feature.
Experiments demonstrate that our method achieves new state-of-the-art performance on both synthetic and real-scanned datasets.
arXiv Detail & Related papers (2022-08-30T11:21:39Z)
- BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling [60.257912103351394]
We develop a new point cloud upsampling pipeline called BIMS-PU.
We decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors.
We show that our method achieves superior results to state-of-the-art approaches.
arXiv Detail & Related papers (2022-06-25T13:13:37Z)
- Coarse-to-Fine Embedded PatchMatch and Multi-Scale Dynamic Aggregation for Reference-based Super-Resolution [48.093500219958834]
We propose an Accelerated Multi-Scale Aggregation network (AMSA) for Reference-based Super-Resolution.
The proposed AMSA achieves superior performance over state-of-the-art approaches on both quantitative and qualitative evaluations.
arXiv Detail & Related papers (2022-01-12T08:40:23Z)
- Learning Robust and Lightweight Model through Separable Structured Transformations [13.208781763887947]
We propose a separable structural transformation of the fully-connected layer to reduce the parameters of convolutional neural networks.
We successfully reduce the number of network parameters by 90%, while the robust accuracy loss is less than 1.5%.
We evaluate the proposed approach on datasets such as ImageNet, SVHN, and CIFAR-100, as well as on Vision Transformer architectures.
arXiv Detail & Related papers (2021-12-27T07:25:26Z)
- EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection [56.03081616213012]
We propose EPNet++ for multi-modal 3D object detection by introducing a novel Cascade Bi-directional Fusion(CB-Fusion) module.
The proposed CB-Fusion module enriches the semantic information of point features with image features in a cascade bi-directional interaction fusion manner.
The experiment results on the KITTI, JRDB and SUN-RGBD datasets demonstrate the superiority of EPNet++ over the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-21T10:48:34Z)
- CE-FPN: Enhancing Channel Information for Object Detection [12.954675966833372]
Feature pyramid network (FPN) has been an effective framework to extract multi-scale features in object detection.
We present a novel channel enhancement network (CE-FPN) with three simple yet effective modules to alleviate the information loss caused by channel reduction in FPN.
Our experiments show that CE-FPN achieves competitive performance compared to state-of-the-art FPN-based detectors on MS COCO benchmark.
arXiv Detail & Related papers (2021-03-19T05:51:53Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.