Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation
- URL: http://arxiv.org/abs/2502.00563v1
- Date: Sat, 01 Feb 2025 21:19:48 GMT
- Title: Complex Wavelet Mutual Information Loss: A Multi-Scale Loss Function for Semantic Segmentation
- Authors: Renhao Lu
- Abstract summary: We propose a novel loss function that leverages mutual information from subband images decomposed by a complex steerable pyramid. CWMI loss achieves significant improvements in both pixel-wise accuracy and topological metrics compared to state-of-the-art methods.
- Score: 0.4662017507844857
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in deep neural networks have significantly enhanced the performance of semantic segmentation. However, class imbalance and instance imbalance remain persistent challenges, where smaller instances and thin boundaries are often overshadowed by larger structures. To address the multiscale nature of segmented objects, various models have incorporated mechanisms such as spatial attention and feature pyramid networks. Despite these advancements, most loss functions are still primarily pixel-wise, while regional and boundary-focused loss functions often incur high computational costs or are restricted to small-scale regions. To address this limitation, we propose complex wavelet mutual information (CWMI) loss, a novel loss function that leverages mutual information from subband images decomposed by a complex steerable pyramid. The complex steerable pyramid captures features across multiple orientations and preserves structural similarity across scales. Meanwhile, mutual information is well-suited for capturing high-dimensional directional features and exhibits greater noise robustness. Extensive experiments on diverse segmentation datasets demonstrate that CWMI loss achieves significant improvements in both pixel-wise accuracy and topological metrics compared to state-of-the-art methods, while introducing minimal computational overhead. The code is available at https://anonymous.4open.science/r/CWMI-83B7/
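The abstract's core idea, scoring agreement between a prediction and a target with mutual information rather than a pixel-wise difference, can be illustrated with a minimal sketch. This is not the paper's estimator: the abstract does not say how CWMI estimates mutual information, and the steerable pyramid decomposition is omitted here. The function `mutual_information`, the bin count, and the random arrays standing in for a pair of subband images are all assumptions made for illustration.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Histogram (plug-in) estimate of I(A; B) in nats for two same-shaped arrays."""
    # Joint histogram of paired values, normalised to a joint probability table.
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    # Marginals as a column vector and a row vector.
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    # Sum p(a,b) * log(p(a,b) / (p(a) p(b))) over non-empty cells only.
    nz = p_ab > 0
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))

# Hypothetical stand-ins for one subband of a target and of two predictions.
rng = np.random.default_rng(0)
target = rng.random((64, 64))
close = target + 0.05 * rng.standard_normal((64, 64))   # nearly matches target
unrelated = rng.random((64, 64))                        # independent of target

mi_close = mutual_information(target, close)
mi_unrelated = mutual_information(target, unrelated)
```

A prediction that tracks the target yields a larger estimate than an unrelated one, so a loss could in principle be built from the negated value summed over subbands. A histogram estimate is not differentiable, so a training loss would need a differentiable estimator instead.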
Related papers
- Steerable Pyramid Weighted Loss: Multi-Scale Adaptive Weighting for Semantic Segmentation [0.4662017507844857]
We propose a novel steerable pyramid-based weighted (SPW) loss function that efficiently generates adaptive weight maps.
Our results demonstrate that the proposed SPW loss function achieves superior pixel precision and segmentation accuracy with minimal computational overhead.
arXiv Detail & Related papers (2025-03-09T13:15:01Z)
- Network scaling and scale-driven loss balancing for intelligent poroelastography [2.665036498336221]
A deep learning framework is developed for multiscale characterization of poroelastic media from full waveform data.
Two major challenges impede direct application of existing state-of-the-art techniques for this purpose.
We propose the idea of network scaling, where the neural property maps are constructed by unit shape functions composed into a scaling layer.
arXiv Detail & Related papers (2024-10-27T23:06:29Z)
- A topological description of loss surfaces based on Betti Numbers [8.539445673580252]
We provide a topological measure to evaluate loss complexity in the case of multilayer neural networks.
We find that certain variations in the loss function or model architecture, such as adding an $\ell$ regularization term or skip connections in a feedforward network, do not affect loss in specific cases.
arXiv Detail & Related papers (2024-01-08T11:20:04Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- Mutual Information-driven Triple Interaction Network for Efficient Image Dehazing [54.168567276280505]
We propose a novel Mutual Information-driven Triple interaction Network (MITNet) for image dehazing.
The first stage, named amplitude-guided haze removal, aims to recover the amplitude spectrum of the hazy images for haze removal.
The second stage, named phase-guided structure refined, is devoted to learning the transformation and refinement of the phase spectrum.
arXiv Detail & Related papers (2023-08-14T08:23:58Z)
- InverseForm: A Loss Function for Structured Boundary-Aware Segmentation [80.39674800972182]
We present a novel boundary-aware loss term for semantic segmentation using an inverse-transformation network.
This plug-in loss term complements the cross-entropy loss in capturing boundary transformations.
We analyze the quantitative and qualitative effects of our loss function on three indoor and outdoor segmentation benchmarks.
arXiv Detail & Related papers (2021-04-06T18:52:45Z)
- Topological obstructions in neural networks learning [67.8848058842671]
We study global properties of the loss gradient function flow.
We use topological data analysis of the loss function and its Morse complex to relate local behavior along gradient trajectories with global properties of the loss surface.
arXiv Detail & Related papers (2020-12-31T18:53:25Z)
- Sequential Hierarchical Learning with Distribution Transformation for Image Super-Resolution [83.70890515772456]
We build a sequential hierarchical learning super-resolution network (SHSR) for effective image SR.
We consider the inter-scale correlations of features, and devise a sequential multi-scale block (SMB) to progressively explore the hierarchical information.
Experiment results show SHSR achieves superior quantitative performance and visual quality to state-of-the-art methods.
arXiv Detail & Related papers (2020-07-19T01:35:53Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- An Elastic Interaction-Based Loss Function for Medical Image Segmentation [10.851295591782538]
This paper introduces a long-range elastic interaction-based training strategy for medical image segmentation.
In this strategy, the CNN learns the target region under the guidance of the elastic interaction energy between the boundary of the predicted region and that of the actual object.
Experimental results show that our method is able to achieve considerable improvements compared to commonly used pixel-wise loss functions.
arXiv Detail & Related papers (2020-07-06T11:49:14Z)
- Y-net: Multi-scale feature aggregation network with wavelet structure similarity loss function for single image dehazing [18.479856828292935]
We propose a Y-net that is named for its structure.
This network reconstructs clear images by aggregating multi-scale feature maps.
We also propose a Wavelet Structure SIMilarity (W-SSIM) loss function in the training step.
arXiv Detail & Related papers (2020-03-31T02:07:33Z)