FPDANet: A Multi-Section Classification Model for Intelligent Screening of Fetal Ultrasound
- URL: http://arxiv.org/abs/2506.06054v1
- Date: Fri, 06 Jun 2025 13:00:17 GMT
- Title: FPDANet: A Multi-Section Classification Model for Intelligent Screening of Fetal Ultrasound
- Authors: Minglang Chen, Jie He, Caixu Xu, Bocheng Liang, Shengli Li, Guannan He, Xiongjie Tao
- Abstract summary: We propose a bilateral multi-scale information fusion network-based FPDANet to address the above challenges. Specifically, we design the positional attention mechanism (DAN) module, which utilizes the similarity of features. In addition, we design a bilateral multi-scale (FPAN) information fusion module to capture contextual and global feature dependencies.
- Score: 2.255017160735307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: ResNet has been widely used in image classification tasks owing to its residual learning over identity (shortcut) mappings. However, ResNet transfers features in a single direction and lacks an effective way to correlate contextual information, so it classifies fetal ultrasound images poorly; these images also suffer from low contrast, high similarity, and high noise. Therefore, we propose FPDANet, a bilateral multi-scale information fusion network, to address these challenges. Specifically, we design a positional attention mechanism (DAN) module, which uses feature similarity to establish dependencies between features at different spatial positions and to enhance the feature representation. In addition, we design a bilateral multi-scale (FPAN) information fusion module to capture contextual and global feature dependencies at different feature scales, further improving the model's representation. FPDANet obtains 91.05% and 100% on the Top-1 and Top-5 metrics, respectively, and the experimental results demonstrate its effectiveness and robustness.
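The abstract describes the DAN module only in words. As a rough illustration of attention over spatial positions in general, and not the authors' implementation, the following Python sketch scores every pair of positions by dot-product similarity, normalizes the scores with a softmax, and adds the similarity-weighted sum of features back to each position. The function name `position_attention`, the `gamma` residual scale, and the flattened list-of-vectors input layout are all assumptions made for this example.

```python
import math


def position_attention(features, gamma=1.0):
    """Generic position-attention sketch (NOT the paper's exact DAN module).

    features: list of N position vectors, each a list of C floats,
              i.e. an H*W feature map flattened to N = H*W positions.
    gamma:    hypothetical residual scale (an assumption for this sketch).
    """
    n = len(features)
    out = []
    for i in range(n):
        # Similarity of position i to every position j (dot product).
        scores = [
            sum(a * b for a, b in zip(features[i], features[j]))
            for j in range(n)
        ]
        # Numerically stable softmax over all positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Similarity-weighted sum of all position features.
        channels = len(features[i])
        agg = [
            sum(weights[j] * features[j][c] for j in range(n))
            for c in range(channels)
        ]
        # Residual connection: keep the original feature, add the context.
        out.append([gamma * a + x for a, x in zip(agg, features[i])])
    return out


# Three positions with two channels each (toy values).
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enhanced = position_attention(feats)
```

With `gamma=0.0` the function returns the input unchanged, which makes the residual path easy to check. Practical attention modules usually project the features before comparing them and learn the residual scale; both are omitted here for brevity.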
Related papers
- Cross Spatial Temporal Fusion Attention for Remote Sensing Object Detection via Image Feature Matching [15.57849268814515]
We propose a mechanism that enhances feature representation by integrating scale-invariant keypoints detected independently in both reference and query images.
Our approach improves feature matching in two ways: First, by creating correspondence maps that leverage information from multiple image regions simultaneously, and second, by reformulating the similarity matching process as a classification task.
To demonstrate the practical utility of improved feature matching, we evaluate CSTF on object detection tasks using the HRSC2016 and DOTA benchmark datasets.
arXiv Detail & Related papers (2025-07-25T09:52:06Z)
- MSLAU-Net: A Hybird CNN-Transformer Network for Medical Image Segmentation [7.826754189244901]
Both CNN-based and Transformer-based methods have achieved remarkable success in medical image segmentation tasks.
We propose a novel hybrid CNN-Transformer architecture, named MSLAU-Net, which integrates the strengths of both paradigms.
The proposed MSLAU-Net incorporates two key ideas. First, it introduces Multi-Scale Linear Attention, designed to efficiently extract multi-scale features from medical images.
Second, it adopts a top-down feature aggregation mechanism, which performs multi-level feature aggregation and restores spatial resolution.
arXiv Detail & Related papers (2025-05-24T18:48:29Z)
- Unifying Visual and Semantic Feature Spaces with Diffusion Models for Enhanced Cross-Modal Alignment [20.902935570581207]
We introduce a Multimodal Alignment and Reconstruction Network (MARNet) to enhance the model's resistance to visual noise.
MARNet includes a cross-modal diffusion reconstruction module for smoothly and stably blending information across different domains.
Experiments conducted on two benchmark datasets, Vireo-Food172 and Ingredient-101, demonstrate that MARNet effectively improves the quality of image information extracted by the model.
arXiv Detail & Related papers (2024-07-26T16:30:18Z)
- Multi-scale Unified Network for Image Classification [33.560003528712414]
CNNs face notable challenges in performance and computational efficiency when dealing with real-world, multi-scale image inputs.
We propose Multi-scale Unified Network (MUSN) consisting of multi-scales, a unified network, and scale-invariant constraint.
MUSN yields an accuracy increase up to 44.53% and diminishes FLOPs by 7.01-16.13% in multi-scale scenarios.
arXiv Detail & Related papers (2024-03-27T06:40:26Z)
- You Only Train Once: A Unified Framework for Both Full-Reference and No-Reference Image Quality Assessment [45.62136459502005]
We propose a network to perform full reference (FR) and no reference (NR) IQA.
We first employ an encoder to extract multi-level features from input images.
A Hierarchical Attention (HA) module is proposed as a universal adapter for both FR and NR inputs.
A Semantic Distortion Aware (SDA) module is proposed to examine feature correlations between shallow and deep layers of the encoder.
arXiv Detail & Related papers (2023-10-14T11:03:04Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z) - HiFuse: Hierarchical Multi-Scale Feature Fusion Network for Medical
Image Classification [16.455887856811465]
This paper proposes a three-branch hierarchical multi-scale feature fusion network, termed HiFuse, for medical image classification.
The accuracy of our proposed model on the ISIC dataset is 7.6% higher than baseline, 21.5% on the Covid-19 dataset, and 10.4% on the Kvasir dataset.
arXiv Detail & Related papers (2022-09-21T09:30:20Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Efficient and Accurate Multi-scale Topological Network for Single Image
Dehazing [31.543771270803056]
In this paper, we pay attention to the feature extraction and utilization of the input image itself.
We propose a Multi-scale Topological Network (MSTN) to fully explore the features at different scales.
Meanwhile, we design a Multi-scale Feature Fusion Module (MFFM) and an Adaptive Feature Selection Module (AFSM) to achieve the selection and fusion of features at different scales.
arXiv Detail & Related papers (2021-02-24T08:53:14Z) - Learning Deep Interleaved Networks with Asymmetric Co-Attention for
Image Restoration [65.11022516031463]
We present a deep interleaved network (DIN) that learns how information at different states should be combined for high-quality (HQ) image reconstruction.
In this paper, we propose asymmetric co-attention (AsyCA) which is attached at each interleaved node to model the feature dependencies.
Our presented DIN can be trained end-to-end and applied to various image restoration tasks.
arXiv Detail & Related papers (2020-10-29T15:32:00Z) - Interpretable Detail-Fidelity Attention Network for Single Image
Super-Resolution [89.1947690981471]
We propose a purposeful and interpretable detail-fidelity attention network to progressively process smoothes and details in a divide-and-conquer manner.
Particularly, we propose a Hessian filtering for interpretable feature representation which is high-profile for detail inference.
Experiments demonstrate that the proposed methods achieve superior performances over the state-of-the-art methods.
arXiv Detail & Related papers (2020-09-28T08:31:23Z) - Out-of-distribution Generalization via Partial Feature Decorrelation [72.96261704851683]
We present a novel Partial Feature Decorrelation Learning (PFDL) algorithm, which jointly optimizes a feature decomposition network and the target image classification model.
The experiments on real-world datasets demonstrate that our method can improve the backbone model's accuracy on OOD image classification datasets.
arXiv Detail & Related papers (2020-07-30T05:48:48Z) - ResNeSt: Split-Attention Networks [86.25490825631763]
We present a modularized architecture, which applies the channel-wise attention on different network branches to leverage their success in capturing cross-feature interactions and learning diverse representations.
Our model, named ResNeSt, outperforms EfficientNet in accuracy and latency trade-off on image classification.
arXiv Detail & Related papers (2020-04-19T20:40:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.