HCS-TNAS: Hybrid Constraint-driven Semi-supervised Transformer-NAS for Ultrasound Image Segmentation
- URL: http://arxiv.org/abs/2407.04203v2
- Date: Fri, 16 Aug 2024 08:27:14 GMT
- Title: HCS-TNAS: Hybrid Constraint-driven Semi-supervised Transformer-NAS for Ultrasound Image Segmentation
- Authors: Renqi Chen, Xinzhe Zheng, Haoyang Su, Kehan Wu,
- Abstract summary: We introduce a hybrid constraint-driven semi-supervised Transformer-NAS (HCS-TNAS) for ultrasound segmentation.
HCS-TNAS includes an Efficient NAS-ViT module for multi-scale token search before ViT's attention calculation, effectively capturing contextual and local information with lower computational costs.
Experiments on public datasets show that HCS-TNAS achieves state-of-the-art performance, pushing the limit of ultrasound segmentation.
- Score: 0.34089646689382486
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise ultrasound segmentation is vital for clinicians to provide comprehensive diagnoses. However, developing a model that accurately segments ultrasound images is challenging due to the images' low quality and the scarcity of extensive labeled data. This results in two main solutions: (1) optimizing multi-scale feature representations, and (2) increasing resistance to data dependency. The first approach necessitates an advanced network architecture, but a handcrafted network is knowledge-intensive and often yields limited improvement. In contrast, neural architecture search (NAS) can more easily attain optimal performance, albeit with significant computational costs. Regarding the second issue, semi-supervised learning (SSL) is an established method, but combining it with complex NAS faces the risk of overfitting to a few labeled samples without extra constraints. Therefore, we introduce a hybrid constraint-driven semi-supervised Transformer-NAS (HCS-TNAS), balancing both solutions for segmentation. HCS-TNAS includes an Efficient NAS-ViT module for multi-scale token search before ViT's attention calculation, effectively capturing contextual and local information with lower computational costs, and a hybrid SSL framework that adds network independence and contrastive learning to the optimization for solving data dependency. By further developing a stage-wise optimization strategy, a rational network structure is identified. Experiments on public datasets show that HCS-TNAS achieves state-of-the-art performance, pushing the limit of ultrasound segmentation.
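To make the shape of the abstract's hybrid objective concrete, here is a minimal PyTorch-style sketch of a semi-supervised segmentation loss that combines a supervised term on labeled data with a cross-network consistency term and an InfoNCE-style contrastive term on unlabeled data. It is an illustration only, under assumed interfaces (two segmentation networks net_a and net_b, a projection head proj, and weights lambda_cons, lambda_ctr); it is not the authors' implementation of HCS-TNAS.

```python
import torch
import torch.nn.functional as F

def hybrid_semi_supervised_loss(net_a, net_b, proj,
                                x_lab, y_lab, x_unlab,
                                lambda_cons=0.5, lambda_ctr=0.1, temp=0.1):
    """Illustrative hybrid SSL objective: supervised CE on labeled data,
    cross-network consistency and a contrastive term on unlabeled data.
    All module/argument names are assumptions, not the paper's API."""
    # Supervised term: both networks are trained on the labeled batch.
    loss_sup = (F.cross_entropy(net_a(x_lab), y_lab)
                + F.cross_entropy(net_b(x_lab), y_lab))

    # Consistency term: the two independently parameterized networks are
    # encouraged to agree on unlabeled images.
    logits_a, logits_b = net_a(x_unlab), net_b(x_unlab)
    loss_cons = F.mse_loss(torch.softmax(logits_a, dim=1),
                           torch.softmax(logits_b, dim=1))

    # Contrastive term: pooled embeddings of the two networks should match
    # for the same image and differ across the batch (InfoNCE-style).
    z_a = F.normalize(proj(logits_a.mean(dim=(2, 3))), dim=1)
    z_b = F.normalize(proj(logits_b.mean(dim=(2, 3))), dim=1)
    logits = z_a @ z_b.t() / temp
    targets = torch.arange(z_a.size(0), device=z_a.device)
    loss_ctr = F.cross_entropy(logits, targets)

    return loss_sup + lambda_cons * loss_cons + lambda_ctr * loss_ctr
```

In the paper, these extra terms play the role of the additional constraints that keep the searched networks from overfitting the few labeled samples; the sketch only conveys the overall shape of such an objective.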
Related papers
- Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models [50.260693393896716]
Diffusion models (DMs) are powerful generative models capable of producing high-fidelity images but constrained by high computational costs. We propose Flexiffusion, a training-free NAS framework that jointly optimizes generation schedules and model architectures without modifying pre-trained parameters. Our work pioneers a resource-efficient paradigm for searching high-speed DMs without sacrificing quality.
arXiv Detail & Related papers (2025-06-03T06:02:50Z)
- Hypergraph Mamba for Efficient Whole Slide Image Understanding [10.285000840656808]
Whole Slide Images (WSIs) in histopathology pose a significant challenge for medical image analysis due to their ultra-high resolution, massive scale, and intricate spatial relationships. We introduce WSI-HGMamba, a novel framework that unifies the high-order relational modeling capabilities of Hypergraph Neural Networks (HGNNs) with the linear-time sequential modeling efficiency of State Space Models.
arXiv Detail & Related papers (2025-05-23T04:33:54Z)
- L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers [39.19675815138566]
Training-free Neural Architecture Search (NAS) efficiently identifies high-performing neural networks using zero-cost (ZC) proxies. ZC-NAS is both (i) time-efficient, eliminating the need for model training, and (ii) interpretable, with proxy designs often theoretically grounded. This work extends ZC proxy applicability to Vision Transformers (ViTs).
arXiv Detail & Related papers (2025-05-12T07:44:52Z)
- HSLiNets: Hyperspectral Image and LiDAR Data Fusion Using Efficient Dual Non-Linear Feature Learning Networks [7.06787067270941]
The integration of hyperspectral imaging (HSI) and LiDAR data within new linear feature spaces offers a promising solution to the challenges posed by the high-dimensionality and redundancy inherent in HSIs.
This study introduces a dual linear fused space framework that capitalizes on bidirectional reversed convolutional neural network (CNN) pathways, coupled with a specialized spatial analysis block.
The proposed method not only enhances data processing and classification accuracy, but also mitigates the computational burden typically associated with advanced models such as Transformers.
arXiv Detail & Related papers (2024-11-30T01:08:08Z)
- Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation [11.637738540262797]
This study introduces Semi-Mamba-UNet, which integrates a purely visual Mamba-based encoder-decoder architecture with a conventional CNN-based UNet into a semi-supervised learning framework.
This innovative SSL approach leverages both networks to generate pseudo-labels and cross-supervise one another at the pixel level simultaneously.
We introduce a self-supervised pixel-level contrastive learning strategy that employs a pair of projectors to enhance the feature learning capabilities further.
arXiv Detail & Related papers (2024-02-11T17:09:21Z)
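The pixel-level cross-supervision summarized in the Semi-Mamba-UNet entry above can be pictured as each network turning its predictions into hard pseudo-labels that supervise the other network at every pixel. A hedged illustration (function and argument names are assumptions, not the paper's API):

```python
import torch
import torch.nn.functional as F

def cross_pseudo_supervision(logits_cnn, logits_mamba):
    """Each network's argmax prediction acts as a pixel-wise pseudo-label
    for the other network (illustrative, not the paper's exact loss)."""
    pseudo_cnn = logits_cnn.argmax(dim=1).detach()      # [B, H, W]
    pseudo_mamba = logits_mamba.argmax(dim=1).detach()  # [B, H, W]
    loss_a = F.cross_entropy(logits_cnn, pseudo_mamba)  # CNN learns from Mamba branch
    loss_b = F.cross_entropy(logits_mamba, pseudo_cnn)  # Mamba branch learns from CNN
    return loss_a + loss_b
```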
- ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and Multispectral Data Fusion [54.668445421149364]
Deep learning-based hyperspectral image (HSI) super-resolution aims to generate high spatial resolution HSI (HR-HSI) by fusing an HSI and a multispectral image (MSI) with deep neural networks (DNNs).
In this letter, we propose ADASR, a novel adversarial automatic data augmentation framework that automatically optimizes and augments HSI-MSI sample pairs to enrich data diversity for HSI-MSI fusion.
arXiv Detail & Related papers (2023-10-11T07:30:37Z)
- SSHNN: Semi-Supervised Hybrid NAS Network for Echocardiographic Image Segmentation [2.8358100463599722]
We propose a novel semi-supervised hybrid NAS network for accurate medical image segmentation termed SSHNN.
In SSHNN, we creatively use convolution operations in layer-wise feature fusion instead of normalized scalars to avoid losing details.
Specifically, we implement the semi-supervised Mean-Teacher algorithm to overcome the limited volume of labeled medical image datasets.
arXiv Detail & Related papers (2023-09-09T03:38:40Z)
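Mean-Teacher, referenced in the SSHNN entry above, keeps a teacher network whose weights are an exponential moving average (EMA) of the student's weights and trains the student to stay consistent with the teacher's predictions on unlabeled data. A minimal sketch of the EMA update (the decay value is an assumed hyperparameter):

```python
import torch

@torch.no_grad()
def update_teacher(student, teacher, decay=0.99):
    """Exponential moving average of student weights into the teacher."""
    for p_t, p_s in zip(teacher.parameters(), student.parameters()):
        p_t.mul_(decay).add_(p_s, alpha=1.0 - decay)
```

This update is typically called once per training step, right after the student's optimizer step.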
- HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights by a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
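The soft shrinkage in the ISS-P entry above can be read as: rather than hard-zeroing pruned weights, shrink the lowest-magnitude fraction of each weight tensor by a small amount proportional to its own magnitude at every iteration, so the sparse structure remains trainable. An illustrative sketch (the percentage and shrink factor are assumptions):

```python
import torch

@torch.no_grad()
def soft_shrink_step(model, percent=0.1, shrink=0.01):
    """Shrink the smallest-magnitude fraction of each weight tensor by a
    small amount proportional to its magnitude (illustrative of the idea)."""
    for p in model.parameters():
        if p.dim() < 2:          # skip biases / norm parameters
            continue
        flat = p.abs().flatten()
        k = max(1, int(percent * flat.numel()))
        threshold = torch.kthvalue(flat, k).values
        mask = p.abs() <= threshold          # least important weights
        p[mask] -= shrink * p[mask]          # soft shrinkage, not hard pruning
```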
- HASA: Hybrid Architecture Search with Aggregation Strategy for Echinococcosis Classification and Ovary Segmentation in Ultrasound Images [0.0]
We propose a hybrid NAS framework for ultrasound (US) image classification and segmentation.
Our method can generate more powerful and lightweight models for the above US image classification and segmentation tasks.
arXiv Detail & Related papers (2022-04-14T01:43:00Z)
- Federated Split Vision Transformer for COVID-19 CXR Diagnosis using Task-Agnostic Training [28.309185925167565]
Federated learning enables neural network training for COVID-19 diagnosis on chest X-ray (CXR) images without collecting patient CXR data across multiple hospitals.
We show that Vision Transformer, a recently developed deep learning architecture with a straightforwardly decomposable configuration, is ideally suited for split learning without sacrificing performance.
Our results affirm the suitability of Transformer for collaborative learning in medical imaging and pave the way forward for future real-world implementations.
arXiv Detail & Related papers (2021-11-02T02:54:30Z)
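Split learning of a Transformer, as in the federated CXR entry above, relies on the model decomposing cleanly at a chosen block: the client runs the patch embedding and the first few encoder blocks and transmits the token activations, while the server runs the remaining blocks and the classifier. A hedged sketch with an assumed cut point and dimensions:

```python
import torch
import torch.nn as nn

class ClientHead(nn.Module):
    """Client side: patch embedding plus the first `cut` encoder blocks."""
    def __init__(self, dim=192, cut=4, patch=16, heads=3):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=cut)

    def forward(self, x):
        tokens = self.embed(x).flatten(2).transpose(1, 2)  # [B, N, dim]
        return self.blocks(tokens)                          # sent to the server

class ServerTail(nn.Module):
    """Server side: remaining encoder blocks and the classifier."""
    def __init__(self, dim=192, depth=8, heads=3, num_classes=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, tokens):
        return self.head(self.blocks(tokens).mean(dim=1))   # mean-pooled logits
```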
- Trilevel Neural Architecture Search for Efficient Single Image Super-Resolution [127.92235484598811]
This paper proposes a trilevel neural architecture search (NAS) method for efficient single image super-resolution (SR).
To model the discrete search space, we apply a new continuous relaxation that builds a hierarchical mixture of network-path, cell-operations, and kernel-width.
An efficient search algorithm is proposed to perform optimization in a hierarchical supernet manner.
arXiv Detail & Related papers (2021-01-17T12:19:49Z)
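The continuous relaxation in the trilevel NAS entry above is the standard differentiable-NAS device: a discrete choice among candidate operations is replaced by a softmax-weighted mixture, so architecture parameters can be optimized by gradient descent together with the network weights. A minimal sketch at the cell-operation level (the candidate set and channel width are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted mixture over candidate operations (DARTS-style
    continuous relaxation of one discrete cell-operation choice)."""
    def __init__(self, channels=32):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))  # architecture params

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

After the search, only the operation with the largest architecture weight would typically be retained and the rest discarded.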
- Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
- Searching Central Difference Convolutional Networks for Face Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks.
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
arXiv Detail & Related papers (2020-03-09T12:48:37Z)
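Central Difference Convolution, named in the last entry, augments vanilla convolution with a term computed on central differences (each neighbor minus the patch center), which emphasizes fine-grained gradient cues. A hedged sketch of one common formulation, in which the difference term reduces to subtracting a 1x1 response built from the spatially summed kernel (theta is an assumed mixing hyperparameter):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CentralDifferenceConv2d(nn.Module):
    """Mixes vanilla aggregation with central-difference aggregation:
    y = conv(x) - theta * (center pixel response with the summed kernel).
    Illustrative re-implementation, not the authors' code."""
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1, theta=0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              padding=padding, bias=False)
        self.theta = theta

    def forward(self, x):
        out = self.conv(x)
        if self.theta == 0:
            return out
        # A 1x1 convolution with the spatially summed kernel implements the
        # "subtract the center pixel" part of the central-difference term.
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        out_center = F.conv2d(x, kernel_sum, stride=self.conv.stride, padding=0)
        return out - self.theta * out_center
```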
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.