DeNAS-ViT: Data Efficient NAS-Optimized Vision Transformer for Ultrasound Image Segmentation
- URL: http://arxiv.org/abs/2407.04203v3
- Date: Mon, 10 Nov 2025 08:23:25 GMT
- Title: DeNAS-ViT: Data Efficient NAS-Optimized Vision Transformer for Ultrasound Image Segmentation
- Authors: Renqi Chen, Xinzhe Zheng, Haoyang Su, Kehan Wu,
- Abstract summary: We introduce DeNAS-ViT, a data-efficient NAS-optimized Vision Transformer for ultrasound image segmentation.<n>We propose an efficient NAS module that performs multi-scale token search prior to the ViT's attention mechanism.<n>Experiments on public datasets demonstrate that DeNAS-ViT achieves state-of-the-art performance, maintaining robustness with minimal labeled data.
- Score: 7.0775010318400495
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate segmentation of ultrasound images is essential for reliable medical diagnoses but is challenged by poor image quality and scarce labeled data. Prior approaches have relied on manually designed, complex network architectures to improve multi-scale feature extraction. However, such handcrafted models offer limited gains when prior knowledge is inadequate and are prone to overfitting on small datasets. In this paper, we introduce DeNAS-ViT, a data-efficient NAS-optimized Vision Transformer, the first method to leverage neural architecture search (NAS) for ultrasound image segmentation by automatically optimizing model architecture through token-level search. Specifically, we propose an efficient NAS module that performs multi-scale token search prior to the ViT's attention mechanism, effectively capturing both contextual and local features while minimizing computational costs. Given ultrasound's data scarcity and NAS's inherent data demands, we further develop a NAS-guided semi-supervised learning (SSL) framework. This approach integrates network independence and contrastive learning within a stage-wise optimization strategy, significantly enhancing model robustness under limited-data conditions. Extensive experiments on public datasets demonstrate that DeNAS-ViT achieves state-of-the-art performance, maintaining robustness with minimal labeled data. Moreover, we highlight DeNAS-ViT's generalization potential beyond ultrasound imaging, underscoring its broader applicability.
Related papers
- DDS-NAS: Dynamic Data Selection within Neural Architecture Search via On-line Hard Example Mining applied to Image Classification [13.427500787277031]
We speed up Neural Architecture Search (NAS) training via dynamic hard example mining within a curriculum learning framework.<n>By maximising the contribution of each image sample during training, we reduce the duration of a NAS training cycle and the number of iterations required for convergence.
arXiv Detail & Related papers (2025-06-17T15:58:10Z) - Flexiffusion: Training-Free Segment-Wise Neural Architecture Search for Efficient Diffusion Models [50.260693393896716]
Diffusion models (DMs) are powerful generative models capable of producing high-fidelity images but constrained by high computational costs.<n>We propose Flexiffusion, a training-free NAS framework that jointly optimize generation schedules and model architectures without modifying pre-trained parameters.<n>Our work pioneers a resource-efficient paradigm for searching high-speed DMs without sacrificing quality.
arXiv Detail & Related papers (2025-06-03T06:02:50Z) - Hypergraph Mamba for Efficient Whole Slide Image Understanding [10.285000840656808]
Whole Slide Images (WSIs) in histo pose a significant challenge for medical image analysis due to their ultra-high resolution, massive scale, and intricate spatial relationships.<n>We introduce the WSI-HGMamba, a novel framework that unifies the high-order relational modeling capabilities of the Hypergraph Neural Networks (HGNNs) with the linear-time sequential modeling efficiency of the State Space Models.
arXiv Detail & Related papers (2025-05-23T04:33:54Z) - L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers [39.19675815138566]
Training-free Neural Architecture Search (NAS) efficiently identifies high-performing neural networks using zero-cost (ZC) proxies.<n>ZC-NAS is both (i) time-efficient, eliminating the need for model training, and (ii) interpretable, with proxy designs often theoretically grounded.<n>This work extends ZC proxy applicability to Vision Transformers (ViTs)
arXiv Detail & Related papers (2025-05-12T07:44:52Z) - HSLiNets: Hyperspectral Image and LiDAR Data Fusion Using Efficient Dual Non-Linear Feature Learning Networks [7.06787067270941]
The integration of hyperspectral imaging (HSI) and LiDAR data within new linear feature spaces offers a promising solution to the challenges posed by the high-dimensionality and redundancy inherent in HSIs.
This study introduces a dual linear fused space framework that capitalizes on bidirectional reversed convolutional neural network (CNN) pathways, coupled with a specialized spatial analysis block.
The proposed method not only enhances data processing and classification accuracy, but also mitigates the computational burden typically associated with advanced models such as Transformers.
arXiv Detail & Related papers (2024-11-30T01:08:08Z) - Fair Differentiable Neural Network Architecture Search for Long-Tailed Data with Self-Supervised Learning [0.0]
This paper explores to improve the searching and training performance of NAS on long-tailed datasets.
We first discuss the related works about NAS and the deep learning method for long-tailed datasets.
Then, we focus on an existing work, called SSF-NAS, which integrates the self-supervised learning and fair differentiable NAS.
Finally, we conducted a series of experiments on the CIFAR10-LT dataset for performance evaluation.
arXiv Detail & Related papers (2024-06-19T12:39:02Z) - Accel-NASBench: Sustainable Benchmarking for Accelerator-Aware NAS [3.598880812393792]
We present a technique that allows searching for training proxies that reduce the cost of benchmark construction by significant margins.
We show that the benchmark is accurate and allows searching for state-of-the-art hardware-aware models at zero cost.
arXiv Detail & Related papers (2024-04-09T06:23:41Z) - Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation [11.637738540262797]
This study introduces Semi-Mamba-UNet, which integrates a purely visual Mamba-based encoder-decoder architecture with a conventional CNN-based UNet into a semi-supervised learning framework.
This innovative SSL approach leverages both networks to generate pseudo-labels and cross-supervise one another at the pixel level simultaneously.
We introduce a self-supervised pixel-level contrastive learning strategy that employs a pair of projectors to enhance the feature learning capabilities further.
arXiv Detail & Related papers (2024-02-11T17:09:21Z) - ADASR: An Adversarial Auto-Augmentation Framework for Hyperspectral and
Multispectral Data Fusion [54.668445421149364]
Deep learning-based hyperspectral image (HSI) super-resolution aims to generate high spatial resolution HSI (HR-HSI) by fusing hyperspectral image (HSI) and multispectral image (MSI) with deep neural networks (DNNs)
In this letter, we propose a novel adversarial automatic data augmentation framework ADASR that automatically optimize and augments HSI-MSI sample pairs to enrich data diversity for HSI-MSI fusion.
arXiv Detail & Related papers (2023-10-11T07:30:37Z) - SSHNN: Semi-Supervised Hybrid NAS Network for Echocardiographic Image
Segmentation [2.8358100463599722]
We propose a novel semi-supervised hybrid NAS network for accurate medical image segmentation termed SSHNN.
In SSHNN, we creatively use convolution operation in layer-wise feature fusion instead of normalized scalars to avoid losing details.
Specifically, we implement a semi-supervised algorithm Mean-Teacher to overcome the limited volume problem of labeled medical image dataset.
arXiv Detail & Related papers (2023-09-09T03:38:40Z) - DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models [56.584561770857306]
We propose a novel conditional Neural Architecture Generation (NAG) framework based on diffusion models, dubbed DiffusionNAG.
Specifically, we consider the neural architectures as directed graphs and propose a graph diffusion model for generating them.
We validate the effectiveness of DiffusionNAG through extensive experiments in two predictor-based NAS scenarios: Transferable NAS and Bayesian Optimization (BO)-based NAS.
When integrated into a BO-based algorithm, DiffusionNAG outperforms existing BO-based NAS approaches, particularly in the large MobileNetV3 search space on the ImageNet 1K dataset.
arXiv Detail & Related papers (2023-05-26T13:58:18Z) - HKNAS: Classification of Hyperspectral Imagery Based on Hyper Kernel
Neural Architecture Search [104.45426861115972]
We propose to directly generate structural parameters by utilizing the specifically designed hyper kernels.
We obtain three kinds of networks to separately conduct pixel-level or image-level classifications with 1-D or 3-D convolutions.
A series of experiments on six public datasets demonstrate that the proposed methods achieve state-of-the-art results.
arXiv Detail & Related papers (2023-04-23T17:27:40Z) - Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution iteration to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly network at each and tweaking unimportant weights with a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z) - UnrealNAS: Can We Search Neural Architectures with Unreal Data? [84.78460976605425]
Neural architecture search (NAS) has shown great success in the automatic design of deep neural networks (DNNs)
Previous work has analyzed the necessity of having ground-truth labels in NAS and inspired broad interest.
We take a further step to question whether real data is necessary for NAS to be effective.
arXiv Detail & Related papers (2022-05-04T16:30:26Z) - HASA: Hybrid Architecture Search with Aggregation Strategy for
Echinococcosis Classification and Ovary Segmentation in Ultrasound Images [0.0]
We propose a hybrid NAS framework for ultrasound (US) image classification and segmentation.
Our method can generate more powerful and lightweight models for the above US image classification and segmentation tasks.
arXiv Detail & Related papers (2022-04-14T01:43:00Z) - HyperSegNAS: Bridging One-Shot Neural Architecture Search with 3D
Medical Image Segmentation using HyperNet [51.60655410423093]
We introduce HyperSegNAS to enable one-shot Neural Architecture Search (NAS) for medical image segmentation.
We show that HyperSegNAS yields better performing and more intuitive architectures compared to the previous state-of-the-art (SOTA) segmentation networks.
Our method is evaluated on public datasets from the Medical Decathlon (MSD) challenge, and achieves SOTA performances.
arXiv Detail & Related papers (2021-12-20T16:21:09Z) - ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image
Prior [6.098254376499899]
We show that optimal neural architectures in the DIP framework are image-dependent.
We propose an image-specific NAS strategy for the DIP framework that requires substantially less training than typical NAS approaches.
Our experiments show that image-specific metrics can reduce the search space to a small cohort of models, of which the best model outperforms current NAS approaches for image restoration.
arXiv Detail & Related papers (2021-11-27T13:53:25Z) - Federated Split Vision Transformer for COVID-19 CXR Diagnosis using
Task-Agnostic Training [28.309185925167565]
Federated learning enables neural network training for COVID-19 diagnosis on chest X-ray (CXR) images without collecting patient CXR data across multiple hospitals.
We show that Vision Transformer, a recently developed deep learning architecture with straightforward decomposable configuration, is ideally suitable for split learning without sacrificing performance.
Our results affirm the suitability of Transformer for collaborative learning in medical imaging and pave the way forward for future real-world implementations.
arXiv Detail & Related papers (2021-11-02T02:54:30Z) - Searching Efficient Model-guided Deep Network for Image Denoising [61.65776576769698]
We present a novel approach by connecting model-guided design with NAS (MoD-NAS)
MoD-NAS employs a highly reusable width search strategy and a densely connected search block to automatically select the operations of each layer.
Experimental results on several popular datasets show that our MoD-NAS has achieved even better PSNR performance than current state-of-the-art methods.
arXiv Detail & Related papers (2021-04-06T14:03:01Z) - Trilevel Neural Architecture Search for Efficient Single Image
Super-Resolution [127.92235484598811]
This paper proposes a trilevel neural architecture search (NAS) method for efficient single image super-resolution (SR)
For modeling the discrete search space, we apply a new continuous relaxation on the discrete search spaces to build a hierarchical mixture of network-path, cell-operations, and kernel-width.
An efficient search algorithm is proposed to perform optimization in a hierarchical supernet manner.
arXiv Detail & Related papers (2021-01-17T12:19:49Z) - Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce huge computational cost on embedded devices for edge computing.
An accuracy of $96.53%$ vs. $97.22%$ is achieved on the CIFAR-10 dataset, but with a significantly compressed model, and a $40%$ faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z) - Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral
Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely-used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z) - Searching Central Difference Convolutional Networks for Face
Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems.
Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed network.
Here we propose a novel frame level FAS method based on Central Difference Convolution (CDC)
arXiv Detail & Related papers (2020-03-09T12:48:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.