UniCAD: Efficient and Extendable Architecture for Multi-Task Computer-Aided Diagnosis System
- URL: http://arxiv.org/abs/2505.09178v2
- Date: Thu, 15 May 2025 12:49:27 GMT
- Title: UniCAD: Efficient and Extendable Architecture for Multi-Task Computer-Aided Diagnosis System
- Authors: Yitao Zhu, Yuan Yin, Zhenrong Shen, Zihao Zhao, Haiyu Song, Sheng Wang, Dinggang Shen, Qian Wang
- Abstract summary: We propose UniCAD, a unified architecture that seamlessly handles both 2D and 3D medical images. A low-rank adaptation strategy is employed to adapt a pre-trained visual model to the medical image domain, achieving performance on par with fully fine-tuned counterparts. Building on this unified CAD architecture, we establish an open-source platform where researchers can share and access lightweight CAD experts.
- Score: 48.83716673786449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing complexity and scale of visual model pre-training have made developing and deploying multi-task computer-aided diagnosis (CAD) systems increasingly challenging and resource-intensive. Furthermore, the medical imaging community lacks an open-source CAD platform to enable the rapid creation of efficient and extendable diagnostic models. To address these issues, we propose UniCAD, a unified architecture that leverages the robust capabilities of pre-trained vision foundation models to seamlessly handle both 2D and 3D medical images while requiring only minimal task-specific parameters. UniCAD introduces two key innovations: (1) Efficiency: A low-rank adaptation strategy is employed to adapt a pre-trained visual model to the medical image domain, achieving performance on par with fully fine-tuned counterparts while introducing only 0.17% trainable parameters. (2) Plug-and-Play: A modular architecture that combines a frozen foundation model with multiple plug-and-play experts, enabling diverse tasks and seamless functionality expansion. Building on this unified CAD architecture, we establish an open-source platform where researchers can share and access lightweight CAD experts, fostering a more equitable and efficient research ecosystem. Comprehensive experiments across 12 diverse medical datasets demonstrate that UniCAD consistently outperforms existing methods in both accuracy and deployment efficiency. The source code and project page are available at https://mii-laboratory.github.io/UniCAD/.
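The two ideas in the abstract, low-rank adaptation of a frozen backbone and per-task plug-and-play experts, can be illustrated with a small parameter-counting sketch. This is not the authors' implementation: the class names `LoRAAdapter`, `FrozenBackbone`, and `ExpertRegistry`, the layer shapes, the rank, and the task names are all hypothetical, and the trainable fraction is assumed here to mean added parameters divided by total parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class LoRAAdapter:
    """Hypothetical low-rank update B @ A for one frozen d_out x d_in weight."""
    d_in: int
    d_out: int
    rank: int

    def num_params(self) -> int:
        # A has shape (rank, d_in) and B has shape (d_out, rank).
        return self.rank * (self.d_in + self.d_out)


@dataclass
class FrozenBackbone:
    """Stand-in for a pre-trained model; only its weight shapes matter here."""
    layer_shapes: List[Tuple[int, int]]  # (d_in, d_out) per adapted layer

    def num_params(self) -> int:
        return sum(d_in * d_out for d_in, d_out in self.layer_shapes)


@dataclass
class ExpertRegistry:
    """One shared frozen backbone plus independently registered task experts."""
    backbone: FrozenBackbone
    experts: Dict[str, List[LoRAAdapter]] = field(default_factory=dict)

    def register(self, task: str, rank: int) -> None:
        # Adding an expert creates new adapters; the backbone is never modified.
        self.experts[task] = [
            LoRAAdapter(d_in, d_out, rank)
            for d_in, d_out in self.backbone.layer_shapes
        ]

    def trainable_fraction(self, task: str) -> float:
        # Assumed definition: added parameters / (frozen + added parameters).
        added = sum(adapter.num_params() for adapter in self.experts[task])
        return added / (self.backbone.num_params() + added)


# Illustrative numbers only (assumed, not taken from the paper).
backbone = FrozenBackbone(layer_shapes=[(768, 768)] * 48)
registry = ExpertRegistry(backbone)
registry.register("hypothetical_2d_task", rank=4)
registry.register("hypothetical_3d_task", rank=4)
fraction = registry.trainable_fraction("hypothetical_2d_task")
print(f"{fraction:.2%} of parameters are trainable for this expert")
```

Because the shapes and rank are made up, the printed fraction will not match the 0.17% reported in the abstract. The sketch only shows why a low-rank update stays small relative to the backbone, and why experts can be added or removed without touching the shared weights.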
Related papers
- RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base [112.72361202480154]
We present RAG-6DPose, a retrieval-augmented approach that leverages 3D CAD models as a knowledge base. Experimental results on standard benchmarks and real-world robotic tasks demonstrate the effectiveness and robustness of our approach.
arXiv Detail & Related papers (2025-06-23T17:19:41Z)
- VideoCAD: A Large-Scale Video Dataset for Learning UI Interactions and 3D Reasoning from CAD Software [3.668843811005568]
VideoCAD is a large-scale synthetic dataset consisting of over 41K annotated video recordings of CAD operations. VideoCAD offers an order of magnitude higher complexity in UI interaction learning for real-world engineering tasks. We show two important downstream applications of VideoCAD: learning UI interactions from professional precision 3D CAD tools and a visual question-answering benchmark.
arXiv Detail & Related papers (2025-05-30T17:39:52Z)
- Seek-CAD: A Self-refined Generative Modeling for 3D Parametric CAD Using Local Inference via DeepSeek [19.441404313543227]
This study is the first investigation to incorporate both visual and Chain-of-Thought (CoT) feedback within the self-refinement mechanism for generating CAD models. We present an innovative 3D CAD model dataset structured around the SSR (Sketch, Sketch-based feature, and Refinements) triple design paradigm.
arXiv Detail & Related papers (2025-05-23T10:11:19Z)
- CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images [69.7768227804928]
CADCrafter is an image-to-parametric CAD model generation framework that trains solely on synthetic textureless CAD data. We introduce a geometry encoder to accurately capture diverse geometric features. Our approach can robustly handle real unconstrained CAD images, and even generalize to unseen general objects.
arXiv Detail & Related papers (2025-04-07T06:01:35Z)
- PHT-CAD: Efficient CAD Parametric Primitive Analysis with Progressive Hierarchical Tuning [52.681829043446044]
ParaCAD comprises over 10 million annotated drawings for training and 3,000 real-world industrial drawings with complex topological structures and physical constraints for testing. PHT-CAD is a novel 2D PPA framework that harnesses the modality alignment and reasoning capabilities of Vision-Language Models.
arXiv Detail & Related papers (2025-03-23T17:24:32Z)
- From Idea to CAD: A Language Model-Driven Multi-Agent System for Collaborative Design [0.06749750044497731]
We present an approach that mirrors this team structure with a Vision Language Model (VLM)-based Multi Agent System. A model is generated automatically from sketches and/or textual descriptions. The resulting model can be refined collaboratively in an iterative validation loop with the user.
arXiv Detail & Related papers (2025-03-06T13:21:27Z)
- Geometric Deep Learning for Computer-Aided Design: A Survey [76.3325417461511]
Geometric Deep Learning techniques have become a transformative force in the field of Computer-Aided Design. The ability to process the CAD designs represented by geometric data and to analyze their encoded features enables the identification of similarities. This survey offers a comprehensive overview of learning-based methods in computer-aided design across various categories.
arXiv Detail & Related papers (2024-02-27T17:11:35Z)
- ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs [48.11532667875847]
ChatCAD+ is a tool to generate high-quality medical reports and provide reliable medical advice.
The Reliable Report Generation module is capable of interpreting medical images and generating high-quality medical reports.
The Reliable Interaction module leverages up-to-date information from reputable medical websites to provide reliable medical advice.
arXiv Detail & Related papers (2023-05-25T12:03:31Z)
- AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning [70.70393006697383]
We present AutoCAD, a fully automatic and task-agnostic CAD generation framework.
arXiv Detail & Related papers (2022-11-29T13:39:53Z)
- SB-GCN: Structured BREP Graph Convolutional Network for Automatic Mating of CAD Assemblies [3.732457298487595]
Assembly modeling is not directly applicable to modern CAD systems because it eschews the dominant data structure of modern CAD: parametric boundary representations (BREPs).
We propose SB-GCN, a representation learning scheme on BREPs that retains the topological structure of parts, and use these learned representations to predict CAD type mates.
arXiv Detail & Related papers (2021-05-25T22:07:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.