CU-ICU: Customizing Unsupervised Instruction-Finetuned Language Models for ICU Datasets via Text-to-Text Transfer Transformer
- URL: http://arxiv.org/abs/2507.13655v1
- Date: Fri, 18 Jul 2025 04:49:41 GMT
- Title: CU-ICU: Customizing Unsupervised Instruction-Finetuned Language Models for ICU Datasets via Text-to-Text Transfer Transformer
- Authors: Teerapong Panboonyuen
- Abstract summary: We introduce CU-ICU, a method for customizing unsupervised instruction-finetuned language models for ICU datasets. CU-ICU employs a sparse fine-tuning approach that combines few-shot prompting with selective parameter updates. We demonstrate that CU-ICU consistently improves predictive accuracy and interpretability over standard fine-tuning methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Integrating large language models into specialized domains like healthcare presents unique challenges, including domain adaptation and limited labeled data. We introduce CU-ICU, a method for customizing unsupervised instruction-finetuned language models for ICU datasets by leveraging the Text-to-Text Transfer Transformer (T5) architecture. CU-ICU employs a sparse fine-tuning approach that combines few-shot prompting with selective parameter updates, enabling efficient adaptation with minimal supervision. Our evaluation across critical ICU tasks--early sepsis detection, mortality prediction, and clinical note generation--demonstrates that CU-ICU consistently improves predictive accuracy and interpretability over standard fine-tuning methods. Notably, CU-ICU achieves up to a 15% increase in sepsis detection accuracy and a 20% enhancement in generating clinically relevant explanations while updating fewer than 1% of model parameters in its most efficient configuration. These results establish CU-ICU as a scalable, low-overhead solution for delivering accurate and interpretable clinical decision support in real-world ICU environments.
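The selective-parameter-update idea in the abstract (updating fewer than 1% of model parameters) can be illustrated with a minimal sketch in plain Python. The parameter names and sizes below are invented for illustration and are not CU-ICU's actual configuration: the point is only the recipe of pattern-matching a small subset of weights to keep trainable and verifying the trainable fraction stays under 1%.

```python
# Minimal sketch of selective parameter updating: freeze everything
# except a small, pattern-matched subset of weights.

def select_trainable(param_sizes, patterns):
    """Return the parameter names that stay trainable."""
    return {name for name in param_sizes
            if any(p in name for p in patterns)}

# Hypothetical T5-style parameter inventory (name -> element count).
param_sizes = {
    "encoder.block.0.attn.q.weight": 512 * 512,
    "encoder.block.0.attn.k.weight": 512 * 512,
    "encoder.block.0.ffn.wi.weight": 512 * 2048,
    "encoder.block.0.layer_norm.weight": 512,
    "decoder.block.0.layer_norm.weight": 512,
}

# Keep only the layer-norm weights trainable; everything else is frozen.
trainable = select_trainable(param_sizes, patterns=("layer_norm",))
frac = sum(param_sizes[n] for n in trainable) / sum(param_sizes.values())
print(f"trainable fraction: {frac:.4%}")
```

With this toy inventory, only 1,024 of roughly 1.57M elements remain trainable, i.e. well under the 1% budget the abstract describes.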
Related papers
- From Generative Modeling to Clinical Classification: A GPT-Based Architecture for EHR Notes [0.0]
This study presents a GPT-based architecture for clinical text classification. Rather than updating all model parameters, the majority of the GPT-2 backbone is frozen. The proposed method is evaluated on radiology reports from the MIMIC-IV-Note dataset.
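The frozen-backbone recipe this entry describes can be sketched without any deep-learning library. The class and parameter names below are invented stand-ins (not the paper's code); the sketch only shows the pattern of disabling gradients on the backbone while leaving a small classification head trainable.

```python
# Hedged sketch of "frozen backbone + small trainable head":
# a stand-in Param class mimics a framework's requires_grad flag.

class Param:
    def __init__(self, name, requires_grad=True):
        self.name = name
        self.requires_grad = requires_grad

# Hypothetical GPT-2-style backbone plus a classification head.
backbone = [Param(f"gpt2.h.{i}.weight") for i in range(12)]
head = [Param("classifier.weight"), Param("classifier.bias")]

for p in backbone:          # freeze the backbone
    p.requires_grad = False

# Only the head's parameters would receive gradient updates.
trainable = [p.name for p in backbone + head if p.requires_grad]
print(trainable)
```

In a real framework the same effect comes from setting the backbone's gradient flags to false before building the optimizer over the remaining parameters.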
arXiv Detail & Related papers (2026-01-29T16:33:47Z) - TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations. TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z) - PULSE-ICU: A Pretrained Unified Long-Sequence Encoder for Multi-task Prediction in Intensive Care Units [0.3277163122167433]
We present PULSE-ICU, a self-supervised foundation model that learns event-level ICU representations from large-scale EHR sequences. A unified embedding module encodes event identity, continuous values, units, and temporal attributes, while a Longformer-based encoder enables efficient modeling of long trajectories.
arXiv Detail & Related papers (2025-11-27T08:10:52Z) - When Swin Transformer Meets KANs: An Improved Transformer Architecture for Medical Image Segmentation [10.656996937993199]
We introduce UKAST, a U-Net-like architecture that integrates rational-function-based Kolmogorov-Arnold Networks (KANs) into Swin Transformer encoders. UKAST achieves state-of-the-art performance on four diverse 2D and 3D medical image segmentation benchmarks.
arXiv Detail & Related papers (2025-11-06T05:44:57Z) - Improving Representation Learning of Complex Critical Care Data with ICU-BERT [7.287023190850672]
ICU-BERT is a transformer-based model pre-trained on the MIMIC-IV database. It learns robust representations of complex ICU data with minimal preprocessing. With fine-tuning, it matches or surpasses current performance benchmarks.
arXiv Detail & Related papers (2025-02-26T22:16:58Z) - Neural Conformal Control for Time Series Forecasting [54.96087475179419]
We introduce a neural network conformal prediction method for time series that enhances adaptivity in non-stationary environments. Our approach acts as a neural controller designed to achieve desired target coverage, leveraging auxiliary multi-view data with neural network encoders. We empirically demonstrate significant improvements in coverage and probabilistic accuracy, and find that our method is the only one that combines good calibration with consistency in prediction intervals.
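The coverage-targeting idea behind this entry is easiest to see in the classic split-conformal baseline that neural conformal methods build on. The sketch below is that baseline only, not the paper's neural controller, and the calibration residuals are made-up numbers: hold out residuals, take an empirical quantile of their absolute values, and use it as the interval half-width.

```python
# Split-conformal baseline: calibrate an interval half-width so that
# prediction intervals reach a target coverage of (1 - alpha).

def conformal_halfwidth(residuals, alpha=0.1):
    """Return the (1 - alpha) empirical quantile of |residuals|."""
    abs_res = sorted(abs(r) for r in residuals)
    k = min(len(abs_res) - 1, int((1 - alpha) * (len(abs_res) + 1)))
    return abs_res[k]

# Hypothetical held-out forecast residuals.
calib = [0.2, -0.5, 0.1, 0.8, -0.3, 0.4, -0.6, 0.05, 0.9, -0.15]

q = conformal_halfwidth(calib, alpha=0.2)   # target 80% coverage
pred = 3.0                                  # a point forecast
interval = (pred - q, pred + q)
print(interval)
```

A neural controller, as the entry describes, would instead adapt this width online so coverage tracks the target under distribution shift.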
arXiv Detail & Related papers (2024-12-24T03:56:25Z) - Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z) - Spatially Covariant Image Registration with Text Prompts [10.339385546491284]
TextSCF is a novel method that integrates spatially covariant filters and textual anatomical prompts encoded by visual-language models.
TextSCF not only boosts computational efficiency but also retains or improves registration accuracy.
Its performance has been rigorously tested on inter-subject brain MRI and abdominal CT registration tasks.
arXiv Detail & Related papers (2023-11-27T08:00:53Z) - Improving Clinical Decision Support through Interpretable Machine Learning and Error Handling in Electronic Health Records [6.594072648536156]
Trust-MAPS translates clinical domain knowledge into high-dimensional, mixed-integer programming models. Trust-scores emerge as clinically meaningful features that not only boost predictive performance for clinical decision support tasks, but also lend interpretability to ML models.
arXiv Detail & Related papers (2023-08-21T15:14:49Z) - P-Transformer: A Prompt-based Multimodal Transformer Architecture For Medical Tabular Data [2.4688646371447898]
We propose P-Transformer, a Prompt-based multimodal Transformer architecture designed specifically for medical tabular data. The framework efficiently encodes diverse modalities from both structured and unstructured data into a harmonized language semantic space. P-Transformer demonstrated improvements of 10.9%/11.0% and 0.5%/2.2% on RMSE/MAE, and 1.6%/0.8% on BACC/AUROC, compared to state-of-the-art (SOTA) baselines in predictability.
arXiv Detail & Related papers (2023-03-30T14:25:44Z) - Reprogramming Pretrained Language Models for Protein Sequence Representation Learning [68.75392232599654]
We propose Representation Learning via Dictionary Learning (R2DL), an end-to-end representation learning framework.
R2DL reprograms a pretrained English language model to learn the embeddings of protein sequences.
Our model attains better accuracy and significantly improves data efficiency, by up to 10^5 times, over baselines set by pretrained and standard supervised methods.
arXiv Detail & Related papers (2023-01-05T15:55:18Z) - Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification [0.21756081703275998]
Whole Slide Image (WSI) analysis is usually formulated as a Multiple Instance Learning (MIL) problem.
We introduce SeqShort, a sequence-shortening layer that summarizes each WSI as a short, fixed-size sequence of instances.
We show that WSI classification performance can be improved when the downstream transformer architecture has been pre-trained on a large corpus of text data.
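One simple way to realize the sequence-shortening step described above is top-k instance selection by score. The sketch below is an illustrative reading of the idea, not the paper's SeqShort layer (which is attention-based), and the scores are made-up numbers: keep a fixed number of the highest-scoring instances so every WSI yields a short, fixed-length sequence.

```python
# Illustrative sequence shortening: score each instance in a WSI bag,
# keep the k best, and preserve their original order.

def shorten(instance_scores, k=4):
    """Return the indices of the k highest-scoring instances, in order."""
    top = sorted(range(len(instance_scores)),
                 key=lambda i: instance_scores[i], reverse=True)[:k]
    return sorted(top)

# Hypothetical attention-like scores for 7 instances of one slide.
scores = [0.1, 0.9, 0.3, 0.8, 0.05, 0.7, 0.2]
kept = shorten(scores, k=4)
print(kept)
```

Whatever the scoring mechanism, the downstream transformer then only ever sees k instances per slide, which is what makes long WSIs tractable.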
arXiv Detail & Related papers (2022-11-14T14:11:31Z) - Summarizing Patients' Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models [9.879960506853145]
Problem list summarization requires a model to understand, abstract, and generate clinical documentation.
We propose a new NLP task that aims to generate a list of problems in a patient's daily care plan using input from the provider's progress notes during hospitalization.
arXiv Detail & Related papers (2022-08-17T17:07:35Z) - Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models of increasing complexity, and associate each with a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved using a reinforcement learning policy network.
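The select-then-update loop of such a bandit-based model selector can be illustrated with a toy epsilon-greedy agent. This is a deliberate simplification (the paper trains a policy network on context features), and the models, rewards, and round count below are all invented: the sketch only shows how repeated reward feedback steers selection toward the best of several detectors of increasing complexity.

```python
import random

# Toy epsilon-greedy stand-in for bandit-based model selection:
# one detector per HEC layer, chosen by running reward estimates.
random.seed(0)

N_MODELS = 3                      # DNNs of increasing complexity
value = [0.0] * N_MODELS          # running reward estimate per model
count = [0] * N_MODELS

def select(eps=0.2):
    """Explore with probability eps, otherwise pick the best estimate."""
    if random.random() < eps:
        return random.randrange(N_MODELS)
    return max(range(N_MODELS), key=lambda i: value[i])

def update(i, reward):
    """Incremental mean update of model i's reward estimate."""
    count[i] += 1
    value[i] += (reward - value[i]) / count[i]

# Simulated detection rewards: pretend the most complex model is best.
true_reward = [0.5, 0.7, 0.9]
for _ in range(500):
    i = select()
    update(i, true_reward[i] + random.uniform(-0.05, 0.05))

best = max(range(N_MODELS), key=lambda i: value[i])
print("selected model:", best)
```

A contextual version would condition `select` on features of the incoming data rather than on global estimates alone.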
arXiv Detail & Related papers (2021-08-09T08:45:47Z) - A Meta-embedding-based Ensemble Approach for ICD Coding Prediction [64.42386426730695]
International Classification of Diseases (ICD) codes are the de facto standard used globally for clinical coding.
These codes enable healthcare providers to claim reimbursement and facilitate efficient storage and retrieval of diagnostic information.
Our proposed approach enhances the performance of neural models by effectively training word vectors using routine medical data as well as external knowledge from scientific articles.
arXiv Detail & Related papers (2021-02-26T17:49:58Z) - Collaborative Boundary-aware Context Encoding Networks for Error Map Prediction [65.44752447868626]
We propose collaborative boundary-aware context encoding networks, called AEP-Net, for the error map prediction task.
Specifically, we propose a collaborative feature transformation branch for better feature fusion between images and masks, and for precise localization of error regions.
AEP-Net achieves average DSCs of 0.8358 and 0.8164 on the error prediction task, and shows a high Pearson correlation coefficient of 0.9873.
arXiv Detail & Related papers (2020-06-25T12:42:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.