Multi-Source COVID-19 Detection via Variance Risk Extrapolation
- URL: http://arxiv.org/abs/2506.23208v1
- Date: Sun, 29 Jun 2025 12:34:57 GMT
- Title: Multi-Source COVID-19 Detection via Variance Risk Extrapolation
- Authors: Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen
- Abstract summary: Multi-Source COVID-19 Detection Challenge aims to classify chest CT scans into COVID and Non-COVID categories across data collected from four distinct hospitals and medical centers. A major challenge in this task lies in the domain shift caused by variations in imaging protocols, scanners, and patient populations across institutions. We incorporate Variance Risk Extrapolation (VREx) into the training process to enhance the cross-domain generalization of our model. Our method achieves an average macro F1 score of 0.96 across the four sources on the validation set, demonstrating strong generalization.
- Score: 19.844531606142496
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present our solution for the Multi-Source COVID-19 Detection Challenge, which aims to classify chest CT scans into COVID and Non-COVID categories across data collected from four distinct hospitals and medical centers. A major challenge in this task lies in the domain shift caused by variations in imaging protocols, scanners, and patient populations across institutions. To enhance the cross-domain generalization of our model, we incorporate Variance Risk Extrapolation (VREx) into the training process. VREx encourages the model to maintain consistent performance across multiple source domains by explicitly minimizing the variance of empirical risks across environments. This regularization strategy reduces overfitting to center-specific features and promotes learning of domain-invariant representations. We further apply Mixup data augmentation to improve generalization and robustness. Mixup interpolates both the inputs and labels of randomly selected pairs of training samples, encouraging the model to behave linearly between examples and enhancing its resilience to noise and limited data. Our method achieves an average macro F1 score of 0.96 across the four sources on the validation set, demonstrating strong generalization.
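The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `beta` weight, the `alpha` default, and the toy numbers are all assumptions made for illustration. The VREx term follows the commonly cited form (sum of per-environment risks plus a weighted variance of those risks), and Mixup is shown as a convex combination of two samples and their one-hot labels.

```python
import random
from statistics import pvariance


def vrex_objective(env_risks, beta=1.0):
    """Sum of per-environment risks plus beta times their variance.

    env_risks: one empirical risk (mean loss) per source domain, e.g. one
    value per hospital. beta is a hypothetical penalty weight.
    """
    if len(env_risks) < 2:
        raise ValueError("VREx needs risks from at least two environments")
    return sum(env_risks) + beta * pvariance(env_risks)


def sample_lambda(alpha=0.2, rng=random):
    """Draw the Mixup coefficient from Beta(alpha, alpha); alpha is assumed."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Interpolate two samples (flat feature lists) and their one-hot labels."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x, y


if __name__ == "__main__":
    # Equal risks across four sources: the variance penalty vanishes.
    print(vrex_objective([0.5, 0.5, 0.5, 0.5], beta=10.0))
    # Same total risk, but unevenly spread: the penalty raises the objective.
    print(vrex_objective([0.2, 0.4, 0.6, 0.8], beta=10.0))
    # Mix a COVID sample with a Non-COVID sample using a random coefficient.
    lam = sample_lambda()
    print(mixup_pair([0.0, 2.0], [1.0, 0.0], [4.0, 6.0], [0.0, 1.0], lam))
```

With a large `beta`, two models with the same total risk are ranked by how evenly that risk is spread across sources, which is the behavior the abstract describes as discouraging center-specific features. In a real training loop the risks would be differentiable per-domain losses computed on each mini-batch rather than plain floats.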
Related papers
- OCSVM-Guided Representation Learning for Unsupervised Anomaly Detection [1.0190194769786831]
Unsupervised anomaly detection (UAD) aims to detect anomalies without labeled data. We propose a novel method that tightly couples representation learning with an analytically solvable one-class SVM. The model is evaluated on two tasks: a new benchmark based on MNIST-C, and a challenging brain MRI subtle lesion detection task.
arXiv Detail & Related papers (2025-07-25T13:00:40Z) - ADAptation: Reconstruction-based Unsupervised Active Learning for Breast Ultrasound Diagnosis [11.49367029555765]
Deep learning-based diagnostic models often suffer performance drops due to distribution shifts between training (source) and test (target) domains. We propose a novel unsupervised Active learning framework for Adaptation Domain, named ADAptation. Our method efficiently selects informative samples from multi-domain data pools under limited annotation budget.
arXiv Detail & Related papers (2025-07-01T06:45:02Z) - CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data. We propose CLIPfusion, a method that leverages both discriminative and generative foundation models. We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z) - Multi-Dataset Multi-Task Learning for COVID-19 Prognosis [25.371798627482065]
We introduce a novel multi-dataset multi-task training framework that predicts COVID-19 prognostic outcomes from chest X-rays.
Our framework hypothesizes that assessing severity scores enhances the model's ability to classify prognostic severity groups.
arXiv Detail & Related papers (2024-05-22T15:57:44Z) - FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival [3.4686401890974197]
We propose a new end-to-end framework, FORESEE, for robustly predicting patient survival by mining multimodal information.
Cross-fusion transformer effectively utilizes features at the cellular level, tissue level, and tumor heterogeneity level to correlate prognosis.
The hybrid attention encoder (HAE) uses the denoising contextual attention module to obtain the contextual relationship features.
We also propose an asymmetrically masked triplet masked autoencoder to reconstruct lost information within modalities.
arXiv Detail & Related papers (2024-05-13T12:39:08Z) - UniChest: Conquer-and-Divide Pre-training for Multi-Source Chest X-Ray Classification [36.94690613164942]
UniChest is a Conquer-and-Divide pre-training framework, aiming to make full use of the collaboration benefit of multiple sources of CXRs.
We conduct thorough experiments on many benchmarks, e.g., ChestX-ray14, CheXpert, Vindr-CXR, Shenzhen, Open-I and SIIM-ACR Pneumothorax.
arXiv Detail & Related papers (2023-12-18T09:16:48Z) - Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
arXiv Detail & Related papers (2023-08-28T18:48:34Z) - Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - A Novel Cross-Perturbation for Single Domain Generalization [54.612933105967606]
Single domain generalization aims to enhance the ability of the model to generalize to unknown domains when trained on a single source domain.
The limited diversity in the training data hampers the learning of domain-invariant features, resulting in compromised generalization performance.
We propose CPerb, a simple yet effective cross-perturbation method to enhance the diversity of the training data.
arXiv Detail & Related papers (2023-08-02T03:16:12Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images greatly helps in estimating intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - Rotation Invariance and Extensive Data Augmentation: a strategy for the Mitosis Domain Generalization (MIDOG) Challenge [1.52292571922932]
We present the strategy we applied to participate in the MIDOG 2021 competition.
The purpose of the competition was to evaluate the generalization of solutions to images acquired with unseen target scanners.
We propose a solution based on a combination of state-of-the-art deep learning methods.
arXiv Detail & Related papers (2021-09-02T10:09:02Z) - Many-to-One Distribution Learning and K-Nearest Neighbor Smoothing for Thoracic Disease Identification [83.6017225363714]
Deep learning has become the most powerful computer-aided diagnosis technology for improving disease identification performance.
For chest X-ray imaging, annotating large-scale data requires professional domain knowledge and is time-consuming.
In this paper, we propose many-to-one distribution learning (MODL) and K-nearest neighbor smoothing (KNNS) methods to improve a single model's disease identification performance.
arXiv Detail & Related papers (2021-02-26T02:29:30Z) - On the Importance of Diversity in Re-Sampling for Imbalanced Data and Rare Events in Mortality Risk Models [0.0]
The Surgical Outcome Risk Tool (SORT) is one of the tools developed to predict mortality risk throughout the entire period for major elective in-patient surgeries in the UK.
In this study, we enhance the original SORT prediction model by addressing the class imbalance within the dataset.
Our proposed method investigates the application of diversity-based selection on top of common re-sampling techniques.
arXiv Detail & Related papers (2020-12-15T09:45:35Z)