HiCo: Hierarchical Contrastive Learning for Ultrasound Video Model
Pretraining
- URL: http://arxiv.org/abs/2210.04477v1
- Date: Mon, 10 Oct 2022 08:07:17 GMT
- Title: HiCo: Hierarchical Contrastive Learning for Ultrasound Video Model
Pretraining
- Authors: Chunhui Zhang and Yixiong Chen and Li Liu and Qiong Liu and Xi Zhou
- Abstract summary: Self-supervised ultrasound (US) video model pretraining can achieve among the most promising results on US diagnosis using only a small amount of labeled data.
This work proposes a hierarchical contrastive learning (HiCo) method to improve the transferability of US video model pretraining.
- Score: 22.85475242323536
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Self-supervised ultrasound (US) video model pretraining can achieve
among the most promising results on US diagnosis using only a small amount of
labeled data. However, existing approaches do not take full advantage of
multi-level knowledge when learning deep neural networks (DNNs), and thus
struggle to learn transferable feature representations. This work proposes a
hierarchical contrastive learning (HiCo) method to improve the transferability
of US video model pretraining. HiCo introduces both peer-level semantic alignment and
cross-level semantic alignment to facilitate the interaction between different
semantic levels, which effectively accelerates convergence,
leading to better generalization and adaptation of the learned model.
Additionally, a softened objective function is implemented by smoothing the
hard labels, which alleviates the negative effect of local similarities
between images of different classes. Experiments with HiCo on five
datasets demonstrate its favorable results over state-of-the-art approaches.
The source code of this work is publicly available at
\url{https://github.com/983632847/HiCo}.
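The softened objective described above amounts to label smoothing of the classification targets. A minimal sketch of the idea (not the authors' implementation; the function names and the smoothing weight `eps` are illustrative):

```python
import numpy as np

def smooth_labels(hard_labels, num_classes, eps=0.1):
    """Convert hard class indices to softened one-hot targets.

    Each target keeps 1 - eps on the true class and spreads eps uniformly
    over the remaining classes, so locally similar images from different
    classes are penalized less harshly.
    """
    onehot = np.eye(num_classes)[hard_labels]
    return onehot * (1.0 - eps) + (1.0 - onehot) * eps / (num_classes - 1)

def softened_cross_entropy(logits, hard_labels, eps=0.1):
    """Cross-entropy against the smoothed targets (a 'softened' objective)."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    targets = smooth_labels(hard_labels, logits.shape[1], eps)
    return float(-(targets * log_probs).sum(axis=1).mean())

# Usage: two samples, three classes.
logits = np.array([[4.0, 1.0, 0.0], [0.5, 3.0, 0.2]])
labels = np.array([0, 1])
print(softened_cross_entropy(logits, labels, eps=0.1))
```

With `eps=0`, this reduces to ordinary cross-entropy on hard labels.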
Related papers
- Efficient Self-Supervised Video Hashing with Selective State Spaces [63.83300352372051]
Self-supervised video hashing (SSVH) is a practical task in video indexing and retrieval.
We introduce S5VH, a Mamba-based video hashing model with an improved self-supervised learning paradigm.
arXiv Detail & Related papers (2024-12-19T04:33:22Z)
- CGLearn: Consistent Gradient-Based Learning for Out-of-Distribution Generalization [0.7366405857677226]
In this work, we introduce a simple yet powerful approach, CGLearn, which relies on the agreement of gradients across various environments.
Our proposed method demonstrates superior performance compared to state-of-the-art methods in both linear and nonlinear settings.
Comprehensive experiments on both synthetic and real-world datasets highlight its effectiveness in diverse scenarios.
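The gradient-agreement idea can be sketched as masking parameter updates wherever per-environment gradients disagree in sign. This is an illustrative reconstruction under that assumption, not CGLearn's actual implementation:

```python
import numpy as np

def consistent_gradient(env_grads):
    """Average per-environment gradients, zeroing coordinates whose
    signs disagree across environments (a gradient-agreement mask)."""
    grads = np.stack(env_grads)                       # (n_envs, n_params)
    signs = np.sign(grads)
    agree = np.abs(signs.sum(axis=0)) == len(grads)   # same sign in every env
    return grads.mean(axis=0) * agree

# Usage: the second coordinate disagrees across environments and is zeroed.
g = consistent_gradient([np.array([1.0, -1.0, 0.5]),
                         np.array([2.0,  1.0, 0.5])])
```

Only features whose gradients point the same way in all environments contribute to the update, which is one way to favor invariant (out-of-distribution robust) features.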
arXiv Detail & Related papers (2024-11-09T02:36:39Z)
- Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z)
- Semi-Supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix [59.55173022987071]
We study the potential of semi-supervised learning for class-agnostic motion prediction.
Our framework adopts a consistency-based self-training paradigm, enabling the model to learn from unlabeled data.
Our method exhibits performance comparable to weakly supervised and some fully supervised methods.
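Consistency-based self-training of this kind is commonly implemented by turning confident predictions on one view into pseudo-labels for another view. A generic sketch (not this paper's code; the threshold and function names are assumptions):

```python
import numpy as np

def pseudo_label_mask(probs_weak, threshold=0.9):
    """Keep only confident predictions on the weak view as pseudo-labels."""
    conf = probs_weak.max(axis=1)
    labels = probs_weak.argmax(axis=1)
    return labels, conf >= threshold

def consistency_loss(probs_strong, probs_weak, threshold=0.9):
    """Cross-entropy between strong-view predictions and confident
    pseudo-labels from the weak view; unconfident samples are ignored."""
    labels, mask = pseudo_label_mask(probs_weak, threshold)
    if not mask.any():
        return 0.0
    picked = probs_strong[mask, labels[mask]]
    return float(-np.log(np.clip(picked, 1e-12, None)).mean())

# Usage: only the first sample is confident enough to contribute.
loss = consistency_loss(np.array([[0.8, 0.2], [0.5, 0.5]]),
                        np.array([[0.95, 0.05], [0.6, 0.4]]))
```

The model thus learns from unlabeled data by enforcing agreement between differently augmented views of the same input.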
arXiv Detail & Related papers (2023-12-13T09:32:50Z)
- OTMatch: Improving Semi-Supervised Learning with Optimal Transport [2.4355694259330467]
We present a new approach called OTMatch, which leverages semantic relationships among classes by employing an optimal transport loss function to match distributions.
The empirical results show improvements over the baseline, demonstrating the effectiveness of our approach in harnessing semantic relationships to enhance learning performance in a semi-supervised setting.
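An optimal transport loss of this kind is typically computed from an entropy-regularized transport plan obtained via Sinkhorn iterations. A minimal sketch under that assumption (not OTMatch's implementation):

```python
import numpy as np

def sinkhorn(cost, a, b, reg=0.1, n_iters=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    cost[i, j] is the cost of moving mass from bin i of a to bin j of b;
    the returned plan has (approximately) row sums a and column sums b.
    """
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Usage: matching two uniform distributions under a simple cost.
cost = np.array([[0.0, 1.0], [1.0, 0.0]])
a = b = np.array([0.5, 0.5])
plan = sinkhorn(cost, a, b)
```

The transport loss is then the total cost of the plan, `(plan * cost).sum()`, which accounts for how semantically close the mismatched classes are.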
arXiv Detail & Related papers (2023-10-26T15:01:54Z)
- Multi-behavior Self-supervised Learning for Recommendation [36.42241501002167]
We propose a Multi-Behavior Self-Supervised Learning (MBSSL) framework together with an adaptive optimization method.
Specifically, we devise a behavior-aware graph neural network incorporating the self-attention mechanism to capture behavior multiplicity and dependencies.
Experiments on five real-world datasets demonstrate the consistent improvements obtained by MBSSL over ten state-of-the-art (SOTA) baselines.
arXiv Detail & Related papers (2023-05-22T15:57:32Z)
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap [64.60460828425502]
We propose a new guarantee on the downstream performance of contrastive learning.
Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations.
We propose an unsupervised model selection metric ARC that aligns well with downstream accuracy.
arXiv Detail & Related papers (2022-03-25T05:36:26Z)
- Self-Distilled Self-Supervised Representation Learning [35.60243157730165]
State-of-the-art frameworks in self-supervised learning have recently shown that fully utilizing transformer-based models can lead to a performance boost.
In our work, we further exploit this by allowing the intermediate representations to learn from the final layers via the contrastive loss.
Our method, Self-Distilled Self-Supervised Learning (SDSSL), outperforms competitive baselines (SimCLR, BYOL and MoCo v3) using ViT on various tasks and datasets.
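The contrastive loss between intermediate and final representations can be illustrated with a standard InfoNCE objective, where each sample's intermediate feature is pulled toward its own final-layer feature (positives on the diagonal). A generic sketch, not the SDSSL code:

```python
import numpy as np

def info_nce(z_inter, z_final, temperature=0.1):
    """InfoNCE loss aligning intermediate-layer features with the
    final-layer features of the same samples.

    Features are L2-normalized; for each row, the matching final feature
    is the positive and all other samples in the batch are negatives.
    """
    z1 = z_inter / np.linalg.norm(z_inter, axis=1, keepdims=True)
    z2 = z_final / np.linalg.norm(z_final, axis=1, keepdims=True)
    logits = (z1 @ z2.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.diag(log_probs).mean())

# Usage: perfectly aligned features give a lower loss than mismatched ones.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = info_nce(z, z)
shuffled = info_nce(z, z[::-1])
```

In SDSSL this loss would be applied between each intermediate transformer layer and the final layer, distilling the final representation downward.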
arXiv Detail & Related papers (2021-11-25T07:52:36Z)
- Dense Contrastive Visual-Linguistic Pretraining [53.61233531733243]
Several multimodal representation learning approaches have been proposed that jointly represent image and text.
These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining.
We propose unbiased Dense Contrastive Visual-Linguistic Pretraining to replace the region regression and classification with cross-modality region contrastive learning.
arXiv Detail & Related papers (2021-09-24T07:20:13Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
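CCVR's calibration step can be illustrated as fitting a per-class Gaussian to feature statistics and sampling virtual representations from it to retrain the classifier head. A simplified, centralized sketch (the actual method aggregates these statistics across federated clients; names here are illustrative):

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Per-class mean and covariance of feature vectors
    (the components of the approximated Gaussian mixture)."""
    stats = {}
    for c in np.unique(labels):
        x = features[labels == c]
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # regularize
        stats[int(c)] = (x.mean(axis=0), cov)
    return stats

def sample_virtual_representations(stats, n_per_class, seed=None):
    """Draw virtual features from each class Gaussian; the classifier
    head is then calibrated on these instead of the raw (non-IID) data."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for c, (mu, cov) in stats.items():
        xs.append(rng.multivariate_normal(mu, cov, size=n_per_class))
        ys.append(np.full(n_per_class, c))
    return np.vstack(xs), np.concatenate(ys)

# Usage: 40 four-dimensional features from two classes.
rng = np.random.default_rng(1)
feats = rng.normal(size=(40, 4))
labs = np.repeat([0, 1], 20)
vx, vy = sample_virtual_representations(fit_class_gaussians(feats, labs), 5)
```

Because only feature statistics (not raw data) are needed, this kind of calibration fits the privacy constraints of federated learning.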
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.