Large Margin Mechanism and Pseudo Query Set on Cross-Domain Few-Shot
Learning
- URL: http://arxiv.org/abs/2005.09218v2
- Date: Tue, 6 Feb 2024 09:21:28 GMT
- Title: Large Margin Mechanism and Pseudo Query Set on Cross-Domain Few-Shot
Learning
- Authors: Jia-Fong Yeh and Hsin-Ying Lee and Bing-Chen Tsai and Yi-Rong Chen and
Ping-Chia Huang and Winston H. Hsu
- Abstract summary: Cross-domain few-shot learning is a new branch of few-shot learning problems.
We propose a novel large margin fine-tuning method (LMM-PQS) to solve this problem.
Our approach is robust and can easily adapt pre-trained models to new domains with few data.
- Score: 34.81309962770134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, few-shot learning problems have received a lot of attention.
While methods in most previous works were trained and tested on datasets in one
single domain, cross-domain few-shot learning is a brand-new branch of few-shot
learning problems, where models handle datasets in different domains between
training and testing phases. In this paper, to solve the problem that the model
is pre-trained (meta-trained) on a single dataset while fine-tuned on datasets
in four different domains, including common objects, satellite images, and
medical images, we propose a novel large margin fine-tuning method (LMM-PQS),
which generates pseudo query images from support images and fine-tunes the
feature extraction modules with a large margin mechanism inspired by methods in
face recognition. According to the experiment results, LMM-PQS surpasses the
baseline models by a significant margin and demonstrates that our approach is
robust and can easily adapt pre-trained models to new domains with few data.
Related papers
- Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models [3.072340427031969]
Few-shot action recognition (FSAR) aims to learn a model capable of identifying novel actions in videos using only a few examples.
In assuming the base dataset seen during meta-training and novel dataset used for evaluation can come from different domains, cross-domain few-shot learning alleviates data collection and annotation costs.
We systematically evaluate existing state-of-the-art single-domain, transfer-based, and cross-domain FSAR methods on new cross-domain tasks.
arXiv Detail & Related papers (2024-06-03T07:48:18Z) - MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
transferring the pretrained models to downstream tasks may encounter task discrepancy due to their formulation of pretraining as image classification or object discrimination tasks.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z) - FissionFusion: Fast Geometric Generation and Hierarchical Souping for Medical Image Analysis [0.7751705157998379]
The scarcity of well-annotated medical datasets requires leveraging transfer learning from broader datasets like ImageNet or pre-trained models like CLIP.
Model soups averages multiple fine-tuned models aiming to improve performance on In-Domain (ID) tasks and enhance robustness against Out-of-Distribution (OOD) datasets.
We propose a hierarchical merging approach that involves local and global aggregation of models at various levels.
arXiv Detail & Related papers (2024-03-20T06:48:48Z) - Toward a Foundation Model for Time Series Data [34.1973242428317]
A foundation model is a machine learning model trained on a large and diverse set of data.
We develop an effective time series foundation model by leveraging unlabeled samples from multiple domains.
arXiv Detail & Related papers (2023-10-05T21:44:50Z) - Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z) - Exploring Visual Prompts for Whole Slide Image Classification with
Multiple Instance Learning [25.124855361054763]
We present a novel, simple yet effective method for learning domain-specific knowledge transformation from pre-trained models to histopathology images.
Our approach entails using a prompt component to assist the pre-trained model in discerning differences between the pre-trained dataset and the target histopathology dataset.
arXiv Detail & Related papers (2023-03-23T09:23:52Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - CHALLENGER: Training with Attribution Maps [63.736435657236505]
We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
arXiv Detail & Related papers (2022-05-30T13:34:46Z) - Few-shot Weakly-Supervised Object Detection via Directional Statistics [55.97230224399744]
We propose a probabilistic multiple instance learning approach for few-shot Common Object Localization (COL) and few-shot Weakly Supervised Object Detection (WSOD)
Our model simultaneously learns the distribution of the novel objects and localizes them via expectation-maximization steps.
Our experiments show that the proposed method, despite being simple, outperforms strong baselines in few-shot COL and WSOD, as well as large-scale WSOD tasks.
arXiv Detail & Related papers (2021-03-25T22:34:16Z) - A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.