Selecting Subsets of Source Data for Transfer Learning with Applications
in Metal Additive Manufacturing
- URL: http://arxiv.org/abs/2401.08715v1
- Date: Tue, 16 Jan 2024 00:14:37 GMT
- Authors: Yifan Tang, M. Rahmani Dehaghani, Pouyan Sajadi, G. Gary Wang
- Abstract summary: This paper proposes a systematic method to find appropriate subsets of source data based on similarities between the source and target datasets for a given set of limited target domain data.
The proposed method can find a small subset of source data from the same domain with better TL performance in metal AM regression tasks involving different processes and machines.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Considering data insufficiency in metal additive manufacturing (AM), transfer
learning (TL) has been adopted to extract knowledge from source domains (e.g.,
completed printings) to improve the modeling performance in target domains
(e.g., new printings). Current applications use all accessible source data
directly in TL with no regard to the similarity between source and target data.
This paper proposes a systematic method to find appropriate subsets of source
data based on similarities between the source and target datasets for a given
set of limited target domain data. Such similarity is characterized by the
spatial and model distance metrics. A Pareto frontier-based source data
selection method is developed, where the source data located on the Pareto
frontier defined by two similarity distance metrics are selected iteratively.
The method is integrated into an instance-based TL method (decision tree
regression model) and a model-based TL method (fine-tuned artificial neural
network). Both models are then tested on several regression tasks in metal AM.
Comparison results demonstrate that 1) the source data selection method is
general and supports integration with various TL methods and distance metrics,
2) compared with using all source data, the proposed method can find a small
subset of source data from the same domain with better TL performance in metal
AM regression tasks involving different processes and machines, and 3) when
multiple source domains exist, the source data selection method could find the
subset from one source domain to obtain comparable or better TL performance
than the model constructed using data from all source domains.
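The iterative Pareto-frontier selection described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the spatial metric (Euclidean distance to the nearest target sample) and the model metric (residual of a linear model fitted on the target data) are plausible stand-ins for the two similarity distance metrics, and the budget-based stopping rule is an assumption.

```python
import numpy as np

def pareto_mask(costs):
    """Boolean mask of non-dominated rows, minimizing every column."""
    n = len(costs)
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        if not mask[i]:
            continue
        # rows dominated by row i: >= on all metrics and > on at least one
        dominated = np.all(costs >= costs[i], axis=1) & np.any(costs > costs[i], axis=1)
        mask &= ~dominated
    return mask

def select_source_subset(Xs, ys, Xt, yt, k):
    """Iteratively peel Pareto frontiers of (spatial, model) distances
    until at least k source samples are selected; return their indices."""
    # spatial distance: Euclidean distance to the nearest target sample
    d_spatial = np.min(np.linalg.norm(Xs[:, None, :] - Xt[None, :, :], axis=2), axis=1)
    # model distance: absolute residual of a linear model fitted on the target data
    A_t = np.hstack([Xt, np.ones((len(Xt), 1))])
    w, *_ = np.linalg.lstsq(A_t, yt, rcond=None)
    d_model = np.abs(np.hstack([Xs, np.ones((len(Xs), 1))]) @ w - ys)
    costs = np.stack([d_spatial, d_model], axis=1)

    remaining = np.arange(len(Xs))
    selected = []
    while len(selected) < k and len(remaining) > 0:
        front = pareto_mask(costs[remaining])  # current non-dominated layer
        selected.extend(remaining[front].tolist())
        remaining = remaining[~front]
    return np.array(selected[:k])

# toy demo: source data with a distribution shift relative to the target
rng = np.random.default_rng(0)
Xs = rng.normal(size=(60, 2))
ys = Xs @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=60)
Xt = rng.normal(size=(8, 2)) + 0.5
yt = Xt @ np.array([1.0, -2.0])
subset = select_source_subset(Xs, ys, Xt, yt, k=20)
print(len(subset))  # 20
```

Peeling frontiers one layer at a time, rather than ranking by a single weighted sum, avoids committing to a fixed trade-off between the two distance metrics.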
Related papers
- CONTRAST: Continual Multi-source Adaptation to Dynamic Distributions [42.293444710522294]
Continual Multi-source Adaptation to Dynamic Distributions (CONTRAST) is a novel method that optimally combines multiple source models to adapt to the dynamic test data.
We show that the proposed method is able to optimally combine the source models and prioritize updates to the model least prone to forgetting.
arXiv Detail & Related papers (2024-01-04T22:23:56Z)
- Multi-Source Soft Pseudo-Label Learning with Domain Similarity-based Weighting for Semantic Segmentation [2.127049691404299]
This paper describes a method of domain adaptive training for semantic segmentation using multiple source datasets.
We propose a soft pseudo-label generation method by integrating predicted object probabilities from multiple source models.
arXiv Detail & Related papers (2023-03-02T05:20:36Z)
- A Prototype-Oriented Clustering for Domain Shift with Source Privacy [66.67700676888629]
We introduce Prototype-oriented Clustering with Distillation (PCD) to improve the performance and applicability of existing methods.
PCD first constructs a source clustering model by aligning the distributions of prototypes and data.
It then distills the knowledge to the target model through cluster labels provided by the source model while simultaneously clustering the target data.
arXiv Detail & Related papers (2023-02-08T00:15:35Z)
- Model-based Transfer Learning for Automatic Optical Inspection based on domain discrepancy [9.039797705929363]
This research applies model-based TL via domain similarity to improve the overall performance and data augmentation in both target and source domains.
Our research suggests increases in the F1 score and the PR curve up to 20% compared with TL using benchmark datasets.
arXiv Detail & Related papers (2023-01-14T11:32:39Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Source-Free Domain Adaptation via Distribution Estimation [106.48277721860036]
Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distributions are different.
Recently, Source-Free Domain Adaptation (SFDA) has drawn much attention, which tries to tackle domain adaptation problem without using source data.
In this work, we propose a novel framework called SFDA-DE to address SFDA task via source Distribution Estimation.
arXiv Detail & Related papers (2022-04-24T12:22:19Z)
- Multi-Source Domain Adaptation for Object Detection [52.87890831055648]
We propose a unified Faster R-CNN based framework, termed Divide-and-Merge Spindle Network (DMSN).
DMSN can simultaneously enhance domain invariance and preserve discriminative power.
We develop a novel pseudo learning algorithm to approximate the optimal parameters of the pseudo target subset.
arXiv Detail & Related papers (2021-06-30T03:17:20Z)
- Unsupervised Multi-source Domain Adaptation Without Access to Source Data [58.551861130011886]
Unsupervised Domain Adaptation (UDA) aims to learn a predictor model for an unlabeled domain by transferring knowledge from a separate labeled source domain.
We propose a novel and efficient algorithm which automatically combines the source models with suitable weights in such a way that it performs at least as good as the best source model.
arXiv Detail & Related papers (2021-04-05T10:45:12Z)
- Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation [102.67010690592011]
Unsupervised Domain Adaptation (UDA) aims to leverage the knowledge learned from a labeled source dataset to solve similar tasks in a new unlabeled domain.
Prior UDA methods typically require access to the source data when learning to adapt the model.
This work tackles a practical setting where only a trained source model is available and how we can effectively utilize such a model without source data to solve UDA problems.
arXiv Detail & Related papers (2020-02-20T03:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.