Related papers: BayesPrompt: Prompting Large-Scale Pre-Trained Language Models on Few-shot Inference via Debiased Domain Abstraction

BayesPrompt: Prompting Large-Scale Pre-Trained Language Models on Few-shot Inference via Debiased Domain Abstraction

URL: http://arxiv.org/abs/2401.14166v3
Date: Wed, 20 Mar 2024 08:52:42 GMT
Title: BayesPrompt: Prompting Large-Scale Pre-Trained Language Models on Few-shot Inference via Debiased Domain Abstraction
Authors: Jiangmeng Li, Fei Song, Yifan Jin, Wenwen Qiang, Changwen Zheng, Fuchun Sun, Hui Xiong,
Abstract summary: We show that prompt-tuning methods fail to generalize to specific few-shot patterns. We propose BayesPrompt to learn prompts that contain the domain discriminative information. Our method achieves state-of-the-art performance on benchmarks.
Score: 40.226752807366864
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As a novel and effective fine-tuning paradigm based on large-scale pre-trained language models (PLMs), prompt-tuning aims to reduce the gap between downstream tasks and pre-training objectives. While prompt-tuning has yielded continuous advancements in various tasks, such an approach still remains a persistent defect: prompt-tuning methods fail to generalize to specific few-shot patterns. From the perspective of distribution analyses, we disclose that the intrinsic issues behind the phenomenon are the over-multitudinous conceptual knowledge contained in PLMs and the abridged knowledge for target downstream domains, which jointly result in that PLMs mis-locate the knowledge distributions corresponding to the target domains in the universal knowledge embedding space. To this end, we intuitively explore to approximate the unabridged target domains of downstream tasks in a debiased manner, and then abstract such domains to generate discriminative prompts, thereby providing the de-ambiguous guidance for PLMs. Guided by such an intuition, we propose a simple yet effective approach, namely BayesPrompt, to learn prompts that contain the domain discriminative information against the interference from domain-irrelevant knowledge. BayesPrompt primitively leverages known distributions to approximate the debiased factual distributions of target domains and further uniformly samples certain representative features from the approximated distributions to generate the ultimate prompts for PLMs. We provide theoretical insights with the connection to domain adaptation. Empirically, our method achieves state-of-the-art performance on benchmarks.

Related papers

Uncover and Unlearn Nuisances: Agnostic Fully Test-Time Adaptation [46.46865123564661]
Fully Test-Time Adaptation (FTTA) addresses domain shifts without access to source data and training protocols of the pre-trained models.<n>Traditional strategies that align source and target feature distributions are infeasible in FTTA due to the absence of training data and unpredictable target domains.<n>We propose Agnostic FTTA as a novel formulation that enables the usage of off-the-shelf domain transformations during test-time to enable direct generalization to unforeseeable target data.
arXiv Detail & Related papers (2025-11-16T08:01:50Z)
Beyond Benchmarks: A Novel Framework for Domain-Specific LLM Evaluation and Knowledge Mapping [0.7555681642774916]
The paper addresses two critical challenges in language model (LM) evaluation: creating reliable domain-specific benchmarks and understanding knowledge representation during domain adaptation.<n>We introduce a deterministic pipeline that converts raw domain corpora into completion-type benchmarks without relying on LMs or human curation.<n>Our approach generates domain-specific keywords and related word lists using TF and Term TF-IDF methods and constructs prompt-target pairs.<n>We evaluate models by measuring their ability to complete these prompts with the correct domain-specific targets, providing a direct assessment of domain knowledge with low computational cost.
arXiv Detail & Related papers (2025-06-09T11:30:12Z)
Diffusion Features to Bridge Domain Gap for Semantic Segmentation [2.8616666231199424]
This paper investigates the approach that leverages the sampling and fusion techniques to harness the features of diffusion models efficiently. By leveraging the strength of text-to-image generation capability, we introduce a new training framework designed to implicitly learn posterior knowledge from it.
arXiv Detail & Related papers (2024-06-02T15:33:46Z)
Learning Transferable Conceptual Prototypes for Interpretable Unsupervised Domain Adaptation [79.22678026708134]
In this paper, we propose an inherently interpretable method, named Transferable Prototype Learning ( TCPL) To achieve this goal, we design a hierarchically prototypical module that transfers categorical basic concepts from the source domain to the target domain and learns domain-shared prototypes for explaining the underlying reasoning process. Comprehensive experiments show that the proposed method can not only provide effective and intuitive explanations but also outperform previous state-of-the-arts.
arXiv Detail & Related papers (2023-10-12T06:36:41Z)
Domain Adaptive Few-Shot Open-Set Learning [36.39622440120531]
We propose Domain Adaptive Few-Shot Open Set Recognition (DA-FSOS) and introduce a meta-learning-based architecture named DAFOSNET. Our training approach ensures that DAFOS-NET can generalize well to new scenarios in the target domain. We present three benchmarks for DA-FSOS based on the Office-Home, mini-ImageNet/CUB, and DomainNet datasets.
arXiv Detail & Related papers (2023-09-22T12:04:47Z)
MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning [12.737883740101438]
We present a novel DIL approach based on a mixture of prompt-tuned CLIP models (MoP-CLIP) At the training stage we model the features distribution of every class in each domain, learning individual text and visual prompts to adapt to a given domain. At inference, the learned distributions allow us to identify whether a given test sample belongs to a known domain, selecting the correct prompt for the classification task, or from an unseen domain.
arXiv Detail & Related papers (2023-07-11T18:17:50Z)
Prompting Diffusion Representations for Cross-Domain Semantic Segmentation [101.04326113360342]
diffusion-pretraining achieves extraordinary domain generalization results for semantic segmentation. We introduce a scene prompt and a prompt randomization strategy to help further disentangle the domain-invariant information when training the segmentation head.
arXiv Detail & Related papers (2023-07-05T09:28:25Z)
Domain-Specific Risk Minimization for Out-of-Distribution Generalization [104.17683265084757]
We first establish a generalization bound that explicitly considers the adaptivity gap. We propose effective gap estimation methods for guiding the selection of a better hypothesis for the target. The other method is minimizing the gap directly by adapting model parameters using online target samples.
arXiv Detail & Related papers (2022-08-18T06:42:49Z)
Coarse to Fine: Domain Adaptive Crowd Counting via Adversarial Scoring Network [58.05473757538834]
This paper proposes a novel adversarial scoring network (ASNet) to bridge the gap across domains from coarse to fine granularity. Three sets of migration experiments show that the proposed methods achieve state-of-the-art counting performance.
arXiv Detail & Related papers (2021-07-27T14:47:24Z)
Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA) We introduce the LIRR algorithm for jointly textbfLearning textbfInvariant textbfRepresentations and textbfRisks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)

This list is automatically generated from the titles and abstracts of the papers in this site.