On the Transformation of Latent Space in Fine-Tuned NLP Models
- URL: http://arxiv.org/abs/2210.12696v1
- Date: Sun, 23 Oct 2022 10:59:19 GMT
- Title: On the Transformation of Latent Space in Fine-Tuned NLP Models
- Authors: Nadir Durrani and Hassan Sajjad and Fahim Dalvi and Firoj Alam
- Abstract summary: We study the evolution of latent space in fine-tuned NLP models.
We discover latent concepts in the representational space using hierarchical clustering.
We compare the pre-trained and fine-tuned versions of three models across three downstream tasks.
- Score: 21.364053591693175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the evolution of latent space in fine-tuned NLP models. Different
from the commonly used probing-framework, we opt for an unsupervised method to
analyze representations. More specifically, we discover latent concepts in the
representational space using hierarchical clustering. We then use an alignment
function to gauge the similarity between the latent space of a pre-trained
model and its fine-tuned version. We use traditional linguistic concepts to
facilitate our understanding and also study how the model space transforms
towards task-specific information. We perform a thorough analysis, comparing
the pre-trained and fine-tuned versions of three models across three downstream
tasks. The notable findings of our work are: i) the latent space of the higher
layers evolves towards task-specific concepts, ii) the lower layers retain
generic concepts acquired in the pre-trained model, iii) some concepts in the
higher layers acquire polarity towards the output class, and iv) these concepts
can be used for generating adversarial triggers.
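As an illustration of the pipeline described in the abstract, the following is a minimal sketch of discovering latent concepts by hierarchically clustering token representations and of an overlap-based alignment check between pre-trained and fine-tuned concepts. The random embedding matrices, number of clusters, and the 0.9 overlap threshold are illustrative placeholders, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical stand-ins for per-token contextual embeddings extracted from one
# layer of a pre-trained model and its fine-tuned counterpart (same tokens, same order).
rng = np.random.default_rng(0)
pre_embeddings = rng.normal(size=(1000, 768))
ft_embeddings = pre_embeddings + 0.1 * rng.normal(size=(1000, 768))

# Discover latent concepts as clusters of token representations (hierarchical clustering).
n_concepts = 50
pre_concepts = AgglomerativeClustering(n_clusters=n_concepts).fit_predict(pre_embeddings)
ft_concepts = AgglomerativeClustering(n_clusters=n_concepts).fit_predict(ft_embeddings)

def align_concepts(a, b, threshold=0.9):
    """Match each cluster in `a` to the cluster in `b` with the largest token overlap;
    count a cluster as aligned if the overlap covers at least `threshold` of its tokens."""
    aligned = 0
    for ca in np.unique(a):
        members = np.where(a == ca)[0]
        overlaps = np.bincount(b[members], minlength=b.max() + 1)
        if overlaps.max() / len(members) >= threshold:
            aligned += 1
    return aligned / len(np.unique(a))

print(f"fraction of pre-trained concepts preserved: {align_concepts(pre_concepts, ft_concepts):.2f}")
```

Comparing this alignment score layer by layer is one way to see where the fine-tuned space diverges from the pre-trained one; in practice the embeddings would come from real model activations rather than random matrices.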
Related papers
- Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [65.82564074712836]
We introduce DIFfusionHOI, a new HOI detector shedding light on text-to-image diffusion models.
We first devise an inversion-based strategy to learn the expression of relation patterns between humans and objects in embedding space.
These learned relation embeddings then serve as textual prompts to steer diffusion models toward generating images that depict specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z) - Linking Robustness and Generalization: A k* Distribution Analysis of Concept Clustering in Latent Space for Vision Models [56.89974470863207]
This article uses the k* Distribution, a local neighborhood analysis method, to examine the learned latent space at the level of individual concepts.
We introduce skewness-based true and approximate metrics for interpreting individual concepts to assess the overall quality of vision models' latent space.
arXiv Detail & Related papers (2024-08-17T01:43:51Z) - Universal New Physics Latent Space [0.0]
We develop a machine learning method for mapping data originating from both Standard Model processes and various theories beyond the Standard Model into a unified representation (latent) space.
We apply our method to three examples of new physics at the LHC, of increasing complexity, showing that models can be clustered according to their LHC phenomenology.
arXiv Detail & Related papers (2024-07-29T18:00:00Z) - GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images.
We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z) - On the Emergence of Cross-Task Linearity in the Pretraining-Finetuning Paradigm [47.55215041326702]
We discover an intriguing linear phenomenon in models that are initialized from a common pretrained checkpoint and finetuned on different tasks, termed Cross-Task Linearity (CTL).
We show that if we linearly interpolate the weights of two finetuned models, the features in the weight-interpolated model are often approximately equal to the linear interpolation of features in the two finetuned models at each layer.
We conjecture that in the pretraining-finetuning paradigm, neural networks approximately function as linear maps, mapping from the parameter space to the feature space.
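Informally, the CTL property described above can be written as the following layer-wise relation (a paraphrase of the summary, not the paper's exact statement):

```latex
% Cross-Task Linearity (informal): for finetuned weights \theta_1, \theta_2 sharing a
% pretrained checkpoint, interpolation coefficient \alpha \in [0,1], and layer index \ell,
f^{(\ell)}_{\alpha \theta_1 + (1-\alpha)\theta_2}(x) \;\approx\;
  \alpha\, f^{(\ell)}_{\theta_1}(x) + (1-\alpha)\, f^{(\ell)}_{\theta_2}(x)
```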
arXiv Detail & Related papers (2024-02-06T03:28:36Z) - Uncovering Unique Concept Vectors through Latent Space Decomposition [0.0]
Concept-based explanations have emerged as a superior approach that is more interpretable than feature attribution estimates.
We propose a novel post-hoc unsupervised method that automatically uncovers the concepts learned by deep models during training.
Our experiments reveal that the majority of our concepts are readily understandable to humans, exhibit coherency, and bear relevance to the task at hand.
arXiv Detail & Related papers (2023-07-13T17:21:54Z) - Hierarchical Semantic Tree Concept Whitening for Interpretable Image
Classification [19.306487616731765]
Post-hoc analysis can only discover the patterns or rules that naturally exist in models.
We proactively instill knowledge to alter the representation of human-understandable concepts in hidden layers.
Our method improves model interpretability, showing better disentanglement of semantic concepts, without negatively affecting model classification performance.
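A minimal sketch of the whitening building block that such a concept-whitening module relies on is shown below; the learned rotation that aligns whitened axes with predefined concepts, which the paper adds on top, is omitted here, and the array shapes are arbitrary.

```python
import numpy as np

def zca_whiten(acts, eps=1e-5):
    """ZCA-whiten a batch of hidden activations so their covariance is (near) identity.
    Illustrative building block only; not the paper's full concept-whitening layer."""
    mean = acts.mean(axis=0)
    centered = acts - mean
    cov = centered.T @ centered / (acts.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ w

acts = np.random.default_rng(1).normal(size=(256, 32))
white = zca_whiten(acts)
print(np.allclose(np.cov(white, rowvar=False), np.eye(32), atol=1e-1))  # ~identity covariance
```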
arXiv Detail & Related papers (2023-07-10T04:54:05Z) - On the Compositional Generalization Gap of In-Context Learning [73.09193595292233]
We look at the gap between the in-distribution (ID) and out-of-distribution (OOD) performance of such models in semantic parsing tasks with in-context learning.
We evaluate four model families, OPT, BLOOM, CodeGen and Codex on three semantic parsing datasets.
arXiv Detail & Related papers (2022-11-15T19:56:37Z) - Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features.
The target model aims to mimic the local structure of the source representation space using a contrastive loss.
CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
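The following is a rough sketch of a neighborhood-preserving contrastive loss in this spirit: nearest neighbors in the source feature space act as positives for an InfoNCE-style objective on the target features. The choice of k, the temperature, and the use of cosine similarity are assumptions for illustration, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def neighborhood_alignment_loss(source_feats, target_feats, k=5, temperature=0.1):
    """Pull the target features of a sample's k nearest source-space neighbors together,
    relative to all other samples (illustrative InfoNCE-style objective)."""
    src = F.normalize(source_feats, dim=1)
    tgt = F.normalize(target_feats, dim=1)
    src_sim = src @ src.T                      # cosine similarity in the frozen source space
    tgt_sim = tgt @ tgt.T / temperature        # logits in the trainable target space

    n = src.shape[0]
    eye = torch.eye(n, dtype=torch.bool)
    src_sim = src_sim.masked_fill(eye, float("-inf"))   # exclude self-similarity
    tgt_sim = tgt_sim.masked_fill(eye, float("-inf"))

    pos_idx = src_sim.topk(k, dim=1).indices   # positives: k nearest neighbors in source space
    log_prob = F.log_softmax(tgt_sim, dim=1)
    return -log_prob.gather(1, pos_idx).mean()

# Toy usage: a frozen source space and a trainable lower-dimensional target space.
source = torch.randn(128, 64)
target = torch.randn(128, 32, requires_grad=True)
loss = neighborhood_alignment_loss(source, target)
loss.backward()
```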
arXiv Detail & Related papers (2022-01-06T04:58:31Z) - Tensor-based Subspace Factorization for StyleGAN [1.1470070927586016]
$\tau$GAN is a tensor-based method for modeling the latent space of generative models.
We validate our approach on StyleGAN trained on FFHQ using BU-3DFE as a structured facial expression database.
arXiv Detail & Related papers (2021-11-08T15:11:39Z)