Relational Knowledge Distillation Using Fine-tuned Function Vectors
- URL: http://arxiv.org/abs/2601.08169v1
- Date: Tue, 13 Jan 2026 03:02:18 GMT
- Title: Relational Knowledge Distillation Using Fine-tuned Function Vectors
- Authors: Andrea Kang, Yingnian Wu, Hongjing Lu
- Abstract summary: Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of attention heads encodes task representations in in-context learning. We show that fine-tuning function vectors with only a small set of examples yields better performance on relation-based word-completion tasks.
- Score: 36.277498272417965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Representing relations between concepts is a core prerequisite for intelligent systems to make sense of the world. Recent work using causal mediation analysis has shown that a small set of attention heads encodes task representations in in-context learning, captured in a compact representation known as the function vector. We show that fine-tuning function vectors with only a small set of examples (about 20 word pairs) yields better performance on relation-based word-completion tasks than using the original vectors derived from causal mediation analysis. These improvements hold for both small and large language models. Moreover, the fine-tuned function vectors yield improved decoding performance for relation words and show stronger alignment with human similarity judgments of semantic relations. Next, we introduce the composite function vector - a weighted combination of fine-tuned function vectors - to extract relational knowledge and support analogical reasoning. At inference time, inserting this composite vector into LLM activations markedly enhances performance on challenging analogy problems drawn from cognitive science and SAT benchmarks. Our results highlight the potential of activation patching as a controllable mechanism for encoding and manipulating relational knowledge, advancing both the interpretability and reasoning capabilities of large language models.
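The activation-patching step described above is easy to sketch. Below is a minimal, illustrative example of inserting a function vector into a causal LM's residual stream at inference time; the GPT-2 model, the layer index, and the random placeholder vector are assumptions for illustration, not the paper's actual settings.

```python
# Minimal activation-patching sketch: add a function vector `fv` to the
# hidden state at one layer during generation. Assumes a Hugging Face
# GPT-2 model; layer index and fv are placeholders, not the paper's values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in model; the paper's models may differ
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

layer_idx = 6                          # hypothetical insertion layer
fv = torch.randn(model.config.n_embd)  # placeholder function vector

def add_function_vector(module, inputs, output):
    # GPT-2 blocks return a tuple; output[0] is the hidden state of
    # shape (batch, seq_len, hidden). Add fv at the last position.
    hidden = output[0]
    hidden[:, -1, :] += fv
    return (hidden,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(add_function_vector)

prompt = "hot : cold :: tall :"
ids = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**ids, max_new_tokens=1)
print(tok.decode(out[0]))
handle.remove()  # restore the unpatched model
```

Removing the hook restores ordinary behavior, which is what makes the mechanism controllable: the same model can be steered toward different relations by swapping in different vectors.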
Related papers
- Multimodal Function Vectors for Spatial Relations [33.20813174218433]
We show that a small subset of attention heads in the vision-language model OpenFlamingo-4B is responsible for transmitting representations of spatial relations. The activations of these attention heads, termed function vectors, can be extracted and manipulated to alter the model's performance on relational tasks.
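For reference, the usual function-vector recipe averages each selected head's output at the final token over many in-context prompts and sums the per-head means. The toy sketch below uses random placeholder activations standing in for values captured from a real model (e.g., via forward hooks); the head indices are hypothetical.

```python
# Hedged sketch of function-vector extraction: average each selected
# attention head's last-token output over prompts, then sum the means.
# Activations are random placeholders, not OpenFlamingo-4B's.
import torch

n_prompts, n_layers, n_heads, head_dim = 100, 24, 16, 64
# head_out[p, l, h] = head (l, h)'s output at the last token of prompt p
head_out = torch.randn(n_prompts, n_layers, n_heads, head_dim)

selected_heads = [(8, 1), (9, 14), (12, 3)]  # hypothetical causal heads
mean_out = head_out.mean(dim=0)              # average over prompts
# Sum the selected heads' mean outputs; in a real pipeline each head
# output is first projected back to the residual stream via W_O.
fv = sum(mean_out[l, h] for l, h in selected_heads)
print(fv.shape)  # torch.Size([64]) in this toy setting
```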
arXiv Detail & Related papers (2025-10-02T19:55:56Z)
- RESOLVE: Relational Reasoning with Symbolic and Object-Level Features Using Vector Symbolic Processing [1.3049516752695616]
We propose RESOLVE, a neuro-vector symbolic architecture that combines object-level features with relational representations in high-dimensional spaces.
By leveraging this design, the model achieves both low compute latency and memory efficiency.
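To make "vector symbolic" concrete, here is a toy illustration of role-filler binding in high-dimensional space, in the spirit of such architectures (holographic reduced representations via circular convolution); it is not RESOLVE's actual code, and all vectors are random stand-ins.

```python
# Toy vector-symbolic binding: bind fillers to roles by circular
# convolution, superpose by addition, recover a filler by unbinding.
import numpy as np

rng = np.random.default_rng(0)
d = 2048
def vec():  # random unit vector
    v = rng.standard_normal(d)
    return v / np.linalg.norm(v)

def bind(a, b):   # circular convolution via FFT
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=d)

def unbind(s, a):  # convolve with the approximate inverse of a
    a_inv = np.roll(a[::-1], 1)
    return bind(s, a_inv)

subj, obj = vec(), vec()
cup, table = vec(), vec()
# Encode "cup above table": bind fillers to roles, then superpose.
scene = bind(subj, cup) + bind(obj, table)
# Query the subject role; the result is close to `cup`, far from `table`.
est = unbind(scene, subj)
print(float(est @ cup), float(est @ table))  # high vs. near-zero similarity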
arXiv Detail & Related papers (2024-11-13T02:17:03Z)
- Sparse Relational Reasoning with Object-Centric Representations [78.83747601814669]
We investigate the composability of soft rules learned by relational neural architectures when operating over object-centric representations.
We find that increasing sparsity, especially on features, improves the performance of some models and leads to simpler relations.
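A minimal sketch of the kind of feature-level sparsity studied here: keep only the top-k activations per object feature vector and zero the rest. The tensor shapes and k below are illustrative, not the paper's configuration.

```python
# Top-k feature sparsification: zero all but the k largest-magnitude
# entries along the last dimension.
import torch

def sparsify_topk(features: torch.Tensor, k: int) -> torch.Tensor:
    topk = features.abs().topk(k, dim=-1)
    mask = torch.zeros_like(features)
    mask.scatter_(-1, topk.indices, 1.0)
    return features * mask

objects = torch.randn(4, 32)      # 4 objects, 32-dim features (toy)
sparse = sparsify_topk(objects, k=8)
print((sparse != 0).sum(dim=-1))  # tensor([8, 8, 8, 8])
```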
arXiv Detail & Related papers (2022-07-15T14:57:33Z)
- Prototypical Representation Learning for Relation Extraction [56.501332067073065]
This paper aims to learn predictive, interpretable, and robust relation representations from distantly-labeled data.
We learn prototypes for each relation from contextual information to capture the intrinsic semantics of relations.
Results on several relation learning tasks show that our model significantly outperforms the previous state-of-the-art relational models.
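A generic prototype-based classification sketch (the paper's model is more elaborate): each relation's prototype is the mean of its instance embeddings, and a new instance is assigned to the nearest prototype. Embeddings and labels below are toy placeholders.

```python
# Nearest-prototype relation classification on toy embeddings.
import torch
import torch.nn.functional as F

def prototypes(embeds: torch.Tensor, labels: torch.Tensor, n_rel: int):
    """Mean embedding per relation label."""
    return torch.stack([embeds[labels == r].mean(dim=0) for r in range(n_rel)])

def classify(query: torch.Tensor, protos: torch.Tensor) -> torch.Tensor:
    """Nearest prototype by cosine similarity."""
    sims = F.cosine_similarity(query.unsqueeze(1), protos.unsqueeze(0), dim=-1)
    return sims.argmax(dim=-1)

embeds = torch.randn(60, 128)                 # toy instance embeddings
labels = torch.randint(0, 3, (60,))           # 3 toy relation classes
protos = prototypes(embeds, labels, n_rel=3)
print(classify(torch.randn(5, 128), protos))  # predicted relation ids
```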
arXiv Detail & Related papers (2021-03-22T08:11:43Z)
- RatE: Relation-Adaptive Translating Embedding for Knowledge Graph Completion [51.64061146389754]
We propose a relation-adaptive translation function built upon a novel weighted product in complex space.
We then present our Relation-adaptive translating Embedding (RatE) approach to score each graph triple.
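For orientation, here is a generic complex-space translational scoring sketch of the kind RatE builds on; it uses a plain weighted Hadamard product and is not RatE's exact function. All embeddings and weights are random toy values.

```python
# Generic complex-space triple scoring: translate the head by a
# relation-specific weighted complex product, score by distance to tail.
import torch

d = 64
h = torch.randn(d, dtype=torch.cfloat)  # head entity embedding
t = torch.randn(d, dtype=torch.cfloat)  # tail entity embedding
r = torch.randn(d, dtype=torch.cfloat)  # relation embedding
w = torch.randn(d, dtype=torch.cfloat)  # relation-adaptive weights (toy)

def score(h, r, t, w):
    """Negative distance between the translated head and the tail."""
    translated = w * h * r  # weighted complex elementwise product
    return -torch.linalg.vector_norm(translated - t)

print(score(h, r, t, w))  # higher (less negative) = more plausible triple
```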
arXiv Detail & Related papers (2020-10-10T01:30:30Z)
- Multidirectional Associative Optimization of Function-Specific Word Representations [86.87082468226387]
We present a neural framework for learning associations between interrelated groups of words.
Our model induces a joint function-specific word vector space in which, for example, vectors of plausible subject-verb-object (SVO) compositions lie close together.
The model retains information about word-group membership even in the joint space, and can therefore be applied effectively to a number of tasks that reason over the SVO structure.
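A toy sketch of how such a joint space can be queried: compose subject and verb vectors and rank candidate objects by proximity. The vectors here are random stand-ins and the additive composition is a naive placeholder; the paper learns role-specific spaces rather than using raw addition.

```python
# Rank candidate objects by cosine similarity to a composed SVO query.
import numpy as np

rng = np.random.default_rng(1)
vocab = ["pasta", "justice", "guitar", "theorem"]
obj_vecs = {w: rng.standard_normal(100) for w in vocab}
subj_vec = rng.standard_normal(100)  # e.g., "chef" (toy)
verb_vec = rng.standard_normal(100)  # e.g., "cooks" (toy)

query = subj_vec + verb_vec          # naive additive composition
def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

ranked = sorted(vocab, key=lambda w: -cos(query, obj_vecs[w]))
print(ranked)  # most plausible object first (meaningless on random vectors)
```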
arXiv Detail & Related papers (2020-05-11T17:07:20Z)
- Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction [80.38130122127882]
We introduce 14 probing tasks targeting linguistic properties relevant to neural relation extraction (RE).
We use them to study representations learned by more than 40 combinations of encoder architectures and linguistic features, trained on two datasets.
We find that the biases induced by the architecture and by the inclusion of linguistic features are clearly reflected in probing-task performance.
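The standard probing recipe is straightforward: freeze the encoder, train a simple classifier on its representations, and read off probe accuracy as evidence for the linguistic property. The sketch below uses random placeholder representations in place of real encoder outputs.

```python
# Linear probing sketch: logistic-regression probe on frozen
# sentence representations (random placeholders here).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
reps = rng.standard_normal((500, 256))  # frozen sentence representations
labels = rng.integers(0, 2, 500)        # toy binary linguistic property

X_tr, X_te, y_tr, y_te = train_test_split(reps, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))  # ~chance on random data
```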
arXiv Detail & Related papers (2020-04-17T09:17:40Z)