ICLR: In-Context Learning of Representations
- URL: http://arxiv.org/abs/2501.00070v1
- Date: Sun, 29 Dec 2024 18:58:09 GMT
- Title: ICLR: In-Context Learning of Representations
- Authors: Core Francisco Park, Andrew Lee, Ekdeep Singh Lubana, Yongyi Yang, Maya Okawa, Kento Nishi, Martin Wattenberg, Hidenori Tanaka
- Abstract summary: We show that as the amount of context is scaled, there is a sudden re-organization from pretrained semantic representations to in-context representations aligned with the graph structure.
Our findings indicate that scaling context size can flexibly re-organize model representations, possibly unlocking novel capabilities.
- Score: 19.331483579806623
- License:
- Abstract: Recent work has demonstrated that semantics specified by pretraining data influence how representations of different concepts are organized in a large language model (LLM). However, given the open-ended nature of LLMs, e.g., their ability to in-context learn, we can ask whether models alter these pretraining semantics to adopt alternative, context-specified ones. Specifically, if we provide in-context exemplars wherein a concept plays a different role than what the pretraining data suggests, do models reorganize their representations in accordance with these novel semantics? To answer this question, we take inspiration from the theory of conceptual role semantics and define a toy "graph tracing" task wherein the nodes of the graph are referenced via concepts seen during training (e.g., apple, bird, etc.) and the connectivity of the graph is defined via some predefined structure (e.g., a square grid). Given exemplars that indicate traces of random walks on the graph, we analyze intermediate representations of the model and find that as the amount of context is scaled, there is a sudden re-organization from pretrained semantic representations to in-context representations aligned with the graph structure. Further, we find that when reference concepts have correlations in their semantics (e.g., Monday, Tuesday, etc.), the context-specified graph structure is still present in the representations, but is unable to dominate the pretrained structure. To explain these results, we analogize our task to energy minimization for a predefined graph topology, providing evidence towards an implicit optimization process to infer context-specified semantics. Overall, our findings indicate scaling context-size can flexibly re-organize model representations, possibly unlocking novel capabilities.
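To make the setup concrete, below is a minimal sketch, in Python/NumPy, of the "graph tracing" task and of the energy-minimization view described in the abstract. The grid size, concept vocabulary, and walk length are illustrative assumptions rather than the paper's exact configuration, and the spectral-layout step is a standard Dirichlet-energy computation, not the authors' code.

```python
import numpy as np

# Sketch of the "graph tracing" task: nodes of a square grid are referenced
# by unrelated pretrained concepts, and the context is a random-walk trace.
# Grid size, vocabulary, and walk length are illustrative assumptions.

rng = np.random.default_rng(0)
side = 4  # 4x4 square grid, one pretrained concept per node
concepts = ["apple", "bird", "car", "dog", "egg", "fish", "gold", "hat",
            "ice", "jam", "kite", "lion", "moon", "nut", "owl", "pen"]

def neighbors(i: int) -> list[int]:
    """4-connected neighbors of node i on the side x side grid."""
    r, c = divmod(i, side)
    out = []
    if r > 0:        out.append(i - side)
    if r < side - 1: out.append(i + side)
    if c > 0:        out.append(i - 1)
    if c < side - 1: out.append(i + 1)
    return out

def random_walk_context(n_steps: int) -> str:
    """In-context exemplars: concept tokens tracing a random walk on the grid."""
    node = int(rng.integers(side * side))
    trace = [node]
    for _ in range(n_steps):
        node = int(rng.choice(neighbors(node)))
        trace.append(node)
    return " ".join(concepts[i] for i in trace)

print(random_walk_context(20))  # one context string of 21 concept tokens

# Energy-minimization analogy: embeddings X that minimize the Dirichlet energy
# sum_{(i,j) in E} ||x_i - x_j||^2 = tr(X^T L X), under orthogonality
# constraints, are the low-frequency eigenvectors of the graph Laplacian L.
n = side * side
A = np.zeros((n, n))
for i in range(n):
    for j in neighbors(i):
        A[i, j] = 1.0
L = np.diag(A.sum(1)) - A            # graph Laplacian
eigvals, eigvecs = np.linalg.eigh(L)
layout = eigvecs[:, 1:3]             # eigenvectors of the two smallest nonzero eigenvalues
print(layout.round(2))               # node coordinates that recover the grid geometry
```

The last step relies on the standard fact that minimizing tr(X^T L X) under orthogonality constraints yields the low-frequency Laplacian eigenvectors, whose coordinates lay the nodes out according to the graph topology (here, up to a rotation within a degenerate eigenspace); this mirrors the grid-aligned representations the paper reports once enough context is provided.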
Related papers
- Situational Scene Graph for Structured Human-centric Situation Understanding [15.91717913059569]
We propose a graph-based representation called Situational Scene Graph (SSG) to encode both human-object relationships and the corresponding semantic properties.
The semantic details are represented as predefined roles and values inspired by situation frames, which were originally designed to represent a single action.
We will release the code and the dataset soon.
arXiv Detail & Related papers (2024-10-30T09:11:25Z) - PRODIGY: Enabling In-context Learning Over Graphs [112.19056551153454]
In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks.
We develop PRODIGY, the first pretraining framework that enables in-context learning over graphs.
arXiv Detail & Related papers (2023-05-21T23:16:30Z) - Conversational Semantic Parsing using Dynamic Context Graphs [68.72121830563906]
We consider the task of conversational semantic parsing over general-purpose knowledge graphs (KGs) with millions of entities and thousands of relation types.
We focus on models which are capable of interactively mapping user utterances into executable logical forms.
arXiv Detail & Related papers (2023-05-04T16:04:41Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Learning Representations of Entities and Relations [0.0]
This thesis focuses on improving knowledge graph representation with the aim of tackling the link prediction task.
The first contribution is HypER, a convolutional model that simplifies and improves link prediction performance.
The second contribution is TuckER, a relatively straightforward linear model, which, at the time of its introduction, obtained state-of-the-art link prediction performance.
The third contribution is MuRP, the first multi-relational graph representation model embedded in hyperbolic space.
arXiv Detail & Related papers (2022-01-31T09:24:43Z) - Unified Graph Structured Models for Video Understanding [93.72081456202672]
We propose a message passing graph neural network that explicitly models relational and temporal structure.
We show how our method is able to more effectively model relationships between relevant entities in the scene.
arXiv Detail & Related papers (2021-03-29T14:37:35Z) - Learning the Implicit Semantic Representation on Graph-Structured Data [57.670106959061634]
Existing representation learning methods in graph convolutional networks are mainly designed by describing the neighborhood of each node as a perceptual whole.
We propose Semantic Graph Convolutional Networks (SGCN), which explore the implicit semantics by learning latent semantic paths in graphs.
arXiv Detail & Related papers (2021-01-16T16:18:43Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - Structured (De)composable Representations Trained with Neural Networks [21.198279941828112]
A template representation refers to the generic representation that captures the characteristics of an entire class.
The proposed technique uses end-to-end deep learning to learn structured and composable representations from input images and discrete labels.
We prove that the representations have a clear structure that allows decomposing them into factors representing classes and environments.
arXiv Detail & Related papers (2020-07-07T10:20:31Z)