Cabbage Sweeter than Cake? Analysing the Potential of Large Language Models for Learning Conceptual Spaces
- URL: http://arxiv.org/abs/2310.05481v1
- Date: Mon, 9 Oct 2023 07:41:19 GMT
- Title: Cabbage Sweeter than Cake? Analysing the Potential of Large Language Models for Learning Conceptual Spaces
- Authors: Usashi Chatterjee, Amit Gajbhiye, Steven Schockaert
- Abstract summary: We explore the potential of Large Language Models to learn conceptual spaces.
Our experiments show that LLMs can indeed be used for learning meaningful representations.
We also find that fine-tuned models of the BERT family are able to match or even outperform the largest GPT-3 model.
- Score: 18.312837741635207
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The theory of Conceptual Spaces is an influential cognitive-linguistic
framework for representing the meaning of concepts. Conceptual spaces are
constructed from a set of quality dimensions, which essentially correspond to
primitive perceptual features (e.g. hue or size). These quality dimensions are
usually learned from human judgements, which means that applications of
conceptual spaces tend to be limited to narrow domains (e.g. modelling colour
or taste). Encouraged by recent findings about the ability of Large Language
Models (LLMs) to learn perceptually grounded representations, we explore the
potential of such models for learning conceptual spaces. Our experiments show
that LLMs can indeed be used for learning meaningful representations to some
extent. However, we also find that fine-tuned models of the BERT family are
able to match or even outperform the largest GPT-3 model, despite being 2 to 3
orders of magnitude smaller.
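As a rough illustration of the kind of probing the abstract describes, the sketch below scores a handful of food items along the sweetness quality dimension with a BERT-family masked language model and compares the resulting ranking to hypothetical human ratings via Spearman correlation. The prompt template, the item list, the ratings and the choice of bert-base-uncased are illustrative assumptions, not the paper's exact setup.
```python
# Minimal sketch: rank entities along a perceptual quality dimension (sweetness)
# with a masked language model and compare the ranking to human judgements.
# Template, items, human scores and model choice are illustrative assumptions.
import torch
from scipy.stats import spearmanr
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

items = ["cake", "honey", "banana", "carrot", "cabbage", "lemon"]
human_sweetness = [9.0, 9.5, 7.0, 4.0, 2.0, 1.5]  # hypothetical human ratings

def sweetness_score(item: str) -> float:
    """Log-probability of 'sweet' at the masked slot of a simple template."""
    text = f"the taste of {item} is very {tokenizer.mask_token}."
    inputs = tokenizer(text, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    log_probs = torch.log_softmax(logits, dim=-1)
    return log_probs[tokenizer.convert_tokens_to_ids("sweet")].item()

model_scores = [sweetness_score(item) for item in items]
rho, _ = spearmanr(model_scores, human_sweetness)
print(f"Spearman correlation with human ratings: {rho:.2f}")
```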
Related papers
- Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding [58.38294408121273]
We propose Cross-modal and Uncertainty-aware Agglomeration for Open-vocabulary 3D Scene Understanding, dubbed CUA-O3D.
Our method addresses two key challenges: (1) incorporating semantic priors from VLMs alongside the geometric knowledge of spatially-aware vision foundation models, and (2) using a novel deterministic uncertainty estimation to capture model-specific uncertainties.
arXiv Detail & Related papers (2025-03-20T20:58:48Z)
- I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data? [79.01538178959726]
Large language models (LLMs) have led many to conclude that they exhibit a form of intelligence.
We introduce a novel generative model that generates tokens on the basis of human interpretable concepts represented as latent discrete variables.
arXiv Detail & Related papers (2025-03-12T01:21:17Z)
- Large Concept Models: Language Modeling in a Sentence Representation Space [62.73366944266477]
We present an attempt at an architecture which operates on an explicit higher-level semantic representation, which we name a concept.
Concepts are language- and modality-agnostic and represent a higher level idea or action in a flow.
We show that our model exhibits impressive zero-shot generalization performance to many languages.
arXiv Detail & Related papers (2024-12-11T23:36:20Z)
- Does Spatial Cognition Emerge in Frontier Models? [56.47912101304053]
We present SPACE, a benchmark that systematically evaluates spatial cognition in frontier models.
Results suggest that contemporary frontier models fall short of the spatial intelligence of animals.
arXiv Detail & Related papers (2024-10-09T01:41:49Z)
- Explaining Explainability: Recommendations for Effective Use of Concept Activation Vectors [35.37586279472797]
Concept Activation Vectors (CAVs) are learnt using a probe dataset of concept exemplars.
We investigate three properties of CAVs: (1) inconsistency across layers, (2) entanglement with other concepts, and (3) spatial dependency.
We introduce tools designed to detect the presence of these properties, provide insight into how each property can lead to misleading explanations, and provide recommendations to mitigate their impact.
arXiv Detail & Related papers (2024-04-04T17:46:20Z)
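As a rough sketch of how the CAVs above are typically obtained: a linear classifier is fitted to separate activations of concept exemplars from activations of random counterexamples, and the normalised normal of its decision boundary serves as the concept direction. The synthetic activations and layer size below are placeholders for real hidden states.
```python
# Minimal sketch of learning a Concept Activation Vector from a probe dataset.
# Random activations stand in for hidden states of one layer of a real model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 512                                         # hidden size of the probed layer
concept_acts = rng.normal(0.5, 1.0, (100, d))   # activations of concept exemplars
random_acts = rng.normal(0.0, 1.0, (100, d))    # activations of random inputs

X = np.vstack([concept_acts, random_acts])
y = np.array([1] * len(concept_acts) + [0] * len(random_acts))

probe = LogisticRegression(max_iter=1000).fit(X, y)
cav = probe.coef_[0] / np.linalg.norm(probe.coef_[0])  # the concept direction

# How strongly one activation aligns with the concept direction.
print(f"alignment with concept direction: {concept_acts[0] @ cav:.2f}")
```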
- Ranking Entities along Conceptual Space Dimensions with LLMs: An Analysis of Fine-Tuning Strategies [16.056028563680584]
We focus in particular on the task of ranking entities according to a given conceptual space dimension.
We analyse whether the ranking capabilities of the resulting models transfer to perceptual and subjective features.
arXiv Detail & Related papers (2024-02-23T14:17:01Z)
- Can Large Language Models Follow Concept Annotation Guidelines? A Case Study on Scientific and Financial Domains [19.814974042343028]
We examine the capacity of instruction-tuned language models to follow in-context concept guidelines for sentence labeling tasks.
Our results show that although concept definitions consistently help task performance, only the larger models have a limited ability to work under counterfactual contexts.
arXiv Detail & Related papers (2023-11-15T05:11:26Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
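The concept-bottleneck idea the entry above refers to can be illustrated with a small head on top of a PLM sentence embedding: the embedding is first mapped to a handful of human-interpretable concept scores, and the label is predicted from those scores alone, so the intermediate concepts can be inspected. The concept names, sizes and wiring below are illustrative assumptions, not the architecture proposed in the paper.
```python
# Minimal sketch of a concept-bottleneck classification head on top of a
# PLM sentence embedding. All names and sizes are hypothetical.
import torch
import torch.nn as nn

CONCEPTS = ["sentiment", "formality", "topic_is_finance"]  # hypothetical concepts

class ConceptBottleneckHead(nn.Module):
    def __init__(self, hidden_size: int = 768, num_labels: int = 2):
        super().__init__()
        self.to_concepts = nn.Linear(hidden_size, len(CONCEPTS))  # embedding -> concepts
        self.to_label = nn.Linear(len(CONCEPTS), num_labels)      # concepts -> label

    def forward(self, sentence_embedding: torch.Tensor):
        concept_scores = torch.sigmoid(self.to_concepts(sentence_embedding))
        logits = self.to_label(concept_scores)
        return concept_scores, logits  # concept scores remain inspectable

head = ConceptBottleneckHead()
embedding = torch.randn(1, 768)  # stand-in for a PLM [CLS] embedding
concepts, logits = head(embedding)
print(dict(zip(CONCEPTS, concepts[0].tolist())), logits.argmax(-1).item())
```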
- Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Sentence Embeddings [0.7416846035207727]
We take a look at the topological structure of neuronal activity in the "brain" of ChatGPT's foundation language model, and analyze it with respect to a metric representing the notion of fairness.
We first compute a fairness metric, inspired by social literature, to identify factors that typically influence fairness assessments in humans, such as legitimacy, need, and responsibility.
Our results show that sentence embeddings based on GPT-3.5 can be decomposed into two submanifolds corresponding to fair and unfair moral judgments.
arXiv Detail & Related papers (2023-09-17T23:38:39Z)
- A Geometric Notion of Causal Probing [85.49839090913515]
The linear subspace hypothesis states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace.
We give a set of intrinsic criteria which characterize an ideal linear concept subspace.
We find that, for at least one concept across two language models, the concept subspace can be used to manipulate the concept value of the generated word with precision.
arXiv Detail & Related papers (2023-07-27T17:57:57Z)
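A minimal sketch of the linear subspace idea described above, assuming a binary concept and a simple mean-difference estimate of the concept direction (the paper's own construction is more principled): once a direction is estimated, a representation can be projected onto its orthogonal complement to erase the concept, or reflected across it to flip the concept value.
```python
# Minimal sketch: estimate a one-dimensional concept subspace for a binary
# concept (e.g. verbal number) and use it to erase or flip the concept value.
# The toy data and mean-difference estimate are simplifying assumptions.
import numpy as np

rng = np.random.default_rng(0)
d = 256
direction_true = rng.normal(size=d)

# Toy hidden states whose position along `direction_true` encodes the concept.
h_singular = rng.normal(size=(200, d)) + 2.0 * direction_true
h_plural = rng.normal(size=(200, d)) - 2.0 * direction_true

# Estimate the concept direction as the normalised difference of class means.
u = h_singular.mean(axis=0) - h_plural.mean(axis=0)
u /= np.linalg.norm(u)

h = h_singular[0]
h_erased = h - (h @ u) * u           # project onto the orthogonal complement
h_flipped = h - 2.0 * (h @ u) * u    # reflect: push towards the other value
print("component before/after erasure:", h @ u, h_erased @ u)
```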
- Concept2Box: Joint Geometric Embeddings for Learning Two-View Knowledge Graphs [77.10299848546717]
Concept2Box is a novel approach that jointly embeds the two views of a KG.
Box embeddings capture the hierarchical structure of concepts and complex relations among them, such as overlap and disjointness.
We propose a novel vector-to-box distance metric and learn both embeddings jointly.
arXiv Detail & Related papers (2023-07-04T21:37:39Z)
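To convey the vector-to-box idea mentioned above, the sketch below computes a generic distance between an entity vector and an axis-aligned concept box: zero inside the box, Euclidean distance to the nearest face outside. It illustrates the general notion only and is not the specific metric proposed in Concept2Box.
```python
# Minimal sketch of a generic point-to-box distance, where a concept box is
# given by a centre and per-dimension half-widths (an axis-aligned box).
import numpy as np

def point_to_box_distance(v: np.ndarray, centre: np.ndarray, offset: np.ndarray) -> float:
    """Distance from vector v to the box [centre - offset, centre + offset]."""
    outside = np.maximum(np.abs(v - centre) - offset, 0.0)  # per-dimension overshoot
    return float(np.linalg.norm(outside))

concept_centre = np.zeros(3)        # e.g. a box for the concept "fruit"
concept_offset = np.ones(3)         # half-widths of the box

print(point_to_box_distance(np.array([0.5, -0.2, 0.9]), concept_centre, concept_offset))  # inside -> 0.0
print(point_to_box_distance(np.array([2.0, 0.0, 0.0]), concept_centre, concept_offset))   # outside -> 1.0
```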
- Does Deep Learning Learn to Abstract? A Systematic Probing Framework [69.2366890742283]
Abstraction is a desirable capability for deep learning models: the ability to induce abstract concepts from concrete instances and flexibly apply them beyond the learning context.
We introduce a systematic probing framework to explore the abstraction capability of deep learning models from a transferability perspective.
arXiv Detail & Related papers (2023-02-23T12:50:02Z)
- Specializing Smaller Language Models towards Multi-Step Reasoning [56.78474185485288]
We show that multi-step reasoning abilities can be distilled down from GPT-3.5 (≥ 175B) to T5 variants (≤ 11B).
We propose model specialization, which specializes the model's ability towards a target task.
arXiv Detail & Related papers (2023-01-30T08:51:19Z)
- On the Transformation of Latent Space in Fine-Tuned NLP Models [21.364053591693175]
We study the evolution of latent space in fine-tuned NLP models.
We discover latent concepts in the representational space using hierarchical clustering.
We compare the pre-trained and fine-tuned versions of three models across three downstream tasks.
arXiv Detail & Related papers (2022-10-23T10:59:19Z)
- Discovering Latent Concepts Learned in BERT [21.760620298330235]
We study what latent concepts exist in the pre-trained BERT model.
We also release a novel BERT ConceptNet dataset (BCN) consisting of 174 concept labels and 1M annotated instances.
arXiv Detail & Related papers (2022-05-15T09:45:34Z)
- A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)