Towards Concept-Aware Large Language Models
- URL: http://arxiv.org/abs/2311.01866v1
- Date: Fri, 3 Nov 2023 12:19:22 GMT
- Title: Towards Concept-Aware Large Language Models
- Authors: Chen Shani, Jilles Vreeken, Dafna Shahaf
- Abstract summary: Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication.
There is very little work on endowing machines with the ability to form and reason with concepts.
In this work, we analyze how well contemporary large language models (LLMs) capture human concepts and their structure.
- Score: 56.48016300758356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Concepts play a pivotal role in various human cognitive functions, including
learning, reasoning and communication. However, there is very little work on
endowing machines with the ability to form and reason with concepts. In
particular, state-of-the-art large language models (LLMs) work at the level of
tokens, not concepts.
In this work, we analyze how well contemporary LLMs capture human concepts
and their structure. We then discuss ways to develop concept-aware LLMs that
intervene at different stages of the pipeline. We sketch a method for pretraining
LLMs using concepts, and also explore the simpler approach that uses the output
of existing LLMs. Despite its simplicity, our proof-of-concept is shown to
better match human intuition, as well as improve the robustness of predictions.
These preliminary results underscore the promise of concept-aware LLMs.
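The abstract's "simpler approach that uses the output of existing LLMs" can be read as aggregating an off-the-shelf model's token-level output into concept-level scores. Below is a minimal sketch of that idea, assuming a concept is given as a small set of surface forms and that its score is the sum of the model's next-token probabilities over those forms; the concept inventory, the GPT-2 model, and the aggregation rule are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: concept-level scores from an existing LLM's next-token distribution.
# Assumption: a concept is a small set of surface forms; its score is the sum of
# next-token probabilities over the single-token forms in that set.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CONCEPTS = {  # hypothetical concept inventory; a real system might use WordNet synsets
    "dog": ["dog", "puppy", "hound"],
    "cat": ["cat", "kitten", "feline"],
}

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def concept_scores(prompt):
    """Map the next-token probability distribution onto concept-level scores."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]      # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    scores = {}
    for concept, forms in CONCEPTS.items():
        total = 0.0
        for form in forms:
            ids = tokenizer.encode(" " + form)      # leading space: standalone word
            if len(ids) == 1:                       # keep single-token forms only
                total += probs[ids[0]].item()
        scores[concept] = total
    return scores

print(concept_scores("I adopted a pet from the shelter; it was a"))
```

Summing over a concept's surface forms makes the prediction less sensitive to which particular synonym the model happens to prefer, which is one plausible way the robustness gains reported in the abstract could arise.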
Related papers
- Enhancing Advanced Visual Reasoning Ability of Large Language Models [20.32900494896848]
Recent advancements in Vision-Language (VL) research have sparked new benchmarks for complex visual reasoning.
We propose Complex Visual Reasoning Large Language Models (CVR-LLM).
Our approach transforms images into detailed, context-aware descriptions using an iterative self-refinement loop.
We also introduce a novel multi-modal in-context learning (ICL) methodology to enhance LLMs' contextual understanding and reasoning.
arXiv Detail & Related papers (2024-09-21T02:10:19Z)
- A Concept-Based Explainability Framework for Large Multimodal Models [52.37626977572413]
We propose a dictionary-learning-based approach applied to token representations.
We show that these concepts are semantically well grounded in both vision and text.
We show that the extracted multimodal concepts are useful to interpret representations of test samples.
arXiv Detail & Related papers (2024-06-12T10:48:53Z)
- MyVLM: Personalizing VLMs for User-Specific Queries [78.33252556805931]
We take a first step toward the personalization of vision-language models, enabling them to learn and reason over user-provided concepts.
To effectively recognize a variety of user-specific concepts, we augment the VLM with external concept heads that function as toggles for the model.
Having recognized the concept, we learn a new concept embedding in the intermediate feature space of the VLM.
This embedding is tasked with guiding the language model to naturally integrate the target concept in its generated response.
arXiv Detail & Related papers (2024-03-21T17:51:01Z)
- FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Large Language Models: The Need for Nuance in Current Debates and a Pragmatic Perspective on Understanding [1.3654846342364308]
Large Language Models (LLMs) are unparalleled in their ability to generate grammatically correct, fluent text.
This position paper critically assesses three points recurring in critiques of LLM capacities.
We outline a pragmatic perspective on the issue of 'real' understanding and intentionality in LLMs.
arXiv Detail & Related papers (2023-10-30T15:51:04Z)
- Concept-Oriented Deep Learning with Large Language Models [0.4548998901594072]
Large Language Models (LLMs) have been successfully used in many natural-language tasks and applications including text generation and AI chatbots.
They are also a promising new technology for concept-oriented deep learning (CODL).
We discuss conceptual understanding in visual-language LLMs, the most important multimodal LLMs, and their major uses for CODL, including concept extraction from images, concept graph extraction from images, and concept learning; a sketch of image-based concept extraction follows this list.
arXiv Detail & Related papers (2023-06-29T16:47:11Z)
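To make the concept-extraction-from-images use mentioned in the last entry concrete, here is a minimal sketch that scores a fixed concept vocabulary against an image with CLIP; the concept vocabulary, the prompt template, and the choice of CLIP are illustrative assumptions rather than the cited paper's own method.

```python
# Sketch: zero-shot concept extraction from an image by scoring a fixed
# concept vocabulary with CLIP and keeping the top-k matches.
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

CONCEPTS = ["dog", "cat", "car", "tree", "person"]  # hypothetical concept vocabulary

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def extract_concepts(image, top_k=3):
    """Score each concept against the image and return the top-k matches."""
    inputs = processor(
        text=[f"a photo of a {c}" for c in CONCEPTS],
        images=image,
        return_tensors="pt",
        padding=True,
    )
    with torch.no_grad():
        outputs = model(**inputs)
    probs = outputs.logits_per_image.softmax(dim=-1)[0]  # one probability per concept
    ranked = sorted(zip(CONCEPTS, probs.tolist()), key=lambda x: -x[1])
    return ranked[:top_k]

# Example image (a standard COCO test photo used in library documentation).
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
print(extract_concepts(image))
```

The ranked concept list could then feed downstream steps such as concept graph construction, but how the surveyed work actually does this is not specified in the summary above.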