NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
- URL: http://arxiv.org/abs/2305.04978v3
- Date: Sat, 6 Apr 2024 00:15:25 GMT
- Title: NeuroComparatives: Neuro-Symbolic Distillation of Comparative Knowledge
- Authors: Phillip Howard, Junlin Wang, Vasudev Lal, Gadi Singer, Yejin Choi, Swabha Swayamdipta
- Abstract summary: We introduce NeuroComparatives, a novel framework for comparative knowledge distillation.
Our framework produces a corpus of up to 8.8M comparisons over 1.74M entity pairs.
Human evaluations show that NeuroComparatives outperform existing resources in terms of validity.
- Score: 48.17483161013775
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Comparative knowledge (e.g., steel is stronger and heavier than styrofoam) is an essential component of our world knowledge, yet it remains understudied in prior literature. In this paper, we harvest the dramatic improvements in the knowledge capabilities of language models into a large-scale comparative knowledge base. While such comparative knowledge is much easier to acquire from extreme-scale models like GPT-4 than from considerably smaller and weaker counterparts such as GPT-2, not even the most powerful models are exempt from making errors. We thus ask: to what extent are models at different scales able to generate valid and diverse comparative knowledge? We introduce NeuroComparatives, a novel framework for comparative knowledge distillation that overgenerates candidate comparisons from language models such as GPT-variants and LLaMA and then applies stringent filtering to the generated knowledge. Our framework acquires comparative knowledge between everyday objects, producing a corpus of up to 8.8M comparisons over 1.74M entity pairs - 10X larger and 30% more diverse than existing resources. Moreover, human evaluations show that NeuroComparatives outperform existing resources in terms of validity (up to a 32% absolute improvement). Our acquired NeuroComparatives lead to performance improvements on five downstream tasks. We find that neuro-symbolic manipulation of smaller models offers complementary benefits to the currently dominant practice of prompting extreme-scale language models for knowledge distillation.
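To make the overgenerate-then-filter recipe described in the abstract concrete, below is a minimal, hedged sketch in Python. It is not the authors' implementation: the `lm_generate` callable is a hypothetical stand-in for whatever language model is queried (e.g., a GPT-variant or LLaMA), and the validity filters shown (both entities mentioned, a comparative construction present, deduplication) are illustrative simplifications of the paper's far more stringent filtering stage.

```python
import re
from typing import Callable, Iterable

def lm_generate(prompt: str, n: int = 5) -> list[str]:
    """Hypothetical stand-in for a language model call (assumption: any LM
    that returns n candidate completions for the prompt could be plugged in)."""
    # Canned outputs so the sketch runs without a model.
    return [
        "steel is stronger than styrofoam",
        "steel is heavier than styrofoam",
        "steel is tastier than styrofoam",   # invalid: a real validity filter would remove this
        "steel is heavier than styrofoam",   # duplicate
        "it depends on the context",         # malformed: does not compare the two entities
    ][:n]

# Crude pattern for comparative forms: "-er", "more X", "less X".
COMPARATIVE = re.compile(r"\b(\w+er|more \w+|less \w+)\b", re.IGNORECASE)

def overgenerate(entity_pairs: Iterable[tuple[str, str]],
                 generate: Callable[[str, int], list[str]],
                 n: int = 5) -> list[tuple[str, str, str]]:
    """Overgeneration step: ask the LM for many candidate comparisons per entity pair."""
    candidates = []
    for a, b in entity_pairs:
        prompt = f"Compare {a} and {b}. List factual comparative statements."
        for statement in generate(prompt, n):
            candidates.append((a, b, statement.strip().lower()))
    return candidates

def filter_candidates(candidates: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Filtering step (illustrative): keep statements that mention both entities,
    contain a comparative construction, and are not duplicates."""
    seen, kept = set(), []
    for a, b, s in candidates:
        if a.lower() not in s or b.lower() not in s:
            continue                  # must compare the intended pair
        if not COMPARATIVE.search(s):
            continue                  # must contain a comparative form
        if s in seen:
            continue                  # deduplicate
        seen.add(s)
        kept.append((a, b, s))
    return kept

if __name__ == "__main__":
    pairs = [("steel", "styrofoam")]
    print(filter_candidates(overgenerate(pairs, lm_generate)))
```

The sketch only conveys the overall shape of the pipeline; the framework itself applies much stricter validity filtering than these surface heuristics.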
Related papers
- The Rise of Parameter Specialization for Knowledge Storage in Large Language Models [50.91855620712756]
We show that as language models become more advanced, their parameters exhibit increased specialization.
We experimentally validate that this specialized distribution of knowledge contributes to improving the efficiency of knowledge utilization in these models.
arXiv Detail & Related papers (2025-05-22T20:15:01Z)
- Synthetic Knowledge Ingestion: Towards Knowledge Refinement and Injection for Enhancing Large Language Models [1.753683416932648]
Large language models (LLMs) are proficient in capturing factual knowledge across various domains.
In this work, we propose a novel synthetic knowledge ingestion method called Ski.
We then integrate Ski and its variations with three knowledge injection techniques to inject and refine knowledge in language models.
arXiv Detail & Related papers (2024-10-12T19:38:09Z)
- Does Knowledge Localization Hold True? Surprising Differences Between Entity and Relation Perspectives in Language Models [20.157061521694096]
This study investigates the differences between entity and relational knowledge through knowledge editing.
To further elucidate the differences between entity and relational knowledge, we employ causal analysis to investigate how relational knowledge is stored in pre-trained models.
This insight highlights the multifaceted nature of knowledge storage in language models, underscoring the complexity of manipulating specific types of knowledge within these models.
arXiv Detail & Related papers (2024-09-01T05:09:11Z)
- Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall [31.45796499298925]
Large language models (LLMs) have shown remarkable performance on a variety of NLP tasks.
We focus on assessing LLMs' ability to recall factual knowledge learned from pretraining.
We benchmark 31 models from 10 model families and provide a holistic assessment of their strengths and weaknesses.
arXiv Detail & Related papers (2024-04-24T19:40:01Z)
- Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws [51.68385617116854]
Scaling laws describe the relationship between the size of language models and their capabilities.
We focus on factual knowledge represented as tuples, such as (USA, capital, Washington D.C.) from a Wikipedia page.
A 7B model can store 14B bits of knowledge, surpassing the English Wikipedia and textbooks combined.
arXiv Detail & Related papers (2024-04-08T11:11:31Z)
- Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models [53.52344131257681]
We propose a new paradigm for fine-tuning called F-Learning, which employs parametric arithmetic to facilitate the forgetting of old knowledge and learning of new knowledge.
Experimental results on two publicly available datasets demonstrate that our proposed F-Learning noticeably improves the knowledge-updating performance of both full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2023-11-14T09:12:40Z)
- Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events [17.73671383380315]
We study how large language models (LLMs) can be used to scale biomedical knowledge curation.
We find that substantial gains can be attained over out-of-the-box LLMs, with additional advantages such as cost, efficiency, and white-box model access.
arXiv Detail & Related papers (2023-07-12T20:08:48Z)
- ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base [51.777618249271725]
ANALOGYKB is a million-scale analogy knowledge base derived from existing knowledge graphs (KGs).
It identifies two types of analogies from the KGs: 1) analogies of the same relations, which can be directly extracted from the KGs, and 2) analogies of analogous relations, which are identified with a selection and filtering pipeline enabled by large language models (LLMs).
arXiv Detail & Related papers (2023-05-10T09:03:01Z)
- I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation [89.38161262164586]
We study generative models of commonsense knowledge, focusing on the task of generating generics.
We introduce I2D2, a novel commonsense distillation framework that loosely follows the Symbolic Knowledge Distillation of West et al.
Our study leads to a new corpus of generics, Gen-A-tomic, that is the largest and highest quality available to date.
arXiv Detail & Related papers (2022-12-19T04:47:49Z)
- SSD-KD: A Self-supervised Diverse Knowledge Distillation Method for Lightweight Skin Lesion Classification Using Dermoscopic Images [62.60956024215873]
Skin cancer is one of the most common types of malignancy, affecting a large population and causing a heavy economic burden worldwide.
Most studies in skin cancer detection keep pursuing high prediction accuracies without considering the limitation of computing resources on portable devices.
This study specifically proposes a novel method, termed SSD-KD, that unifies diverse knowledge into a generic KD framework for skin disease classification.
arXiv Detail & Related papers (2022-03-22T06:54:29Z)