Grammar-Based Grounded Lexicon Learning
- URL: http://arxiv.org/abs/2202.08806v2
- Date: Thu, 24 Aug 2023 17:46:12 GMT
- Title: Grammar-Based Grounded Lexicon Learning
- Authors: Jiayuan Mao, Haoyue Shi, Jiajun Wu, Roger P. Levy, Joshua B. Tenenbaum
- Abstract summary: G2L2 is a lexicalist approach toward learning a compositional and grounded meaning representation of language.
At the core of G2L2 is a collection of lexicon entries, which map each word to a syntactic type and a neuro-symbolic semantic program.
G2L2 can generalize from small amounts of data to novel compositions of words.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Grammar-Based Grounded Lexicon Learning (G2L2), a lexicalist
approach toward learning a compositional and grounded meaning representation of
language from grounded data, such as paired images and texts. At the core of
G2L2 is a collection of lexicon entries, which map each word to a tuple of a
syntactic type and a neuro-symbolic semantic program. For example, the word
"shiny" has a syntactic type of adjective; its neuro-symbolic semantic program
has the symbolic form λx. filter(x, SHINY), where the concept SHINY is
associated with a neural network embedding, which will be used to classify
shiny objects. Given an input sentence, G2L2 first looks up the lexicon entries
associated with each token. It then derives the meaning of the sentence as an
executable neuro-symbolic program by composing lexical meanings based on
syntax. The recovered meaning programs can be executed on grounded inputs. To
facilitate learning in an exponentially-growing compositional space, we
introduce a joint parsing and expected execution algorithm, which does local
marginalization over derivations to reduce the training time. We evaluate G2L2
on two domains: visual reasoning and language-driven navigation. Results show
that G2L2 can generalize from small amounts of data to novel compositions of
words.
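The lexicon-entry mechanism described in the abstract can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' implementation: the names (`LexiconEntry`, `filter_concept`, `is_shiny`) are invented here, and the learned neural concept classifier is replaced by a fixed predicate.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class LexiconEntry:
    """Maps a word to a syntactic type and a semantic program (hypothetical sketch)."""
    word: str
    syntactic_type: str            # e.g. "N/N" for an adjective in categorial terms
    program: Callable              # stands in for a neuro-symbolic semantic program

# In G2L2 the concept SHINY is a neural embedding; here a fixed predicate
# plays that role so the sketch is runnable.
def is_shiny(obj):
    return obj.get("material") == "metal"

def filter_concept(objects, predicate):
    # Symbolic form: lambda x. filter(x, SHINY)
    return [o for o in objects if predicate(o)]

shiny = LexiconEntry("shiny", "N/N",
                     lambda objs: filter_concept(objs, is_shiny))

# Executing the recovered program on a grounded input (a toy scene):
scene = [{"name": "cube", "material": "metal"},
         {"name": "ball", "material": "rubber"}]
print([o["name"] for o in shiny.program(scene)])  # -> ['cube']
```

Composing such entries according to the syntactic types is what yields the executable program for a full sentence; that composition step is omitted here.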
Related papers
- Are BabyLMs Second Language Learners? [48.85680614529188]
This paper describes a linguistically-motivated approach to the 2024 edition of the BabyLM Challenge.
Rather than pursuing a first language learning (L1) paradigm, we approach the challenge from a second language (L2) learning perspective.
arXiv Detail & Related papers (2024-10-28T17:52:15Z)
- Presence or Absence: Are Unknown Word Usages in Dictionaries? [6.185216877366987]
We evaluate our system in the AXOLOTL-24 shared task for Finnish, Russian and German languages.
We use a graph-based clustering approach to predict mappings between unknown word usages and dictionary entries.
Our system ranks first in Finnish and German, and second in Russian, on the Subtask 2 test-phase leaderboard.
arXiv Detail & Related papers (2024-06-02T07:57:45Z)
- Description-Enhanced Label Embedding Contrastive Learning for Text Classification [65.01077813330559]
We incorporate Self-Supervised Learning (SSL) into the model learning process and design a novel self-supervised Relation of Relation (R2) classification task.
We propose a Relation of Relation Learning Network (R2-Net) for text classification, in which text classification and R2 classification are treated as joint optimization targets.
We exploit external knowledge from WordNet to obtain multi-aspect descriptions for label semantic learning.
arXiv Detail & Related papers (2023-06-15T02:19:34Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further substantially improved if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords [28.004852127707025]
Confusion2vec is a word vector representation which encodes ambiguities present in human spoken language.
We show the subword encoding helps better represent the acoustic perceptual ambiguities in human spoken language.
arXiv Detail & Related papers (2021-02-03T20:03:50Z)
- Can a Fruit Fly Learn Word Embeddings? [16.280120177501733]
The fruit fly brain is one of the best studied systems in neuroscience.
We show that a network motif can learn semantic representations of words and can generate both static and context-dependent word embeddings.
We show that the fruit fly network motif not only achieves performance comparable to existing NLP methods, but also uses only a fraction of the computational resources.
arXiv Detail & Related papers (2021-01-18T05:41:50Z)
- R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
- Embedding Words in Non-Vector Space with Unsupervised Graph Learning [33.51809615505692]
We introduce GraphGlove: unsupervised graph word representations which are learned end-to-end.
In our setting, each word is a node in a weighted graph and the distance between words is the shortest path distance between the corresponding nodes.
We show that our graph-based representations substantially outperform vector-based methods on word similarity and analogy tasks.
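The GraphGlove distance described above — each word is a node in a weighted graph, and the distance between two words is the shortest-path distance between their nodes — can be sketched with a plain Dijkstra traversal. The tiny graph and its edge weights below are made up purely for illustration; GraphGlove learns its graph end-to-end.

```python
import heapq

# Hypothetical word graph: nodes are words, edge weights are learned
# dissimilarities (invented integer values here for a deterministic demo).
graph = {
    "cat":    {"kitten": 3, "dog": 9},
    "kitten": {"cat": 3, "dog": 10},
    "dog":    {"cat": 9, "kitten": 10, "puppy": 3},
    "puppy":  {"dog": 3},
}

def word_distance(src, dst):
    """Shortest-path distance between two word nodes (Dijkstra's algorithm)."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(heap, (nd, nbr))
    return float("inf")  # disconnected words

print(word_distance("cat", "puppy"))  # cat -> dog -> puppy = 12
```

Note how the path through "dog" (9 + 3 = 12) beats the path through "kitten" (3 + 10 + 3 = 16), which is the kind of multi-hop structure a vector-space metric cannot express directly.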
arXiv Detail & Related papers (2020-10-06T10:17:49Z)
- Graph-Structured Referring Expression Reasoning in The Wild [105.95488002374158]
Grounding referring expressions aims to locate in an image an object referred to by a natural language expression.
We propose a scene graph guided modular network (SGMN) to perform reasoning over a semantic graph and a scene graph.
We also propose Ref-Reasoning, a large-scale real-world dataset for structured referring expression reasoning.
arXiv Detail & Related papers (2020-04-19T11:00:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.