Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language?
- URL: http://arxiv.org/abs/2004.14839v2
- Date: Sat, 2 May 2020 12:35:41 GMT
- Title: Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language?
- Authors: Hitomi Yanaka, Koji Mineshima, Daisuke Bekki, and Kentaro Inui
- Abstract summary: We introduce a method for evaluating whether neural models can learn systematicity of monotonicity inference in natural language.
We consider four aspects of monotonicity inferences and test whether the models can systematically interpret lexical and logical phenomena on different training/test splits.
- Score: 41.649440404203595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the success of language models using neural networks, it remains
unclear to what extent neural models have the generalization ability to perform
inferences. In this paper, we introduce a method for evaluating whether neural
models can learn systematicity of monotonicity inference in natural language,
namely, the regularity for performing arbitrary inferences with generalization
on composition. We consider four aspects of monotonicity inferences and test
whether the models can systematically interpret lexical and logical phenomena
on different training/test splits. A series of experiments shows that three
neural models systematically draw inferences on unseen combinations of lexical
and logical phenomena when the syntactic structures of the sentences are
similar between the training and test sets. However, the models' performance
drops significantly when the syntactic structures in the test set are slightly
changed, even though all vocabulary items and constituents already appear in
the training set. This indicates that the generalization ability of neural
models is limited to cases where the syntactic structures are nearly the same
as those in the training set.
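The evaluation protocol described in the abstract hinges on compositional train/test splits: every lexical phenomenon and every logical phenomenon is attested in training, but certain combinations of the two are withheld and appear only at test time. The minimal Python sketch below illustrates one way such a split could be constructed; the phenomenon tags, placeholder sentences, and function names are illustrative assumptions, not the authors' dataset or released code.

    # Minimal sketch of a compositional split for monotonicity inference
    # (illustrative only; tags, sentences, and names are assumptions).
    from itertools import product
    import random

    LEXICAL = ["hypernym", "hyponym", "adjective_deletion"]   # lexical phenomena (assumed tags)
    LOGICAL = ["every", "some", "no", "at_most_three"]        # logical phenomena (assumed tags)

    def make_examples(n_per_cell=5, seed=0):
        """Generate placeholder (premise, hypothesis, label) records per combination."""
        rng = random.Random(seed)
        examples = []
        for lex, log in product(LEXICAL, LOGICAL):
            for i in range(n_per_cell):
                examples.append({
                    "lexical": lex,
                    "logical": log,
                    "premise": f"{log} placeholder premise {i}",        # placeholder sentence
                    "hypothesis": f"{lex} placeholder hypothesis {i}",  # placeholder sentence
                    "label": rng.choice(["entailment", "non-entailment"]),
                })
        return examples

    def compositional_split(examples, held_out_pairs):
        """Train on all combinations except held_out_pairs; test only on those."""
        train = [e for e in examples if (e["lexical"], e["logical"]) not in held_out_pairs]
        test = [e for e in examples if (e["lexical"], e["logical"]) in held_out_pairs]
        return train, test

    if __name__ == "__main__":
        data = make_examples()
        held_out = {("hypernym", "no"), ("adjective_deletion", "at_most_three")}
        train, test = compositional_split(data, held_out)
        # Every individual tag still occurs in training, but the held-out
        # combinations appear only at test time.
        print(len(train), "train /", len(test), "test")

Under a split like this, a drop in test accuracy can be attributed to unseen combinations of phenomena rather than to unseen vocabulary, which is the sense of systematicity the paper evaluates.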
Related papers
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization.
arXiv Detail & Related papers (2024-10-06T15:25:39Z) - Contextual modulation of language comprehension in a dynamic neural model of lexical meaning [0.0]
We demonstrate the architecture and behavior of the model using the English lexical item 'have' as a test case, focusing on its polysemous use.
Results support a novel perspective on lexical polysemy: that the many related meanings of a word are metastable neural activation states.
arXiv Detail & Related papers (2024-07-19T23:28:55Z) - Montague semantics and modifier consistency measurement in neural
language models [1.6799377888527685]
This work proposes a methodology for measuring compositional behavior in contemporary language models.
Specifically, we focus on adjectival modifier phenomena in adjective-noun phrases.
Our experimental results indicate that current neural language models behave according to the expected linguistic theories to a limited extent only.
arXiv Detail & Related papers (2022-10-10T18:43:16Z) - Learning Disentangled Representations for Natural Language Definitions [0.0]
We argue that recurrent syntactic and semantic regularities in textual data can be used to provide the models with both structural biases and generative factors.
We leverage the semantic structures present in a representative and semantically dense category of sentence types, definitional sentences, for training a Variational Autoencoder to learn disentangled representations.
arXiv Detail & Related papers (2022-09-22T14:31:55Z) - Dependency-based Mixture Language Models [53.152011258252315]
We introduce the Dependency-based Mixture Language Models.
In detail, we first train neural language models with a novel dependency modeling objective.
We then formulate the next-token probability by mixing the previous dependency modeling probability distributions with self-attention.
arXiv Detail & Related papers (2022-03-19T06:28:30Z) - Causal Abstractions of Neural Networks [9.291492712301569]
We propose a new structural analysis method grounded in a formal theory of causal abstraction.
We apply this method to analyze neural models trained on the Multiply Quantified Natural Language Inference (MQNLI) corpus.
arXiv Detail & Related papers (2021-06-06T01:07:43Z) - Discrete representations in neural models of spoken language [56.29049879393466]
We compare the merits of four commonly used metrics in the context of weakly supervised models of spoken language.
We find that the different evaluation metrics can give inconsistent results.
arXiv Detail & Related papers (2021-05-12T11:02:02Z) - Structural Supervision Improves Few-Shot Learning and Syntactic
Generalization in Neural Language Models [47.42249565529833]
Humans can learn structural properties about a word from minimal experience.
We assess the ability of modern neural language models to reproduce this behavior in English.
arXiv Detail & Related papers (2020-10-12T14:12:37Z) - Mechanisms for Handling Nested Dependencies in Neural-Network Language
Models and Humans [75.15855405318855]
We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing.
Although the network was solely trained to predict the next word in a large corpus, analysis showed the emergence of specialized units that successfully handled local and long-distance syntactic agreement.
We tested the model's predictions in a behavioral experiment where humans detected violations in number agreement in sentences with systematic variations in the singular/plural status of multiple nouns.
arXiv Detail & Related papers (2020-06-19T12:00:05Z)