Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks
- URL: http://arxiv.org/abs/2302.06942v1
- Date: Tue, 14 Feb 2023 09:57:23 GMT
- Title: Exploring Category Structure with Contextual Language Models and Lexical Semantic Networks
- Authors: Joseph Renner (MAGNET), Pascal Denis (MAGNET), Rémi Gilleron, Angèle Brunellière (SCALab)
- Abstract summary: We test a wider array of methods for probing CLMs for predicting typicality scores.
Our experiments, using BERT, show the importance of using the right type of CLM probes.
Results highlight the importance of polysemy in this task.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on predicting category structure with distributional models,
using either static word embeddings (Heyman and Heyman, 2019) or contextualized
language models (CLMs) (Misra et al., 2021), reports low correlations with human
ratings, thus calling into question their plausibility as models of human
semantic memory. In this work, we revisit this question by testing a wider array
of methods for probing CLMs for predicting typicality scores. Our experiments,
using BERT (Devlin et al., 2018), show the importance of using the right type
of CLM probes, as our best BERT-based typicality prediction methods
substantially improve over previous works. Second, our results highlight the
importance of polysemy in this task: our best results are obtained when using a
disambiguation mechanism. Finally, additional experiments reveal that
Information Content-based measures over WordNet (Miller, 1995), also endowed with
disambiguation, match the performance of the best BERT-based method, and in fact
capture complementary information, which can be combined with BERT to achieve
enhanced typicality predictions.
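To make the two families of predictors in the abstract concrete, below is a minimal sketch of (a) a cloze-style BERT probe and (b) an Information Content-based WordNet similarity, using the Hugging Face transformers and NLTK APIs. The prompt template, the first-sense lookup, and the choice of Resnik similarity are illustrative assumptions, not the authors' exact probes or disambiguation mechanism.

```python
# Illustrative sketch only (not the paper's exact probes): score how typical an
# exemplar is of a category with (a) a BERT cloze probe and (b) a WordNet
# Information Content similarity. Prompt wording, first-sense lookup, and the
# choice of Resnik similarity are assumptions made for this example.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM
from nltk.corpus import wordnet as wn, wordnet_ic  # needs nltk data: wordnet, wordnet_ic

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
brown_ic = wordnet_ic.ic("ic-brown.dat")

def bert_cloze_typicality(exemplar: str, category: str) -> float:
    """Log-probability of the category word in a cloze template
    (assumes the category name is a single wordpiece)."""
    text = f"A {exemplar} is a {tok.mask_token}."
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**inputs).logits
    mask_pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero()[0].item()
    log_probs = logits[0, mask_pos].log_softmax(dim=-1)
    return log_probs[tok.convert_tokens_to_ids(category)].item()

def wordnet_ic_typicality(exemplar: str, category: str) -> float:
    """Resnik (Information Content) similarity between first noun senses;
    the paper's disambiguation mechanism would pick senses more carefully."""
    e = wn.synsets(exemplar, pos=wn.NOUN)[0]
    c = wn.synsets(category, pos=wn.NOUN)[0]
    return e.res_similarity(c, brown_ic)

print(bert_cloze_typicality("robin", "bird"), wordnet_ic_typicality("robin", "bird"))
```

Scores produced this way would then be correlated with human typicality ratings, as in the works cited in the abstract.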
Related papers
- Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback [50.84142264245052]
This work introduces the Align-SLM framework to enhance the semantic understanding of textless Spoken Language Models (SLMs).
Our approach generates multiple speech continuations from a given prompt and uses semantic metrics to create preference data for Direct Preference Optimization (DPO).
We evaluate the framework using ZeroSpeech 2021 benchmarks for lexical and syntactic modeling, the spoken version of the StoryCloze dataset for semantic coherence, and other speech generation metrics, including the GPT4-o score and human evaluation.
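The second sentence of this summary describes turning sampled continuations plus a semantic metric into DPO preference data. A minimal sketch of that pairing step, assuming a scalar `semantic_score` function is available as a hypothetical stand-in for the paper's metrics:

```python
# Hypothetical sketch of the pairing step: rank sampled continuations of one
# prompt by a semantic metric and keep the best/worst as a DPO preference pair.
# `semantic_score` is a stand-in for the paper's metrics, not an actual API.
from typing import Callable, Dict, List

def build_dpo_pair(prompt: str,
                   continuations: List[str],
                   semantic_score: Callable[[str, str], float]) -> Dict[str, str]:
    ranked = sorted(continuations, key=lambda cont: semantic_score(prompt, cont))
    return {"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]}
```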
arXiv Detail & Related papers (2024-11-04T06:07:53Z)
- Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval [109.62363167257664]
We propose a generative model for learning multilingual text embeddings.
Our model operates on parallel data in $N$ languages.
We evaluate this method on a suite of tasks including semantic similarity, bitext mining, and cross-lingual question retrieval.
arXiv Detail & Related papers (2022-12-21T02:41:40Z)
- Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study [76.52997424694767]
We present an in-depth empirical study of keyphrase extraction and keyphrase generation using pre-trained language models.
We show that PLMs have competitive high-resource performance and state-of-the-art low-resource performance.
Further results show that in-domain BERT-like PLMs can be used to build strong and data-efficient keyphrase generation models.
arXiv Detail & Related papers (2022-12-20T13:20:21Z)
- Better Language Model with Hypernym Class Prediction [101.8517004687825]
Class-based language models (LMs) have been long devised to address context sparsity in $n$-gram LMs.
In this study, we revisit this approach in the context of neural LMs.
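As background for the first sentence: class-based LMs share statistics across words by predicting a word through its class, and with hypernym classes, c(w) would be a WordNet hypernym of w. The classic n-gram factorization (Brown et al., 1992) is shown below as orientation; the paper's neural formulation is not reproduced here.

```latex
% Class-based n-gram factorization (Brown et al., 1992); c(w) maps a word to
% its class, e.g. a WordNet hypernym in the hypernym-class setting.
P(w_i \mid w_{i-1}) \;\approx\; P\big(c(w_i) \mid c(w_{i-1})\big)\, P\big(w_i \mid c(w_i)\big)
```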
arXiv Detail & Related papers (2022-03-21T01:16:44Z)
- Meeting Summarization with Pre-training and Clustering Methods [6.47783315109491]
HMNet, a hierarchical network that employs both a word-level transformer and a turn-level transformer, serves as the baseline.
We extend the locate-then-summarize approach of QMSum with an intermediate clustering step.
We compare the performance of our baseline models with BART, a state-of-the-art language model that is effective for summarization.
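A rough sketch of what an intermediate clustering step over located utterances could look like, assuming sentence-transformers embeddings and scikit-learn KMeans; the embedding model, the number of clusters, and the idea of summarizing each cluster separately are illustrative assumptions, not the paper's configuration.

```python
# Hypothetical sketch: embed located utterances and group them into topical
# clusters before summarizing each cluster. The embedding model, k, and the
# per-cluster summarization idea are illustrative, not the paper's setup.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

def cluster_utterances(utterances: list[str], k: int = 5) -> list[list[str]]:
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = embedder.encode(utterances)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embeddings)
    clusters = [[] for _ in range(k)]
    for utt, lab in zip(utterances, labels):
        clusters[lab].append(utt)
    return clusters  # each cluster can then be passed to the summarizer
```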
arXiv Detail & Related papers (2021-11-16T03:14:40Z)
- Tracing Origins: Coref-aware Machine Reading Comprehension [43.352833140317486]
We imitate the human reading process of connecting anaphoric expressions and leverage the coreference information to enhance the word embeddings from the pre-trained model.
We demonstrate that explicitly incorporating the coreference information at the fine-tuning stage performs better than incorporating it when training a pre-trained language model.
arXiv Detail & Related papers (2021-10-15T09:28:35Z)
- Efficient Nearest Neighbor Language Models [114.40866461741795]
Non-parametric neural language models (NLMs) learn predictive distributions of text utilizing an external datastore.
We show how to achieve up to a 6x speed-up in inference speed while retaining comparable performance.
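For context, kNN-LM-style non-parametric models interpolate the base LM's next-token distribution with a distribution induced by nearest-neighbor lookups in a datastore of (context representation, next token) pairs. A minimal sketch of that interpolation follows; the distance kernel and interpolation weight are standard choices from the kNN-LM literature, and the paper's actual contribution, speeding up the kNN side, is not shown.

```python
# Sketch of kNN-LM style interpolation: p(w|x) = lam * p_kNN(w|x) + (1-lam) * p_LM(w|x).
# The toy datastore, distance kernel, and lam are illustrative; the efficiency
# techniques the paper proposes are not reproduced here.
import torch

def knn_lm_probs(lm_probs: torch.Tensor,   # (vocab,) base LM next-token probabilities
                 query: torch.Tensor,       # (d,) hidden state of the current context
                 keys: torch.Tensor,        # (n, d) stored context representations
                 values: torch.Tensor,      # (n,) stored next-token ids (long dtype)
                 k: int = 8,
                 lam: float = 0.25) -> torch.Tensor:
    dists = torch.cdist(query[None, :], keys)[0]    # L2 distance to every stored key
    knn_d, knn_i = dists.topk(k, largest=False)     # k nearest neighbors
    weights = torch.softmax(-knn_d, dim=0)          # closer neighbors get more mass
    knn_probs = torch.zeros_like(lm_probs)
    knn_probs.index_add_(0, values[knn_i], weights) # aggregate mass per token id
    return lam * knn_probs + (1.0 - lam) * lm_probs
```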
arXiv Detail & Related papers (2021-09-09T12:32:28Z)
- Improved Semantic Role Labeling using Parameterized Neighborhood Memory Adaptation [22.064890647610348]
We propose a parameterized neighborhood memory adaptive (PNMA) method that uses a parameterized representation of the nearest neighbors of tokens in a memory of activations.
We empirically show that PNMA consistently improves the SRL performance of the base model irrespective of types of word embeddings.
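A hedged sketch of the general idea: retrieve a token's nearest stored activations and combine them with a small learned attention into a neighborhood vector. The class name and the exact parameterization are illustrative and differ from the paper's PNMA module.

```python
# Illustrative sketch: fetch the k nearest stored activations for a token and
# attend over them with a learned query projection to form a neighborhood
# vector. Not the PNMA paper's exact parameterization.
import torch
import torch.nn as nn

class NeighborhoodMemory(nn.Module):
    def __init__(self, dim: int, k: int = 8):
        super().__init__()
        self.k = k
        self.query_proj = nn.Linear(dim, dim)  # learned ("parameterized") query

    def forward(self, token_vec: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # token_vec: (dim,); memory: (n, dim) stored activations
        dists = torch.cdist(token_vec[None, :], memory)[0]
        _, idx = dists.topk(self.k, largest=False)
        neighbors = memory[idx]                           # (k, dim)
        scores = neighbors @ self.query_proj(token_vec)   # (k,)
        weights = torch.softmax(scores, dim=0)
        return (weights[:, None] * neighbors).sum(dim=0)  # (dim,) neighborhood vector
```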
arXiv Detail & Related papers (2020-11-29T22:51:25Z)
- Coarse-to-Fine Memory Matching for Joint Retrieval and Classification [0.7081604594416339]
We present a novel end-to-end language model for joint retrieval and classification.
We evaluate it on the standard blind test set of the FEVER fact verification dataset.
We extend exemplar auditing to this setting for analyzing and constraining the model.
arXiv Detail & Related papers (2020-11-29T05:06:03Z)
- Document Ranking with a Pretrained Sequence-to-Sequence Model [56.44269917346376]
We show how a sequence-to-sequence model can be trained to generate relevance labels as "target words".
Our approach significantly outperforms an encoder-only model in a data-poor regime.
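The first sentence describes scoring relevance by reading off the probability of a "target word" at the first decoding step. A rough sketch with a T5 checkpoint; the "Query: ... Document: ... Relevant:" template and the true/false target words follow the monoT5 line of work and are assumptions here rather than details quoted from this summary.

```python
# Sketch of seq2seq relevance scoring: the model reads a query-document prompt
# and the probability of the "true" target token at the first decoding step is
# used as the relevance score. Checkpoint and prompt wording are assumptions.
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tok = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def relevance_score(query: str, document: str) -> float:
    text = f"Query: {query} Document: {document} Relevant:"
    enc = tok(text, return_tensors="pt", truncation=True)
    # Decoder starts from the model's start token; score the first position.
    decoder_input_ids = torch.tensor([[model.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = model(**enc, decoder_input_ids=decoder_input_ids).logits
    probs = logits[0, 0].softmax(dim=-1)
    true_id = tok("true", add_special_tokens=False).input_ids[0]
    false_id = tok("false", add_special_tokens=False).input_ids[0]
    return (probs[true_id] / (probs[true_id] + probs[false_id])).item()
```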
arXiv Detail & Related papers (2020-03-14T22:29:50Z)