Domain Embeddings for Generating Complex Descriptions of Concepts in
Italian Language
- URL: http://arxiv.org/abs/2402.16632v1
- Date: Mon, 26 Feb 2024 15:04:35 GMT
- Title: Domain Embeddings for Generating Complex Descriptions of Concepts in
Italian Language
- Authors: Alessandro Maisto
- Abstract summary: We propose a Distributional Semantic resource enriched with linguistic and lexical information extracted from electronic dictionaries.
The resource comprises 21 domain-specific matrices, one comprehensive matrix, and a Graphical User Interface.
Our model facilitates the generation of reasoned semantic descriptions of concepts by selecting matrices directly associated with concrete conceptual knowledge.
- Score: 65.268245109828
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we propose a Distributional Semantic resource enriched with
linguistic and lexical information extracted from electronic dictionaries,
designed to address the challenge of bridging the gap between the continuous
semantic values represented by distributional vectors and the discrete
descriptions offered by general semantics theory. Recently, many researchers
have concentrated on the nexus between embeddings and a comprehensive theory of
semantics and meaning. This often involves decoding the representation of word
meanings in Distributional Models into a set of discrete, manually constructed
properties such as semantic primitives or features, using neural decoding
techniques. Our approach introduces an alternative strategy grounded in
linguistic data. We have developed a collection of domain-specific
co-occurrence matrices, derived from two sources: a classification of Italian
nouns categorized into 4 semantic traits and 20 concrete noun sub-categories,
and a list of Italian verbs classified according to their semantic classes. In
these matrices, the co-occurrence values for each word are calculated
exclusively with a defined set of words pertinent to a particular lexical
domain. The resource comprises 21 domain-specific matrices, one comprehensive
matrix, and a Graphical User Interface. Our model facilitates the generation of
reasoned semantic descriptions of concepts by selecting matrices directly
associated with concrete conceptual knowledge, such as a matrix based on
location nouns and the concept of animal habitats. We assessed the utility of
the resource through two experiments, achieving promising outcomes in both: the
automatic classification of animal nouns and the extraction of animal features.
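The core construction — co-occurrence counts computed only against a fixed set of words from one lexical domain — can be sketched roughly as follows. This is a minimal illustration, not the paper's actual resource or code: the function name, the toy Italian corpus, the window size, and the word lists (location nouns as the domain, echoing the paper's animal-habitat example) are all illustrative assumptions.

```python
from collections import Counter

def domain_cooccurrence(corpus, targets, domain_words, window=3):
    """Count co-occurrences of each target word with words from a single
    lexical domain only, ignoring all other context words.
    (Hypothetical simplification of the paper's domain-specific matrices.)"""
    domain = set(domain_words)
    counts = {t: Counter() for t in targets}
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok in counts:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    # Only words belonging to the chosen domain contribute.
                    if j != i and tokens[j] in domain:
                        counts[tok][tokens[j]] += 1
    return counts

# Toy corpus and word lists (illustrative, not from the paper).
corpus = [
    "il lupo vive nella foresta",
    "il cammello attraversa il deserto",
    "il lupo caccia nella foresta di notte",
]
m = domain_cooccurrence(
    corpus,
    targets=["lupo", "cammello"],            # animal nouns
    domain_words=["foresta", "deserto", "montagna"],  # location-noun domain
)
print(m["lupo"]["foresta"])  # → 2: counts restricted to location nouns
```

On this reading, a matrix like "animal nouns × location nouns" directly encodes habitat-like knowledge, which is what allows the model to select the matrix matching the kind of description to generate.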
Related papers
- How well do distributed representations convey contextual lexical semantics: a Thesis Proposal [3.3585951129432323]
In this thesis, we examine the efficacy of distributed representations from modern neural networks in encoding lexical meaning.
We identify four sources of ambiguity based on the relatedness and similarity of meanings influenced by context.
We then aim to evaluate these sources by collecting or constructing multilingual datasets, leveraging various language models, and employing linguistic analysis tools.
arXiv Detail & Related papers (2024-06-02T14:08:51Z) - Contribución de la semántica combinatoria al desarrollo de
herramientas digitales multilingües [0.0]
This paper describes how the field of Combinatorial Semantics has contributed to the design of three prototypes for the automatic generation of argument patterns in nominal phrases in Spanish, French and German.
It also shows the importance of knowing about the argument syntactic-semantic interface in a production situation in the context of foreign languages.
arXiv Detail & Related papers (2023-12-26T19:32:05Z) - Agentività e telicità in GilBERTo: implicazioni cognitive [77.71680953280436]
The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics.
The semantic properties considered are telicity (also combined with definiteness) and agentivity.
arXiv Detail & Related papers (2023-07-06T10:52:22Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics
Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language is the crucial factor to achieve compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Monolingual alignment of word senses and definitions in lexicographical
resources [0.0]
The focus of this thesis is broadly on the alignment of lexicographical data, particularly dictionaries.
The first task aims to find an optimal alignment given the sense definitions of a headword in two different monolingual dictionaries.
The resulting benchmark can be used to evaluate word-sense alignment systems.
arXiv Detail & Related papers (2022-09-06T13:09:52Z) - A bilingual approach to specialised adjectives through word embeddings
in the karstology domain [3.92181732547846]
We present an experiment in extracting adjectives which express a specific semantic relation using word embeddings.
The results of the experiment are then thoroughly analysed and categorised into groups of adjectives exhibiting formal or semantic similarity.
arXiv Detail & Related papers (2022-03-31T08:27:15Z) - A cognitively driven weighted-entropy model for embedding semantic
categories in hyperbolic geometry [0.0]
An unsupervised and cognitively driven weighted-entropy method for embedding semantic categories in hyperbolic geometry is proposed.
The model is driven by two fields of research in cognitive linguistics: the statistical learning theory of language acquisition and the proposal of using high-dimensional networks to represent semantic knowledge in cognition.
Results show that this new approach can successfully model and map the semantic relationships of popularity and similarity for most of the basic color and kinship words in English.
arXiv Detail & Related papers (2021-12-13T18:33:45Z) - Decomposing lexical and compositional syntax and semantics with deep
language models [82.81964713263483]
The activations of language transformers like GPT2 have been shown to linearly map onto brain activity during speech comprehension.
Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four classes: lexical, compositional, syntactic, and semantic representations.
Among the results, compositional representations recruit a more widespread cortical network than lexical ones, encompassing the bilateral temporal, parietal, and prefrontal cortices.
arXiv Detail & Related papers (2021-03-02T10:24:05Z) - Computational linguistic assessment of textbook and online learning
media by means of threshold concepts in business education [59.003956312175795]
From a linguistic perspective, threshold concepts are instances of specialized vocabularies, exhibiting particular linguistic features.
The profiles of 63 threshold concepts from business education have been investigated in textbooks, newspapers, and Wikipedia.
The three kinds of resources can indeed be distinguished in terms of their threshold concepts' profiles.
arXiv Detail & Related papers (2020-08-05T12:56:16Z) - Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.