Neural Approaches to Entity-Centric Information Extraction
- URL: http://arxiv.org/abs/2304.07625v1
- Date: Sat, 15 Apr 2023 20:07:37 GMT
- Title: Neural Approaches to Entity-Centric Information Extraction
- Authors: Klim Zaporojets
- Abstract summary: We introduce a radically different, entity-centric view of the information in text.
We argue that instead of interpreting individual mentions in isolation, we should build applications that operate in terms of entity concepts.
In our work, we show that this task can be improved by performing entity linking at the level of coreference clusters rather than for each mention individually.
- Score: 2.8935588665357077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial Intelligence (AI) has a huge impact on our daily lives, with applications such as voice assistants, facial recognition, chatbots, and self-driving cars. Natural Language Processing (NLP) is a cross-discipline of AI and Linguistics dedicated to the study of text understanding. This is a very challenging area due to the unstructured nature of language, with many ambiguities and corner cases. In this thesis we address a specific area of NLP that involves the understanding of entities (e.g., names of people, organizations, locations) in text. First, we introduce a radically different, entity-centric view of the information in text. We argue that instead of interpreting individual mentions in isolation, we should build applications that operate in terms of entity concepts. Next, we present a more detailed model of how the entity-centric approach can be applied to the entity linking task. We show that this task can be improved by performing entity linking at the level of coreference clusters rather than for each mention individually. In our next work, we further study how information from Knowledge Base entities can be integrated into text. Finally, we analyze the evolution of entities from a temporal perspective.
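As a concrete illustration of the cluster-level linking idea, here is a minimal sketch in Python (mention ids, entity ids, and candidate scores are hypothetical placeholders, not the thesis' actual model or data): mention-level candidate scores are summed over a coreference cluster, and the single best-scoring entity is assigned to every mention in the cluster, so evidence from unambiguous mentions can disambiguate short or pronominal ones.

```python
from collections import defaultdict

def link_cluster(mentions, candidate_scores):
    """Assign one KB entity to an entire coreference cluster.

    mentions: mention ids belonging to one coreference cluster.
    candidate_scores: mention id -> {entity id: score}, as produced by
        any mention-level candidate generator (hypothetical interface).
    """
    # Aggregate mention-level evidence per candidate entity across the
    # cluster, so an unambiguous mention ("Barack Obama") can override
    # an ambiguous one ("Obama", "he").
    totals = defaultdict(float)
    for m in mentions:
        for entity, score in candidate_scores.get(m, {}).items():
            totals[entity] += score
    if not totals:
        return None  # NIL cluster: no candidates for any mention
    best = max(totals, key=totals.get)
    # Every mention inherits the single cluster-level decision.
    return {m: best for m in mentions}

# Toy usage: mention m2 alone would link to the wrong entity, but the
# cluster-level aggregate corrects it.
scores = {
    "m1": {"Barack_Obama": 0.9, "Michelle_Obama": 0.1},
    "m2": {"Barack_Obama": 0.4, "Michelle_Obama": 0.6},
}
print(link_cluster(["m1", "m2"], scores))
# {'m1': 'Barack_Obama', 'm2': 'Barack_Obama'}
```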
Related papers
- Representing visual classification as a linear combination of words [0.0]
We present an explainability strategy that uses a vision-language model to identify language-based descriptors of a visual classification task.
By leveraging a pre-trained joint embedding space between images and text, our approach estimates a new classification task as a linear combination of words.
We find that the resulting descriptors largely align with clinical knowledge despite a lack of domain-specific language training. (A toy sketch of the linear-combination step appears after this list.)
arXiv Detail & Related papers (2023-11-18T02:00:20Z)
- Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training [56.74440457571821]
We analyze tasks covering syntax, semantics and reasoning, across 2M pre-training steps and five seeds.
We identify critical learning phases across tasks and time, during which subspaces emerge, share information, and later disentangle to specialize.
Our findings have implications for model interpretability, multi-task learning, and learning from limited data.
arXiv Detail & Related papers (2023-10-25T09:09:55Z) - An Inclusive Notion of Text [69.36678873492373]
We argue that clarity on the notion of text is crucial for reproducible and generalizable NLP.
We introduce a two-tier taxonomy of linguistic and non-linguistic elements that are available in textual sources and can be used in NLP modeling.
arXiv Detail & Related papers (2022-11-10T14:26:43Z)
- A Linguistic Investigation of Machine Learning based Contradiction Detection Models: An Empirical Analysis and Future Perspectives [0.34998703934432673]
We analyze two Natural Language Inference data sets with respect to their linguistic features.
The goal is to identify those syntactic and semantic properties that are particularly hard to comprehend for a machine learning model.
arXiv Detail & Related papers (2022-10-19T10:06:03Z)
- Identifying concept libraries from language about object structure [56.83719358616503]
We leverage natural language descriptions for a diverse set of 2K procedurally generated objects to identify the parts people use.
We formalize our problem as search over a space of program libraries that contain different part concepts.
By combining naturalistic language at scale with structured program representations, we discover a fundamental information-theoretic tradeoff governing the part concepts people name.
arXiv Detail & Related papers (2022-05-11T17:49:25Z)
- Imagination-Augmented Natural Language Understanding [71.51687221130925]
We introduce an Imagination-Augmented Cross-modal Encoder (iACE) to solve natural language understanding tasks.
iACE enables visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models.
Experiments on GLUE and SWAG show that iACE achieves consistent improvement over visually-supervised pre-trained models.
arXiv Detail & Related papers (2022-04-18T19:39:36Z)
- ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning [97.10875695679499]
We propose ERICA, a novel contrastive learning framework applied in the pre-training phase to obtain a deeper understanding of entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
- Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey [54.34370423151014]
This paper surveys the components of modeling approaches and how task choices shape them, across generation tasks such as storytelling, summarization, and translation.
We present an abstraction of the prevalent techniques with respect to learning paradigms, pretraining, modeling approaches, and decoding, together with the key challenges outstanding in each of them.
arXiv Detail & Related papers (2020-10-14T17:54:42Z)
- Relation/Entity-Centric Reading Comprehension [1.0965065178451106]
We study reading comprehension with a focus on understanding entities and their relationships.
We focus on entities and relations because they are typically used to represent the semantics of natural language.
arXiv Detail & Related papers (2020-08-27T06:42:18Z)
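To make the "linear combination of words" entry above concrete, here is a toy sketch (random stand-in embeddings, not the paper's data or code): assuming a class direction and a word vocabulary living in a shared image-text embedding space, a least-squares fit expresses the classifier as a weighted sum of word vectors, and the largest-magnitude coefficients act as language-based descriptors of the visual task.

```python
import numpy as np

# Hypothetical stand-ins: in the paper's setting, word_vecs would come
# from the text encoder of a pre-trained joint image-text model and
# class_direction from a linear probe over image embeddings.
rng = np.random.default_rng(0)
dim, vocab = 128, 300
word_vecs = rng.normal(size=(vocab, dim))   # one row per candidate word
class_direction = rng.normal(size=dim)      # learned classifier weights

# Solve for coefficients c minimizing ||word_vecs.T @ c - class_direction||,
# i.e. express the classifier as a linear combination of word vectors.
coef, *_ = np.linalg.lstsq(word_vecs.T, class_direction, rcond=None)

# Words with the largest-magnitude coefficients are the descriptors.
top = np.argsort(-np.abs(coef))[:5]
print("most descriptive word indices:", top)
```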