Investigating the Role of Centering Theory in the Context of Neural
Coreference Resolution Systems
- URL: http://arxiv.org/abs/2210.14678v1
- Date: Wed, 26 Oct 2022 12:55:26 GMT
- Authors: Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan
- Abstract summary: We investigate the connection between centering theory and modern coreference resolution systems.
We show that high-quality neural coreference resolvers may not benefit much from explicitly modeling centering ideas.
We formulate a version of CT that also models recency and show that it captures coreference information better compared to vanilla CT.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Centering theory (CT; Grosz et al., 1995) provides a linguistic analysis of
the structure of discourse. According to the theory, local coherence of
discourse arises from the manner and extent to which successive utterances make
reference to the same entities. In this paper, we investigate the connection
between centering theory and modern coreference resolution systems. We provide
an operationalization of centering and systematically investigate if neural
coreference resolvers adhere to the rules of centering theory by defining
various discourse metrics and developing a search-based methodology. Our
information-theoretic analysis reveals a positive dependence between
coreference and centering; but also shows that high-quality neural coreference
resolvers may not benefit much from explicitly modeling centering ideas. Our
analysis further shows that contextualized embeddings contain much of the
coherence information, which helps explain why CT provides only small gains
to modern neural coreference resolvers that make use of pretrained
representations. Finally, we discuss factors that contribute to coreference
which are not modeled by CT such as world knowledge and recency bias. We
formulate a version of CT that also models recency and show that it captures
coreference information better compared to vanilla CT.
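The classical CT transition types that any operationalization of centering builds on (Grosz et al., 1995) can be sketched as follows. This is a minimal illustration of vanilla centering only, not the paper's actual metrics or search-based methodology; the function names and the salience-ranked entity-list input format are assumptions made for the example.

```python
# Minimal sketch of Centering Theory (CT) transitions (Grosz et al., 1995).
# An utterance is represented as a list of entities ranked by salience
# (e.g. by grammatical role: subject > object > other) -- an illustrative
# simplification, not the paper's operationalization.

def classify_transition(cb_prev, cb_curr, cp_curr):
    """Classify the transition between two adjacent utterances.

    cb_prev: backward-looking center of the previous utterance (or None)
    cb_curr: backward-looking center of the current utterance (or None)
    cp_curr: preferred center (highest-ranked forward-looking center)
    """
    if cb_curr == cb_prev or cb_prev is None:
        return "CONTINUE" if cb_curr == cp_curr else "RETAIN"
    return "SMOOTH-SHIFT" if cb_curr == cp_curr else "ROUGH-SHIFT"

def transitions(utterances):
    """Return the CT transition for each adjacent pair of utterances."""
    result = []
    cb_prev, prev_cf = None, None
    for cf in utterances:
        # Cb(U_i): highest-ranked element of Cf(U_{i-1}) realized in U_i
        cb_curr = next((e for e in (prev_cf or []) if e in cf), None)
        cp_curr = cf[0] if cf else None
        if prev_cf is not None:
            result.append(classify_transition(cb_prev, cb_curr, cp_curr))
        cb_prev, prev_cf = cb_curr, cf
    return result

# Example: "John bought a book. John went to the store. The store was closed."
print(transitions([["John", "book"], ["John", "store"], ["store"]]))
# -> ['CONTINUE', 'SMOOTH-SHIFT']
```

CT's ordering preference CONTINUE > RETAIN > SMOOTH-SHIFT > ROUGH-SHIFT is what makes such transition counts usable as discourse-coherence metrics; the recency-augmented variant in the paper additionally weights how far back an entity was last mentioned.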
Related papers
- Understanding Distributed Representations of Concepts in Deep Neural Networks without Supervision [25.449397570387802]
We propose an unsupervised method for discovering distributed representations of concepts by selecting a principal subset of neurons.
Our empirical findings demonstrate that instances with similar neuron activation states tend to share coherent concepts.
It can be utilized to identify unlabeled subclasses within data and to detect the causes of misclassifications.
arXiv Detail & Related papers (2023-12-28T07:33:51Z)
- Modeling Hierarchical Reasoning Chains by Linking Discourse Units and Key Phrases for Reading Comprehension [80.99865844249106]
We propose a holistic graph network (HGN) which deals with context at both discourse level and word level, as the basis for logical reasoning.
Specifically, node-level and type-level relations, which can be interpreted as bridges in the reasoning process, are modeled by a hierarchical interaction mechanism.
arXiv Detail & Related papers (2023-06-21T07:34:27Z)
- Networked Communication for Decentralised Agents in Mean-Field Games [59.01527054553122]
We introduce networked communication to the mean-field game framework.
We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases.
arXiv Detail & Related papers (2023-06-05T10:45:39Z)
- Learning a Structural Causal Model for Intuition Reasoning in Conversation [20.243323155177766]
Reasoning, a crucial aspect of NLP research, has not been adequately addressed by prevailing models.
We develop a conversation cognitive model (CCM) that explains how each utterance receives and activates channels of information.
By leveraging variational inference, it explores substitutes for implicit causes, addresses the issue of their unobservability, and reconstructs the causal representations of utterances through the evidence lower bounds.
arXiv Detail & Related papers (2023-05-28T13:54:09Z)
- Understanding Imbalanced Semantic Segmentation Through Neural Collapse [81.89121711426951]
We show that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes.
We introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure.
Our method ranks 1st and sets a new record on the ScanNet200 test leaderboard.
arXiv Detail & Related papers (2023-01-03T13:51:51Z)
- Biologically-informed deep learning models for cancer: fundamental trends for encoding and interpreting oncology data [0.0]
We provide a structured literature analysis focused on Deep Learning (DL) models used to support inference in cancer biology.
The work focuses on how existing models address the need for better dialogue with prior knowledge, biological plausibility and interpretability.
arXiv Detail & Related papers (2022-07-02T12:11:35Z)
- Using Causal Analysis for Conceptual Deep Learning Explanation [11.552000005640203]
An ideal explanation resembles the decision-making process of a domain expert.
We take advantage of radiology reports accompanying chest X-ray images to define concepts.
We construct a low-depth decision tree to translate all the discovered concepts into a straightforward decision rule.
arXiv Detail & Related papers (2021-07-10T00:01:45Z)
- Developing Constrained Neural Units Over Time [81.19349325749037]
This paper focuses on an alternative way of defining Neural Networks that differs from the majority of existing approaches.
The structure of the neural architecture is defined by means of a special class of constraints that are extended also to the interaction with data.
The proposed theory is cast into the time domain, in which data are presented to the network in an ordered manner.
arXiv Detail & Related papers (2020-09-01T09:07:25Z)
- A Chain Graph Interpretation of Real-World Neural Networks [58.78692706974121]
We propose an alternative interpretation that identifies NNs as chain graphs (CGs) and feed-forward as an approximate inference procedure.
The CG interpretation specifies the nature of each NN component within the rich theoretical framework of probabilistic graphical models.
We demonstrate with concrete examples that the CG interpretation can provide novel theoretical support and insights for various NN techniques.
arXiv Detail & Related papers (2020-06-30T14:46:08Z)
- Expressiveness and machine processability of Knowledge Organization Systems (KOS): An analysis of concepts and relations [0.0]
Both the expressiveness and the machine processability of a Knowledge Organization System are largely determined by its structural rules.
Ontologies explicitly define diverse types of relations, and are by their nature machine-processable.
arXiv Detail & Related papers (2020-03-11T12:35:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.