Transformer-GCRF: Recovering Chinese Dropped Pronouns with General
Conditional Random Fields
- URL: http://arxiv.org/abs/2010.03224v1
- Date: Wed, 7 Oct 2020 07:06:09 GMT
- Title: Transformer-GCRF: Recovering Chinese Dropped Pronouns with General
Conditional Random Fields
- Authors: Jingxuan Yang, Kerui Xu, Jun Xu, Si Li, Sheng Gao, Jun Guo, Ji-Rong
Wen, Nianwen Xue
- Abstract summary: We present a novel framework that combines the strengths of the Transformer network with General Conditional Random Fields (GCRF) to model the dependencies between pronouns in neighboring utterances.
Results on three Chinese conversation datasets show that the Transformer-GCRF model outperforms the state-of-the-art dropped pronoun recovery models.
- Score: 54.03719496661691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pronouns are often dropped in Chinese conversations and recovering the
dropped pronouns is important for NLP applications such as Machine Translation.
Existing approaches usually formulate this as a sequence labeling task:
predicting, for each token, whether a pronoun is dropped before it and, if so,
the pronoun's type. Each utterance is treated as a separate sequence and
labeled independently.
Although these approaches have shown promise, labeling each utterance
independently ignores the dependencies between pronouns in neighboring
utterances. Modeling these dependencies is critical to improving the
performance of dropped pronoun recovery. In this paper, we present a novel
framework that combines the strengths of the Transformer network with General
Conditional Random Fields (GCRF) to model the dependencies between pronouns in
neighboring utterances. Results on three Chinese conversation datasets show
that the Transformer-GCRF model outperforms the state-of-the-art dropped
pronoun recovery models. Exploratory analysis also demonstrates that the GCRF
helps capture the dependencies between pronouns in neighboring utterances,
thus contributing to the performance improvements.
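As background on the formulation, the sketch below shows a conventional Transformer-encoder tagger with a linear-chain CRF-style decoder over a single utterance. It is a minimal illustration under stated assumptions (tag inventory, sizes, and vocabulary are invented), not the authors' Transformer-GCRF, whose general CRF additionally links tags across neighboring utterances. Training would maximize the CRF log-likelihood of gold tag sequences; only Viterbi decoding is shown here.

```python
# Minimal sketch (not the authors' code): a Transformer encoder scores a
# dropped-pronoun tag for the position before each token, and a learned
# transition matrix gives a linear-chain CRF decoded with Viterbi.
import torch
import torch.nn as nn

class TransformerTagger(nn.Module):
    def __init__(self, vocab_size, num_tags, d_model=256, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.emit = nn.Linear(d_model, num_tags)                    # per-token emission scores
        self.trans = nn.Parameter(torch.zeros(num_tags, num_tags))  # CRF transition scores

    def forward(self, tokens):                     # tokens: (batch, seq)
        return self.emit(self.encoder(self.embed(tokens)))  # (batch, seq, num_tags)

    @torch.no_grad()
    def viterbi(self, emissions):                  # emissions: (seq, num_tags)
        score, back = emissions[0], []
        for t in range(1, emissions.size(0)):
            total = score.unsqueeze(1) + self.trans + emissions[t].unsqueeze(0)
            score, idx = total.max(dim=0)          # best previous tag for each current tag
            back.append(idx)
        best = [int(score.argmax())]
        for idx in reversed(back):                 # follow back-pointers
            best.append(int(idx[best[-1]]))
        return best[::-1]

tagger = TransformerTagger(vocab_size=8000, num_tags=17)  # e.g. 16 pronoun types + "none"
emissions = tagger(torch.randint(0, 8000, (1, 12)))[0]
print(tagger.viterbi(emissions))                  # most likely tag sequence for 12 tokens
```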
Related papers
- Constructing Cloze Questions Generatively [2.2719421441459406]
We present a generative method for constructing cloze questions from an article using neural networks and WordNet.
CQG selects an answer key for a given sentence, segments it into a sequence of instances, generates instance-level distractor candidates (IDCs) using a transformer and sibling synsets.
It then removes inappropriate IDCs, ranks the remaining IDCs based on contextual embedding similarities, as well as synset and lexical relatedness, forms distractor candidates by replacing instances with the corresponding top-ranked IDCs, and checks if they are legitimate phrases.
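As a hedged illustration of the sibling-synset step in this pipeline, the sketch below collects co-hyponyms of an answer word from WordNet as raw distractor candidates; the transformer-based generation, contextual-embedding ranking, and phrase-legitimacy checks are not reproduced, and the function name is ours.

```python
# Sketch: sibling synsets (co-hyponyms) as distractor candidates.
# Requires nltk with the WordNet corpus: nltk.download('wordnet')
from nltk.corpus import wordnet as wn

def sibling_distractors(word, limit=10):
    candidates = set()
    for synset in wn.synsets(word):
        for hypernym in synset.hypernyms():
            for sibling in hypernym.hyponyms():   # co-hyponyms share a hypernym
                if sibling != synset:
                    candidates.update(sibling.lemma_names())
    candidates.discard(word)
    return sorted(candidates)[:limit]

print(sibling_distractors("violin"))  # e.g. siblings such as cello or viola
```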
arXiv Detail & Related papers (2024-10-05T18:55:38Z)
- Robust Pronoun Fidelity with English LLMs: Are they Reasoning, Repeating, or Just Biased? [26.583741801345507]
We present a dataset of over 5 million instances to measure pronoun fidelity in English.
Our results show that pronoun fidelity is not robust, even in a simple, naturalistic setting where humans achieve nearly 100% accuracy.
arXiv Detail & Related papers (2024-04-04T01:07:14Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Causal interventions expose implicit situation models for commonsense language understanding [3.290878132806227]
We analyze performance on the Winograd Challenge, where a single context cue shifts interpretation of an ambiguous pronoun.
We identify a circuit of attention heads that are responsible for propagating information from the context word.
These analyses suggest distinct pathways through which implicit situation models are constructed to guide pronoun resolution.
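To make "causal intervention on attention heads" concrete, here is a toy sketch (our construction, with illustrative sizes) that computes multi-head attention explicitly so one head's output can be zeroed and the downstream effect measured; the paper's actual circuit analysis on a pretrained model is more involved.

```python
# Sketch: zero-ablating one attention head as a causal intervention.
import torch

def multihead_attention(x, Wq, Wk, Wv, n_heads, ablate_head=None):
    seq, d = x.shape
    dh = d // n_heads
    q = (x @ Wq).view(seq, n_heads, dh)
    k = (x @ Wk).view(seq, n_heads, dh)
    v = (x @ Wv).view(seq, n_heads, dh)
    out = torch.zeros(seq, n_heads, dh)
    for h in range(n_heads):
        att = torch.softmax(q[:, h] @ k[:, h].T / dh ** 0.5, dim=-1)
        out[:, h] = att @ v[:, h]
    if ablate_head is not None:
        out[:, ablate_head] = 0.0              # the intervention: remove this head
    return out.reshape(seq, d)

torch.manual_seed(0)
x = torch.randn(5, 8)                          # 5 tokens, model width 8
Wq, Wk, Wv = (torch.randn(8, 8) for _ in range(3))
clean = multihead_attention(x, Wq, Wk, Wv, n_heads=2)
ablated = multihead_attention(x, Wq, Wk, Wv, n_heads=2, ablate_head=0)
print((clean - ablated).abs().max())           # size of head 0's causal effect
```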
arXiv Detail & Related papers (2023-06-06T17:36:43Z)
- Mapping of attention mechanisms to a generalized Potts model [50.91742043564049]
We show that training a neural network is exactly equivalent to solving the inverse Potts problem by the so-called pseudo-likelihood method.
We also compute the generalization error of self-attention in a model scenario analytically using the replica method.
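For readers unfamiliar with the inverse Potts problem, the sketch below evaluates the pseudo-(log-)likelihood objective for a generic Potts model, i.e. the sum over sites of log P(s_i | s_-i); shapes and random data are illustrative, and the paper's analytic mapping to attention is not reproduced.

```python
# Sketch: pseudo-log-likelihood of a Potts model with couplings J and fields h.
import numpy as np

def pseudo_log_likelihood(samples, J, h):
    # samples: (N, L) states in {0..q-1}; J: (L, L, q, q); h: (L, q)
    N, L = samples.shape
    total = 0.0
    for s in samples:
        for i in range(L):
            energy = h[i].copy()                # score of each candidate state at site i
            for j in range(L):
                if j != i:
                    energy += J[i, j, :, s[j]]  # interaction with the fixed neighbors
            total += energy[s[i]] - np.log(np.exp(energy).sum())
    return total / N

rng = np.random.default_rng(0)
L, q, N = 6, 3, 100
samples = rng.integers(0, q, size=(N, L))
J = rng.normal(scale=0.1, size=(L, L, q, q))
h = rng.normal(scale=0.1, size=(L, q))
print(pseudo_log_likelihood(samples, J, h))     # maximized over (J, h) in the inverse problem
```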
arXiv Detail & Related papers (2023-04-14T16:32:56Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
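A rough sketch of the marker-based projection idea: wrap each source entity in slot markers, translate the marked sequence, and read the target-language spans back off the markers. The marker format and the stubbed translate function are our assumptions standing in for the paper's multilingual labeled sequence translation model.

```python
# Sketch: projecting NER labels through translation via slot markers.
import re

def mark(tokens, spans):                       # spans: (start, end, label), end exclusive
    out = []
    for i, tok in enumerate(tokens):
        out += [f"__{lb}__" for st, en, lb in spans if i == st]
        out.append(tok)
        out += [f"__/{lb}__" for st, en, lb in spans if i == en - 1]
    return " ".join(out)

def translate(marked):                         # placeholder for a real MT model
    return marked.replace("New York", "Nueva York")

def project(translated):                       # recover labeled spans from markers
    return [(m.group(2), m.group(1))
            for m in re.finditer(r"__([A-Z]+)__ (.+?) __/\1__", translated)]

marked = mark(["I", "visited", "New", "York"], [(2, 4, "LOC")])
print(project(translate(marked)))              # [('Nueva York', 'LOC')]
```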
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Joint Entity and Relation Canonicalization in Open Knowledge Graphs using Variational Autoencoders [11.259587284318835]
Noun phrases and relation phrases in open knowledge graphs are not canonicalized, leading to an explosion of redundant and ambiguous subject-relation-object triples.
Existing approaches to this problem are two-step: first, they generate embedding representations for both noun and relation phrases; then a clustering algorithm groups them, using the embeddings as features.
In this work, we propose Canonicalizing Using Variational AutoEncoders (CUVA), a joint model to learn both embeddings and cluster assignments in an end-to-end approach.
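The toy autoencoder below (our construction, not CUVA's actual architecture) illustrates the end-to-end idea: soft cluster assignments and cluster centroids are learned jointly with the embedding, rather than clustering precomputed embeddings in a second step.

```python
# Sketch: joint embedding + clustering in one differentiable model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointClusterAE(nn.Module):
    def __init__(self, d_in, d_z, n_clusters):
        super().__init__()
        self.encode = nn.Linear(d_in, d_z)
        self.assign = nn.Linear(d_z, n_clusters)            # soft cluster logits
        self.centroids = nn.Parameter(torch.randn(n_clusters, d_z))
        self.decode = nn.Linear(d_z, d_in)

    def forward(self, x):
        z = self.encode(x)
        probs = torch.softmax(self.assign(z), dim=-1)       # cluster assignment
        z_hat = probs @ self.centroids                      # mixture of centroids
        return self.decode(z_hat), probs

model = JointClusterAE(d_in=300, d_z=64, n_clusters=10)
x = torch.randn(5, 300)                                     # e.g. 5 phrase embeddings
recon, probs = model(x)
loss = F.mse_loss(recon, x)                                 # CUVA adds variational/cluster terms
loss.backward()                                             # centroids and encoder update together
```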
arXiv Detail & Related papers (2020-12-08T22:58:30Z)
- A Brief Survey and Comparative Study of Recent Development of Pronoun Coreference Resolution [55.39835612617972]
Pronoun Coreference Resolution (PCR) is the task of resolving pronominal expressions to all mentions they refer to.
As one important natural language understanding (NLU) component, pronoun resolution is crucial for many downstream tasks and still challenging for existing models.
We conduct extensive experiments to show that even though current models are achieving good performance on the standard evaluation set, they are still not ready to be used in real applications.
arXiv Detail & Related papers (2020-09-27T01:40:01Z)
- Learning Source Phrase Representations for Neural Machine Translation [65.94387047871648]
We propose an attentive phrase representation generation mechanism that generates phrase representations from the corresponding token representations.
In our experiments, we obtain significant improvements on the WMT 14 English-German and English-French tasks on top of the strong Transformer baseline.
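As a minimal, hedged sketch of the general idea (details here are our assumptions, not the paper's exact mechanism), the module below forms one phrase representation from its token representations with learned attentive pooling.

```python
# Sketch: attentive pooling of token representations into a phrase vector.
import torch
import torch.nn as nn

class AttentivePhrasePooling(nn.Module):
    def __init__(self, d_model):
        super().__init__()
        self.score = nn.Linear(d_model, 1)     # learned relevance score per token

    def forward(self, token_reps):             # (phrase_len, d_model)
        weights = torch.softmax(self.score(token_reps), dim=0)
        return (weights * token_reps).sum(dim=0)  # weighted sum -> (d_model,)

pool = AttentivePhrasePooling(d_model=512)
phrase = pool(torch.randn(3, 512))             # e.g. a 3-token source phrase
print(phrase.shape)                            # torch.Size([512])
```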
arXiv Detail & Related papers (2020-06-25T13:43:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.