Modeling Label Correlations for Ultra-Fine Entity Typing with Neural
Pairwise Conditional Random Field
- URL: http://arxiv.org/abs/2212.01581v1
- Date: Sat, 3 Dec 2022 09:49:15 GMT
- Title: Modeling Label Correlations for Ultra-Fine Entity Typing with Neural
Pairwise Conditional Random Field
- Authors: Chengyue Jiang, Yong Jiang, Weiqi Wu, Pengjun Xie, Kewei Tu
- Abstract summary: We use an undirected graphical model called pairwise conditional random field (PCRF) to formulate the UFET problem.
We use various modern backbones for entity typing to compute unary potentials and derive pairwise potentials from type phrase representations.
We use mean-field variational inference for efficient type inference on very large type sets and unfold it as a neural network module to enable end-to-end training.
- Score: 47.22366788848256
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ultra-fine entity typing (UFET) aims to predict a wide range of type phrases
that correctly describe the categories of a given entity mention in a sentence.
Most recent works infer each entity type independently, ignoring the
correlations between types, e.g., when an entity is inferred as a president, it
should also be a politician and a leader. To this end, we use an undirected
graphical model called pairwise conditional random field (PCRF) to formulate
the UFET problem, in which the type variables are not only unarily influenced
by the input but also pairwisely relate to all the other type variables. We use
various modern backbones for entity typing to compute unary potentials, and
derive pairwise potentials from type phrase representations that both capture
prior semantic information and facilitate accelerated inference. We use
mean-field variational inference for efficient type inference on very large
type sets and unfold it as a neural network module to enable end-to-end
training. Experiments on UFET show that the Neural-PCRF consistently
outperforms its backbones with little cost and results in a competitive
performance against cross-encoder based SOTA while being thousands of times
faster. We also find Neural-PCRF effective on a widely used fine-grained
entity typing dataset with a smaller type set. We pack Neural-PCRF as a network
module that can be plugged into multi-label type classifiers with ease and
release it in https://github.com/modelscope/adaseq/tree/master/examples/NPCRF.
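The unrolled mean-field inference described in the abstract can be sketched for a simplified binary pairwise MRF. This is an illustrative approximation, not the paper's exact formulation: the toy unary scores, the symmetric pairwise matrix, and the function name are all hypothetical, and a real Neural-PCRF would compute both potential tables from learned representations.

```python
import numpy as np

def mean_field_pcrf(unary, pairwise, n_iters=10):
    """Unrolled mean-field inference for a binary pairwise MRF.

    unary:    (T,) scores for each type being active (from a backbone).
    pairwise: (T, T) symmetric co-occurrence scores between types.
    Returns approximate marginals q[i] ~ P(type i is active).
    """
    W = np.array(pairwise, dtype=float)
    np.fill_diagonal(W, 0.0)                      # no self-interaction
    q = 1.0 / (1.0 + np.exp(-unary))              # init from unary potentials
    for _ in range(n_iters):                      # fixed unrolled steps
        q = 1.0 / (1.0 + np.exp(-(unary + W @ q)))  # coordinate-style update
    return q

# toy example: "president" (0) is strongly supported by the input and
# positively coupled with "politician" (1) and "leader" (2)
unary = np.array([2.0, -0.5, -0.5])
pairwise = np.array([[0.0, 3.0, 3.0],
                     [3.0, 0.0, 1.0],
                     [3.0, 1.0, 0.0]])
q = mean_field_pcrf(unary, pairwise)
print(np.round(q, 3))
```

With the pairwise couplings, the marginals for "politician" and "leader" are pulled well above their unary-only sigmoids (about 0.38), illustrating how label correlations propagate; because each step is just matrix-vector products and sigmoids, the loop can be unfolded as a differentiable network module.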
Related papers
- Graph Neural Network Approach to Semantic Type Detection in Tables [3.929053351442136]
This study addresses the challenge of detecting semantic column types in relational tables.
We propose a novel approach using Graph Neural Networks (GNNs) to model intra-table dependencies.
Our proposed method not only outperforms existing state-of-the-art algorithms but also offers novel insights into the utility and functionality of various GNN types for semantic type detection.
arXiv Detail & Related papers (2024-04-30T18:17:44Z)
- Probabilistic Transformer: A Probabilistic Dependency Model for
Contextual Word Representation [52.270712965271656]
We propose a new model of contextual word representation, not from a neural perspective, but from a purely syntactic and probabilistic perspective.
We find that the graph of our model resembles transformers, with correspondences between dependencies and self-attention.
Experiments show that our model performs competitively to transformers on small to medium sized datasets.
arXiv Detail & Related papers (2023-11-26T06:56:02Z)
- Calibrated Seq2seq Models for Efficient and Generalizable Ultra-fine
Entity Typing [10.08153231108538]
We present CASENT, a seq2seq model designed for ultra-fine entity typing.
Our model takes an entity mention as input and employs constrained beam search to generate multiple types autoregressively.
Our method outperforms the previous state-of-the-art in terms of F1 score and calibration error, while achieving an inference speedup of over 50 times.
arXiv Detail & Related papers (2023-11-01T20:39:12Z)
- Neural networks for insurance pricing with frequency and severity data: a benchmark study from data preprocessing to technical tariff [2.4578723416255754]
We present a benchmark study on four insurance data sets with frequency and severity targets in the presence of multiple types of input features.
We compare in detail the performance of a generalized linear model on binned input data, a gradient-boosted tree model, a feed-forward neural network (FFNN), and the combined actuarial neural network (CANN)
arXiv Detail & Related papers (2023-10-19T12:00:33Z)
- Recall, Expand and Multi-Candidate Cross-Encode: Fast and Accurate
Ultra-Fine Entity Typing [46.85183839946139]
State-of-the-art (SOTA) methods use the cross-encoder (CE) based architecture.
After recalling and expanding a set of K candidate types, we use a novel model called MCCE to concurrently encode and score these K candidates.
We also found MCCE is very effective in fine-grained (130 types) and coarse-grained (9 types) entity typing.
arXiv Detail & Related papers (2022-12-18T16:42:52Z)
- Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) aims to find a small subset of the input graph's features that guides the prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z)
- Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data
via Differentiable Cross-Approximation [53.95297550117153]
We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at only a fraction of their entries.
The proposed approach is particularly useful for large-scale multidimensional grid data, and for tasks that require context over a large receptive field.
arXiv Detail & Related papers (2021-05-29T08:39:57Z)
- A Correspondence Variational Autoencoder for Unsupervised Acoustic Word
Embeddings [50.524054820564395]
We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation.
The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and zero-resource languages.
arXiv Detail & Related papers (2020-12-03T19:24:42Z)
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the transformation outcome is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- PushNet: Efficient and Adaptive Neural Message Passing [1.9121961872220468]
Message passing neural networks have recently evolved into a state-of-the-art approach to representation learning on graphs.
Existing methods perform synchronous message passing along all edges in multiple subsequent rounds.
We consider a novel asynchronous message passing approach where information is pushed only along the most relevant edges until convergence.
arXiv Detail & Related papers (2020-03-04T18:15:30Z)
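The push-style asynchronous propagation in the PushNet summary can be illustrated with a classic residual-push scheme in the spirit of push-based personalized PageRank. This is not PushNet's actual neural update; the graph, thresholds, and function name below are illustrative assumptions.

```python
from collections import deque

def push_propagate(adj, seeds, threshold=1e-3, damping=0.5):
    """Asynchronous push propagation: instead of synchronous rounds over
    all edges, push residual mass only from nodes whose residual exceeds
    `threshold`, until no node is left above it.

    adj:   dict node -> list of neighbor nodes
    seeds: dict node -> initial mass
    """
    estimate = {u: 0.0 for u in adj}
    residual = {u: seeds.get(u, 0.0) for u in adj}
    queue = deque(u for u in adj if residual[u] > threshold)
    while queue:
        u = queue.popleft()
        r = residual[u]
        if r <= threshold:
            continue
        estimate[u] += damping * r          # keep a fraction locally
        residual[u] = 0.0
        share = (1 - damping) * r / max(len(adj[u]), 1)
        for v in adj[u]:                    # push the rest along edges
            old = residual[v]
            residual[v] += share
            if old <= threshold < residual[v]:
                queue.append(v)             # re-activate only when crossing
    return estimate

# toy path graph 0 - 1 - 2 with all mass seeded at node 0
adj = {0: [1], 1: [0, 2], 2: [1]}
mass = push_propagate(adj, {0: 1.0})
print({k: round(v, 3) for k, v in mass.items()})
```

Because each push removes a fixed fraction of a node's residual, the total residual decays geometrically and the loop terminates; nodes far from the seeds are touched only if enough mass actually reaches them, which is the efficiency argument behind pushing along the most relevant edges.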
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.