An Investigation of Potential Function Designs for Neural CRF
- URL: http://arxiv.org/abs/2011.05604v1
- Date: Wed, 11 Nov 2020 07:32:18 GMT
- Title: An Investigation of Potential Function Designs for Neural CRF
- Authors: Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei
Huang, Kewei Tu
- Abstract summary: In this paper, we investigate a series of increasingly expressive potential functions for neural CRF models.
Our experiments show that the decomposed quadrilinear potential function based on the vector representations of two neighboring labels and two neighboring words consistently achieves the best performance.
- Score: 75.79555356970344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The neural linear-chain CRF model is one of the most widely-used approaches to
sequence labeling. In this paper, we investigate a series of increasingly
expressive potential functions for neural CRF models, which not only integrate
the emission and transition functions, but also explicitly take the
representations of the contextual words as input. Our extensive experiments
show that the decomposed quadrilinear potential function based on the vector
representations of two neighboring labels and two neighboring words
consistently achieves the best performance.
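For concreteness, the following is a minimal PyTorch sketch of one plausible form of such a decomposed quadrilinear potential: a low-rank (CP-style) factorization of the four-way interaction between two neighboring label embeddings and two neighboring contextual word vectors. The module name, dimensions, and the exact factorization scheme are illustrative assumptions, not the paper's verbatim parameterization.

```python
import torch
import torch.nn as nn

class DecomposedQuadrilinear(nn.Module):
    """Low-rank (CP-style) quadrilinear potential over two neighboring
    labels and two neighboring word representations. Factor shapes and
    names are illustrative assumptions, not the paper's exact design."""

    def __init__(self, num_labels, word_dim, label_dim=64, rank=128):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, label_dim)
        # One projection per factor of the four-way interaction.
        self.U_prev_label = nn.Linear(label_dim, rank, bias=False)
        self.U_curr_label = nn.Linear(label_dim, rank, bias=False)
        self.U_prev_word = nn.Linear(word_dim, rank, bias=False)
        self.U_curr_word = nn.Linear(word_dim, rank, bias=False)

    def forward(self, h_prev, h_curr):
        """h_prev, h_curr: (batch, word_dim) contextual word vectors.
        Returns (batch, num_labels, num_labels) potentials where entry
        [b, i, j] scores label i at position t-1 and label j at t."""
        a = self.U_prev_label(self.label_emb.weight)  # (L, r)
        b = self.U_curr_label(self.label_emb.weight)  # (L, r)
        p = self.U_prev_word(h_prev)                  # (B, r)
        q = self.U_curr_word(h_curr)                  # (B, r)
        # Multiply factors elementwise over the rank dimension, then sum.
        return torch.einsum("br,ir,jr->bij", p * q, a, b)
```

The resulting (batch, L, L) tensor can replace the input-independent transition matrix in a standard linear-chain CRF forward/Viterbi pass, which is what lets the transition scores condition on the neighboring words.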
Related papers
- Interpretable Language Modeling via Induction-head Ngram Models [74.26720927767398]
We propose Induction-head ngram models (Induction-Gram) to bolster modern ngram models with a hand-engineered "induction head".
This induction head uses a custom neural similarity metric to efficiently search the model's input context for potential next-word completions.
Experiments show that this simple method significantly improves next-word prediction over baseline interpretable models.
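As a rough illustration of the induction-head idea (matching the current suffix against earlier context and proposing the token that followed the best match), here is a hedged Python sketch; it substitutes plain cosine similarity over embeddings for the custom learned metric the paper describes, and `embed` is an assumed embedding module, not part of the paper's code.

```python
import torch

def induction_next_word(context_ids, embed, query_len=2):
    # Embed the full context: (T, d).
    E = embed(context_ids)
    T = context_ids.size(0)
    # Represent the current suffix as the mean of its last embeddings.
    query = E[-query_len:].mean(0)
    best_score, best_next = float("-inf"), None
    for t in range(query_len, T - 1):
        key = E[t - query_len:t].mean(0)  # earlier span of the same length
        score = torch.cosine_similarity(query, key, dim=0)
        if score > best_score:
            # Propose the token that followed the best-matching span.
            best_score, best_next = score.item(), context_ids[t].item()
    return best_next  # None if the context is too short
```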
arXiv Detail & Related papers (2024-10-31T12:33:26Z)
- Feature Mapping in Physics-Informed Neural Networks (PINNs) [1.9819034119774483]
We study the training dynamics of PINNs with a feature mapping layer via the limiting Conjugate Kernel and Neural Tangent Kernel.
We propose conditionally positive definite Radial Basis Function as a better alternative.
arXiv Detail & Related papers (2024-02-10T13:51:09Z)
- Neural-Hidden-CRF: A Robust Weakly-Supervised Sequence Labeler [15.603945748109743]
We propose a neuralized undirected graphical model called Neural-Hidden-CRF to solve the weakly-supervised sequence labeling problem.
Within the framework of probabilistic undirected graphical models, Neural-Hidden-CRF embeds a hidden CRF layer that jointly models the word sequence, the latent ground-truth label sequence, and the weak label sequence.
arXiv Detail & Related papers (2023-09-10T17:13:25Z)
- ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap of over 40% in some scenarios.
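A minimal sketch of the underlying idea, assuming the nonlinearity is a truncated cosine (DCT-style) series with trainable coefficients; the input squashing and initialization below are illustrative choices, not ENN's exact formulation:

```python
import math
import torch
import torch.nn as nn

class DCTActivation(nn.Module):
    """Activation expressed as a truncated cosine series with trainable
    coefficients; squashing and initialization are assumptions."""

    def __init__(self, num_coeffs=8):
        super().__init__()
        self.coeffs = nn.Parameter(torch.zeros(num_coeffs))
        with torch.no_grad():
            self.coeffs[1] = 1.0  # one nonzero mode so the output is not 0
        self.register_buffer("k", torch.arange(num_coeffs).float())

    def forward(self, x):
        # Squash to (0, 1) so the cosine basis is evaluated on a bounded set.
        t = 0.5 * (torch.tanh(x) + 1.0)
        basis = torch.cos(math.pi * self.k * t.unsqueeze(-1))  # (..., K)
        return (basis * self.coeffs).sum(-1)
```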
arXiv Detail & Related papers (2023-07-02T21:46:30Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- First Power Linear Unit with Sign [0.0]
It is inspired by the common inverse operation and carries an intuitive bionic interpretation.
We extend the presented function to a more generalized form, PFPLUS, with two parameters that can be fixed or learnable.
arXiv Detail & Related papers (2021-11-29T06:47:58Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- High-dimensional Functional Graphical Model Structure Learning via Neighborhood Selection Approach [15.334392442475115]
We propose a neighborhood selection approach to estimate the structure of functional graphical models.
We thus circumvent the need for a well-defined precision operator that may not exist when the functions are infinite dimensional.
arXiv Detail & Related papers (2021-05-06T07:38:50Z)
- Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e., structured multi-scale feature learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z)
- UNIPoint: Universally Approximating Point Processes Intensities [125.08205865536577]
We provide a proof that a class of learnable functions can universally approximate any valid intensity function.
We implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions at each event.
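Here is a hedged PyTorch sketch of that construction: an RNN over inter-event gaps whose hidden state is mapped to the parameters of a sum of basis functions, with a softplus link keeping the intensity positive. The exponential basis and layer sizes are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UNIPointSketch(nn.Module):
    """RNN hidden state -> parameters of a sum of basis functions that
    define the intensity over time since the last event (a sketch)."""

    def __init__(self, hidden=32, num_basis=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.to_params = nn.Linear(hidden, 2 * num_basis)  # weights, rates

    def intensity(self, inter_times, tau):
        """inter_times: (B, N, 1) gaps between past events; tau: (B,)
        elapsed time since the last event. Returns lambda(tau), shape (B,)."""
        _, h = self.rnn(inter_times)               # h: (1, B, hidden)
        w, r = self.to_params(h[-1]).chunk(2, -1)  # each (B, num_basis)
        # Sum of exponential basis functions, softplus for positivity.
        basis = w * torch.exp(-F.softplus(r) * tau.unsqueeze(-1))
        return F.softplus(basis.sum(-1))
```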
arXiv Detail & Related papers (2020-07-28T09:31:56Z)