An Investigation of Potential Function Designs for Neural CRF
- URL: http://arxiv.org/abs/2011.05604v1
- Date: Wed, 11 Nov 2020 07:32:18 GMT
- Title: An Investigation of Potential Function Designs for Neural CRF
- Authors: Zechuan Hu, Yong Jiang, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei
Huang, Kewei Tu
- Abstract summary: In this paper, we investigate a series of increasingly expressive potential functions for neural CRF models.
Our experiments show that the decomposed quadrilinear potential function based on the vector representations of two neighboring labels and two neighboring words consistently achieves the best performance.
- Score: 75.79555356970344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The neural linear-chain CRF model is one of the most widely-used approaches to
sequence labeling. In this paper, we investigate a series of increasingly
expressive potential functions for neural CRF models, which not only integrate
the emission and transition functions, but also explicitly take the
representations of the contextual words as input. Our extensive experiments
show that the decomposed quadrilinear potential function based on the vector
representations of two neighboring labels and two neighboring words
consistently achieves the best performance.
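For concreteness, the following is a minimal PyTorch sketch of one plausible form of such a decomposed quadrilinear potential: a low-rank (CP-style) factorization of the four-way interaction between two neighboring label embeddings and two neighboring contextual word vectors. The module name, dimensions, and the exact factorization scheme are illustrative assumptions, not the paper's verbatim parameterization.

```python
import torch
import torch.nn as nn

class DecomposedQuadrilinear(nn.Module):
    """Low-rank (CP-style) quadrilinear potential over two neighboring
    labels and two neighboring word representations. Factor shapes and
    names are illustrative assumptions, not the paper's exact design."""

    def __init__(self, num_labels, word_dim, label_dim=64, rank=128):
        super().__init__()
        self.label_emb = nn.Embedding(num_labels, label_dim)
        # One projection per factor of the four-way interaction.
        self.U_prev_label = nn.Linear(label_dim, rank, bias=False)
        self.U_curr_label = nn.Linear(label_dim, rank, bias=False)
        self.U_prev_word = nn.Linear(word_dim, rank, bias=False)
        self.U_curr_word = nn.Linear(word_dim, rank, bias=False)

    def forward(self, h_prev, h_curr):
        """h_prev, h_curr: (batch, word_dim) contextual word vectors.
        Returns (batch, num_labels, num_labels) potentials where entry
        [b, i, j] scores label i at position t-1 and label j at t."""
        a = self.U_prev_label(self.label_emb.weight)  # (L, r)
        b = self.U_curr_label(self.label_emb.weight)  # (L, r)
        p = self.U_prev_word(h_prev)                  # (B, r)
        q = self.U_curr_word(h_curr)                  # (B, r)
        # Multiply factors elementwise over the rank dimension, then sum.
        return torch.einsum("br,ir,jr->bij", p * q, a, b)
```

The resulting (batch, L, L) tensor can replace the input-independent transition matrix in a standard linear-chain CRF forward/Viterbi pass, which is what lets the transition scores condition on the neighboring words.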
Related papers
- Interpretable Language Modeling via Induction-head Ngram Models [74.26720927767398]
We propose Induction-head ngram models (Induction-Gram) to bolster modern ngram models with a hand-engineered "induction head".
This induction head uses a custom neural similarity metric to efficiently search the model's input context for potential next-word completions.
Experiments show that this simple method significantly improves next-word prediction over baseline interpretable models.
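As a rough illustration of the induction-head idea (matching the current suffix against earlier context and proposing the token that followed the best match), here is a hedged Python sketch; it substitutes plain cosine similarity over embeddings for the custom learned metric the paper describes, and `embed` is an assumed embedding module, not part of the paper's code.

```python
import torch

def induction_next_word(context_ids, embed, query_len=2):
    # Embed the full context: (T, d).
    E = embed(context_ids)
    T = context_ids.size(0)
    # Represent the current suffix as the mean of its last embeddings.
    query = E[-query_len:].mean(0)
    best_score, best_next = float("-inf"), None
    for t in range(query_len, T - 1):
        key = E[t - query_len:t].mean(0)  # earlier span of the same length
        score = torch.cosine_similarity(query, key, dim=0)
        if score > best_score:
            # Propose the token that followed the best-matching span.
            best_score, best_next = score.item(), context_ids[t].item()
    return best_next  # None if the context is too short
```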
arXiv Detail & Related papers (2024-10-31T12:33:26Z)
- Feature Mapping in Physics-Informed Neural Networks (PINNs) [1.9819034119774483]
We study the training dynamics of PINNs with a feature mapping layer via the limiting Conjugate Kernel and Neural Tangent Kernel.
We propose conditionally positive definite Radial Basis Function as a better alternative.
arXiv Detail & Related papers (2024-02-10T13:51:09Z)
- Neural-Hidden-CRF: A Robust Weakly-Supervised Sequence Labeler [15.603945748109743]
We propose a neuralized undirected graphical model called Neural-Hidden-CRF to solve the weakly-supervised sequence labeling problem.
Within the framework of probabilistic undirected graphical models, Neural-Hidden-CRF embeds a hidden CRF layer that jointly models the word sequence, the latent ground-truth label sequence, and the weak label sequence.
arXiv Detail & Related papers (2023-09-10T17:13:25Z)
- ENN: A Neural Network with DCT Adaptive Activation Functions [2.2713084727838115]
We present Expressive Neural Network (ENN), a novel model in which the non-linear activation functions are modeled using the Discrete Cosine Transform (DCT).
This parametrization keeps the number of trainable parameters low, is appropriate for gradient-based schemes, and adapts to different learning tasks.
ENN outperforms state-of-the-art benchmarks, with an accuracy gap of over 40% in some scenarios.
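A minimal sketch of the underlying idea, assuming the nonlinearity is a truncated cosine (DCT-style) series with trainable coefficients; the input squashing and initialization below are illustrative choices, not ENN's exact formulation:

```python
import math
import torch
import torch.nn as nn

class DCTActivation(nn.Module):
    """Activation expressed as a truncated cosine series with trainable
    coefficients; squashing and initialization are assumptions."""

    def __init__(self, num_coeffs=8):
        super().__init__()
        self.coeffs = nn.Parameter(torch.zeros(num_coeffs))
        with torch.no_grad():
            self.coeffs[1] = 1.0  # one nonzero mode so the output is not 0
        self.register_buffer("k", torch.arange(num_coeffs).float())

    def forward(self, x):
        # Squash to (0, 1) so the cosine basis is evaluated on a bounded set.
        t = 0.5 * (torch.tanh(x) + 1.0)
        basis = torch.cos(math.pi * self.k * t.unsqueeze(-1))  # (..., K)
        return (basis * self.coeffs).sum(-1)
```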
arXiv Detail & Related papers (2023-07-02T21:46:30Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- First Power Linear Unit with Sign [0.0]
It is inspired by the common inverse operation and carries an intuitive bionic interpretation.
We extend the presented function to a more generalized form, PFPLUS, with two parameters that can be fixed or learnable.
arXiv Detail & Related papers (2021-11-29T06:47:58Z)
- Going Beyond Linear RL: Sample Efficient Neural Function Approximation [76.57464214864756]
We study function approximation with two-layer neural networks.
Our results significantly improve upon what can be attained with linear (or eluder dimension) methods.
arXiv Detail & Related papers (2021-07-14T03:03:56Z)
- High-dimensional Functional Graphical Model Structure Learning via Neighborhood Selection Approach [15.334392442475115]
We propose a neighborhood selection approach to estimate the structure of functional graphical models.
We thus circumvent the need for a well-defined precision operator that may not exist when the functions are infinite dimensional.
arXiv Detail & Related papers (2021-05-06T07:38:50Z)
- Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e., structured multi-scale feature learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z)
- UNIPoint: Universally Approximating Point Processes Intensities [125.08205865536577]
We provide a proof that a class of learnable functions can universally approximate any valid intensity function.
We implement UNIPoint, a novel neural point process model, using recurrent neural networks to parameterise sums of basis functions at each event.
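Here is a hedged PyTorch sketch of that construction: an RNN over inter-event gaps whose hidden state is mapped to the parameters of a sum of basis functions, with a softplus link keeping the intensity positive. The exponential basis and layer sizes are assumptions for illustration, not the paper's exact setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UNIPointSketch(nn.Module):
    """RNN hidden state -> parameters of a sum of basis functions that
    define the intensity over time since the last event (a sketch)."""

    def __init__(self, hidden=32, num_basis=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.to_params = nn.Linear(hidden, 2 * num_basis)  # weights, rates

    def intensity(self, inter_times, tau):
        """inter_times: (B, N, 1) gaps between past events; tau: (B,)
        elapsed time since the last event. Returns lambda(tau), shape (B,)."""
        _, h = self.rnn(inter_times)               # h: (1, B, hidden)
        w, r = self.to_params(h[-1]).chunk(2, -1)  # each (B, num_basis)
        # Sum of exponential basis functions, softplus for positivity.
        basis = w * torch.exp(-F.softplus(r) * tau.unsqueeze(-1))
        return F.softplus(basis.sum(-1))
```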
arXiv Detail & Related papers (2020-07-28T09:31:56Z)