Regular-pattern-sensitive CRFs for Distant Label Interactions
- URL: http://arxiv.org/abs/2411.12484v1
- Date: Tue, 19 Nov 2024 13:08:03 GMT
- Title: Regular-pattern-sensitive CRFs for Distant Label Interactions
- Authors: Sean Papay, Roman Klinger, Sebastian Padó
- Abstract summary: Regular-pattern-sensitive CRFs (RPCRFs) are a method of enriching standard linear-chain CRFs with the ability to learn long-distance label interactions.
We show how an RPCRF can be automatically constructed from a set of user-specified patterns, and demonstrate the model's effectiveness on synthetic data.
- Score: 10.64258723923874
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Linear-chain conditional random fields (CRFs) are a common model component for sequence labeling tasks when modeling the interactions between different labels is important. However, the Markov assumption limits linear-chain CRFs to only directly modeling interactions between adjacent labels. Weighted finite-state transducers (FSTs) are a related approach which can be made to model distant label-label interactions, but exact label inference is intractable for these models in the general case, and the task of selecting an appropriate automaton structure for the desired interaction types poses a practical challenge. In this work, we present regular-pattern-sensitive CRFs (RPCRFs), a method of enriching standard linear-chain CRFs with the ability to learn long-distance label interactions which occur in user-specified patterns. This approach allows users to write regular-expression label patterns concisely specifying which types of interactions the model should take into account, allowing the model to learn from data whether and in which contexts these patterns occur. The result can be interpreted alternatively as a CRF augmented with additional, non-local potentials, or as a finite-state transducer whose structure is defined by a set of easily-interpretable patterns. Critically, unlike the general case for FSTs (and for non-chain CRFs), exact training and inference are tractable for many pattern sets. In this work, we detail how an RPCRF can be automatically constructed from a set of user-specified patterns, and demonstrate the model's effectiveness on synthetic data, showing how different types of patterns can capture different nonlocal dependency structures in label sequences.
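The core construction described in the abstract can be sketched in a few lines. The toy below (illustrative only, not the authors' implementation) decodes over the product of the label set and a small hand-built DFA tracking one label pattern, here "B I+"; a pattern weight fires whenever the DFA completes or extends a match, so the non-local potential becomes local in the product space and exact Viterbi decoding stays tractable. The labels, scores, and pattern are all assumed for illustration.

```python
# Toy sketch of the RPCRF idea (not the authors' code): Viterbi decoding
# over the product of the label set and a DFA for the pattern "B I+".
LABELS = ["O", "B", "I"]

def dfa_step(state, label):
    """Hand-built DFA for the label pattern "B I+".
    States: 0 = no progress, 1 = saw B / inside a match.
    Returns (next_state, match_fired)."""
    if label == "B":
        return 1, False
    if label == "I" and state == 1:
        return 1, True          # "B I+" matched (or extended)
    return 0, False

def viterbi(emissions, trans_w, pattern_w):
    """emissions: list of {label: score} per token;
    trans_w: {(y_prev, y): score} ordinary CRF transition weights."""
    best = {}
    for y in LABELS:            # first token
        q, hit = dfa_step(0, y)
        best[(y, q)] = (emissions[0][y] + (pattern_w if hit else 0.0), [y])
    for em in emissions[1:]:
        new = {}
        for (y_prev, q_prev), (score, path) in best.items():
            for y in LABELS:
                q, hit = dfa_step(q_prev, y)
                s = score + em[y] + trans_w.get((y_prev, y), 0.0)
                if hit:
                    s += pattern_w  # non-local pattern potential, now local
                if (y, q) not in new or s > new[(y, q)][0]:
                    new[(y, q)] = (s, path + [y])
        best = new
    return max(best.values())

ems = [{"O": 0.1, "B": 1.0, "I": 0.0},
       {"O": 0.6, "B": 0.0, "I": 0.5},
       {"O": 0.6, "B": 0.0, "I": 0.5}]
score, path = viterbi(ems, {}, pattern_w=2.0)   # path: ["B", "I", "I"]
```

With a positive pattern weight the decoder prefers completing the full "B I I" match; with the weight at zero it falls back to the per-token maxima "B O O". The state space grows only by the number of DFA states, which is why inference remains exact and efficient for many pattern sets.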
Related papers
- FreDF: Learning to Forecast in Frequency Domain [56.24773675942897]
Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences.
We introduce the Frequency-enhanced Direct Forecast (FreDF) which bypasses the complexity of label autocorrelation by learning to forecast in the frequency domain.
arXiv Detail & Related papers (2024-02-04T08:23:41Z) - Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z) - Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources as labeling functions.
Existing statistical label models typically rely only on the outputs of LFs, ignoring instance features when modeling the underlying generative process.
arXiv Detail & Related papers (2022-10-06T07:28:53Z) - Dependency Structure Misspecification in Multi-Source Weak Supervision Models [15.125993628007972]
We study the effects of label model misspecification on test set performance of a downstream classifier.
We derive novel theoretical bounds on the modeling error and empirically show that this error can be substantial.
arXiv Detail & Related papers (2021-06-18T18:15:44Z) - A Unified Generative Adversarial Network Training via Self-Labeling and Self-Attention [38.31735499785227]
We propose a novel GAN training scheme that can handle any level of labeling in a unified manner.
Our scheme introduces a form of artificial labeling that can incorporate manually defined labels, when available.
We evaluate our approach on CIFAR-10, STL-10 and SVHN, and show that both self-labeling and self-attention consistently improve the quality of generated data.
arXiv Detail & Related papers (2021-06-18T04:40:26Z) - Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition [55.362258027878966]
We present momentum pseudo-labeling (MPL) as a simple yet effective strategy for semi-supervised speech recognition.
MPL consists of a pair of online and offline models that interact and learn from each other, inspired by the mean teacher method.
The experimental results demonstrate that MPL effectively improves over the base model and is scalable to different semi-supervised scenarios.
arXiv Detail & Related papers (2021-06-16T16:24:55Z) - Constraining Linear-chain CRFs to Regular Languages [10.759863489447204]
A major challenge in structured prediction is to represent the interdependencies within output structures.
We present a generalization of CRFs that can enforce a broad class of constraints, including nonlocal ones.
We prove that constrained training is never worse than constrained decoding, and show empirically that it can be substantially better in practice.
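The constrained-decoding side of this idea is easy to sketch (under assumed BIO tagging conventions, not the paper's actual setup): Viterbi decoding simply prunes any transition that would leave the regular language of valid label sequences, here by forbidding "I" after "O" and sentence-initially. Constrained training, by contrast, renormalizes the model over only the valid sequences.

```python
# Minimal sketch (assumed BIO setup, not the paper's code): Viterbi decoding
# restricted to valid BIO sequences by masking the illegal O -> I transition
# and a sentence-initial I.
LABELS = ["O", "B", "I"]
ILLEGAL = {("O", "I")}          # I may only continue a span opened by B or I

def constrained_decode(emissions):
    """emissions: list of {label: score} per token."""
    best = {y: (emissions[0][y], [y]) for y in LABELS if y != "I"}
    for em in emissions[1:]:
        new = {}
        for y_prev, (score, path) in best.items():
            for y in LABELS:
                if (y_prev, y) in ILLEGAL:
                    continue    # prune paths outside the constraint language
                s = score + em[y]
                if y not in new or s > new[y][0]:
                    new[y] = (s, path + [y])
        best = new
    return max(best.values())

ems = [{"O": 1.0, "B": 0.1, "I": 0.9},
       {"O": 0.0, "B": 0.1, "I": 1.2}]
score, path = constrained_decode(ems)   # path: ["B", "I"]; "O I" is pruned
```

Even though "O" then "I" are the per-token maxima here, the constraint forces the globally best valid sequence "B I".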
arXiv Detail & Related papers (2021-06-14T11:23:59Z) - Equivalence of Segmental and Neural Transducer Modeling: A Proof of Concept [56.46135010588918]
We prove that the widely used class of RNN-Transducer models and segmental models (direct HMM) are equivalent.
It is shown that blank probabilities translate into segment length probabilities and vice versa.
arXiv Detail & Related papers (2021-04-13T11:20:48Z) - Label Confusion Learning to Enhance Text Classification Models [3.0251266104313643]
The Label Confusion Model (LCM) learns label confusion to capture semantic overlap among labels.
LCM can generate a better label distribution to replace the original one-hot label vector.
Experiments on five text classification benchmark datasets demonstrate the effectiveness of LCM for several widely used deep learning classification models.
arXiv Detail & Related papers (2020-12-09T11:34:35Z) - Neural Latent Dependency Model for Sequence Labeling [47.32215014130811]
A classic approach to sequence labeling is the linear-chain conditional random field (CRF).
One limitation of linear-chain CRFs is their inability to model long-range dependencies between labels.
High-order CRFs extend linear-chain CRFs to dependencies no longer than their order, but their computational complexity grows exponentially in the order.
We propose a Neural Latent Dependency Model (NLDM) that models dependencies of arbitrary length between labels via a latent tree structure.
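The exponential blow-up mentioned above is easy to make concrete: an order-k CRF's dynamic program keeps one score per length-k label history, so the Viterbi table has |Y|^k states per position. A trivial illustration, with an assumed label-set size of 10:

```python
# Illustration of the complexity claim above: an order-k CRF's Viterbi
# table keeps one entry per length-k label history, i.e. |Y|**k states
# per position (so per-step cost grows as |Y|**(k + 1)).
def num_histories(num_labels: int, order: int) -> int:
    return num_labels ** order

sizes = {k: num_histories(10, k) for k in range(1, 5)}
# sizes == {1: 10, 2: 100, 3: 1000, 4: 10000}
```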
arXiv Detail & Related papers (2020-11-10T10:05:21Z) - Semi-Supervised Speech Recognition via Graph-based Temporal Classification [59.58318952000571]
Semi-supervised learning has demonstrated promising results in automatic speech recognition by self-training.
The effectiveness of this approach largely relies on the pseudo-label accuracy.
Alternative ASR hypotheses of an N-best list can provide more accurate labels for an unlabeled speech utterance.
arXiv Detail & Related papers (2020-10-29T14:56:56Z) - Robust Question Answering Through Sub-part Alignment [53.94003466761305]
We model question answering as an alignment problem.
We train our model on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets.
arXiv Detail & Related papers (2020-04-30T09:10:57Z) - Multi-Label Text Classification using Attention-based Graph Neural Network [0.0]
A graph attention network-based model is proposed to capture the attentive dependency structure among the labels.
The proposed model achieves similar or better performance compared to the previous state-of-the-art models.
arXiv Detail & Related papers (2020-03-22T17:12:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.