Convex Polytope Modelling for Unsupervised Derivation of Semantic
Structure for Data-efficient Natural Language Understanding
- URL: http://arxiv.org/abs/2201.10588v1
- Date: Tue, 25 Jan 2022 19:12:44 GMT
- Title: Convex Polytope Modelling for Unsupervised Derivation of Semantic
Structure for Data-efficient Natural Language Understanding
- Authors: Jingyan Zhou, Xiaohan Feng, King Keung Wu, Helen Meng
- Abstract summary: A Convex-Polytopic-Model-based framework shows great potential in automatically extracting semantic patterns by exploiting the raw dialog corpus.
We show that this framework can exploit semantic-frame-related features in the corpus, reveal the underlying semantic structure of the utterances, and boost the performance of the state-of-the-art NLU model with minimal supervision.
- Score: 31.888489552069146
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Popular approaches to Natural Language Understanding (NLU) usually rely on a huge amount of annotated data or on handcrafted rules, which is laborious and does not adapt well to domain extension. We recently proposed a Convex-Polytopic-Model-based (CPM) framework that shows great potential for automatically extracting semantic patterns from a raw dialog corpus. The extracted semantic patterns can be used to generate semantic frames, which are essential in assisting NLU tasks. This paper further studies the CPM framework in depth and visualizes its high interpretability and transparency at various levels. We show that this framework can exploit semantic-frame-related features in the corpus, reveal the underlying semantic structure of the utterances, and boost the performance of a state-of-the-art NLU model with minimal supervision. We conduct our experiments on the ATIS (Air Travel Information System) corpus.
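To make the convex-polytopic idea concrete, below is a minimal, hypothetical sketch: each utterance's bag-of-words vector is approximated as a convex combination of a small set of learned vertex vectors, whose dominant words can then be read off as candidate semantic patterns. The simplex-projection routine, the `fit_polytope` helper, and the toy ATIS-style utterances are all illustrative assumptions, not the authors' CPM algorithm.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def project_rows_to_simplex(w):
    # Euclidean projection of each row of w onto the probability simplex
    # (Duchi et al., 2008), so rows are valid convex-combination weights.
    n, k = w.shape
    u = np.sort(w, axis=1)[:, ::-1]                  # sort each row descending
    css = np.cumsum(u, axis=1)
    rho = np.sum(u * np.arange(1, k + 1) > css - 1.0, axis=1) - 1
    theta = (css[np.arange(n), rho] - 1.0) / (rho + 1.0)
    return np.maximum(w - theta[:, None], 0.0)

def fit_polytope(X, n_vertices=3, n_iters=500, lr=1e-2, seed=0):
    # Alternating projected-gradient fit of X ~ W @ V, where each row of W
    # lies on the simplex: every utterance is a convex combination of vertices.
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    W = project_rows_to_simplex(rng.random((n, n_vertices)))
    V = X[rng.choice(n, size=n_vertices, replace=False)].copy()
    for _ in range(n_iters):
        R = W @ V - X                                    # reconstruction residual
        W = project_rows_to_simplex(W - lr * (R @ V.T))  # keep W on the simplex
        V -= lr * (W.T @ R)                              # move vertices freely
    return W, V

utterances = [
    "show me flights from boston to denver",
    "what is the cheapest fare to atlanta",
    "list flights arriving in dallas tomorrow",
    "i need a morning flight from pittsburgh",
]
vec = CountVectorizer()
X = vec.fit_transform(utterances).toarray().astype(float)
W, V = fit_polytope(X)
vocab = vec.get_feature_names_out()
for j, v in enumerate(V):   # top words of each vertex ~ one candidate pattern
    top = vocab[np.argsort(v)[::-1][:3]]
    print(f"vertex {j}: {list(top)}  weights: {W[:, j].round(2)}")
```

Inspecting the dominant vocabulary of each vertex then plays the role the abstract ascribes to semantic patterns: interpretable anchors from which semantic frames could be drafted with minimal supervision.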
Related papers
- Deep Sparse Latent Feature Models for Knowledge Graph Completion [24.342670268545085]
In this paper, we introduce a novel framework of sparse latent feature models for knowledge graphs.
Our approach not only effectively completes missing triples but also provides clear interpretability of the latent structures.
Our method significantly improves performance by revealing latent communities and producing interpretable representations.
arXiv Detail & Related papers (2024-11-24T03:17:37Z)
- HIP: Hierarchical Point Modeling and Pre-training for Visual Information Extraction [24.46493675079128]
OCR-dependent methods rely on offline OCR engines, while OCR-free methods might produce outputs that lack interpretability or contain hallucinated content.
We propose HIP, which models entities as HIerarchical Points to better conform to the hierarchical nature of the end-to-end VIE task.
Specifically, such hierarchical points can be flexibly encoded and subsequently decoded into desired text transcripts, centers of various regions, and categories of entities.
arXiv Detail & Related papers (2024-11-02T05:00:13Z)
- Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models [43.56153167864033]
We propose a novel approach to harnessing structured knowledge in large language models (LLMs).
We introduce a relationship-guided attention module to capture pair-wise associations among entities and attributes for low-level prompt learning.
In addition, by incorporating high-level and global-level prompts, the proposed hierarchical structure forges cross-level interlinks and empowers the model to handle more complex and long-term relationships.
arXiv Detail & Related papers (2023-12-11T12:14:06Z)
- Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling [0.0]
The black-box structure of the deep neural network in pre-trained language models seriously limits the interpretability of the language modeling process.
We propose a Word-Context-Coupled Space (W2CSpace) by introducing alignment processing between uninterpretable neural representations and interpretable statistical logic.
Our language model achieves better performance and highly credible interpretability compared with related state-of-the-art methods.
arXiv Detail & Related papers (2023-05-19T09:26:02Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves a new state of the art on all the structured prediction tasks we evaluated.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
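As a concrete illustration of casting structure prediction as autoregressive action generation, here is a small, hypothetical sketch that linearizes labeled spans into the kind of action sequence a PLM could be trained to emit. The OPEN/SHIFT/CLOSE inventory and the helper names are assumptions for illustration, not the paper's actual scheme.

```python
# Hypothetical sketch: encode labeled, non-overlapping spans as a
# left-to-right action sequence, and decode that sequence back.

def spans_to_actions(tokens, spans):
    """Encode (start, end_exclusive, label) spans as actions."""
    starts = {s: lab for s, e, lab in spans}
    ends = {e for s, e, lab in spans}
    actions = []
    for i, _ in enumerate(tokens):
        if i in starts:
            actions.append(f"OPEN-{starts[i]}")  # begin a labeled span
        actions.append("SHIFT")                  # consume one token
        if i + 1 in ends:
            actions.append("CLOSE")              # finish the open span
    return actions

def actions_to_spans(actions):
    """Decode an action sequence back into labeled spans."""
    spans, i, open_span = [], 0, None
    for a in actions:
        if a.startswith("OPEN-"):
            open_span = (i, a[len("OPEN-"):])
        elif a == "SHIFT":
            i += 1
        elif a == "CLOSE":
            start, label = open_span
            spans.append((start, i, label))
            open_span = None
    return spans

tokens = ["show", "flights", "from", "boston", "to", "denver"]
spans = [(3, 4, "CITY"), (5, 6, "CITY")]
acts = spans_to_actions(tokens, spans)
assert actions_to_spans(acts) == spans   # lossless round trip
print(acts)
```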
- Guiding the PLMs with Semantic Anchors as Intermediate Supervision: Towards Interpretable Semantic Parsing [57.11806632758607]
We propose to couple current pretrained language models with a hierarchical decoder network.
By taking the first-principle structures as the semantic anchors, we propose two novel intermediate supervision tasks.
We conduct intensive experiments on several semantic parsing benchmarks and demonstrate that our approach can consistently outperform the baselines.
arXiv Detail & Related papers (2022-10-04T07:27:29Z)
- Nested Named Entity Recognition as Holistic Structure Parsing [92.8397338250383]
This work models all the nested NEs in a sentence as one holistic structure, then proposes a holistic structure parsing algorithm to disclose them all at once.
Experiments show that our model yields promising results on widely used benchmarks, approaching or even achieving the state of the art.
arXiv Detail & Related papers (2022-04-17T12:48:20Z)
- Syntactic and Semantic-driven Learning for Open Information Extraction [42.65591370263333]
One of the biggest bottlenecks in building accurate, high-coverage neural open IE systems is the need for large labelled corpora.
We propose a syntactic and semantic-driven learning approach, which can learn neural open IE models without any human-labelled data.
arXiv Detail & Related papers (2021-03-05T02:59:40Z)
- Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z)
- A Dependency Syntactic Knowledge Augmented Interactive Architecture for End-to-End Aspect-based Sentiment Analysis [73.74885246830611]
We propose a novel dependency syntactic knowledge augmented interactive architecture with multi-task learning for end-to-end ABSA.
This model is capable of fully exploiting the syntactic knowledge (dependency relations and types) by leveraging a well-designed Dependency Relation Embedded Graph Convolutional Network (DreGcn).
Extensive experimental results on three benchmark datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2020-04-04T14:59:32Z)