POTATO: exPlainable infOrmation exTrAcTion framewOrk
- URL: http://arxiv.org/abs/2201.13230v1
- Date: Mon, 31 Jan 2022 13:43:02 GMT
- Title: POTATO: exPlainable infOrmation exTrAcTion framewOrk
- Authors: Ádám Kovács, Kinga Gémes, Eszter Iklódi, Gábor Recski
- Abstract summary: We present POTATO, a task- and language-independent framework for human-in-the-loop (HITL) learning of rule-based text classifiers using graph-based features.
A streamlit-based user interface allows users to build rule systems from graph patterns, provides real-time evaluation based on ground truth data, and suggests rules by ranking graph features using interpretable machine learning models.
POTATO is applied in projects across domains and languages, including classification tasks on German legal text and English social media data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present POTATO, a task- and language-independent framework for
human-in-the-loop (HITL) learning of rule-based text classifiers using
graph-based features. POTATO handles any type of directed graph and supports
parsing text into Abstract Meaning Representations (AMR), Universal
Dependencies (UD), and 4lang semantic graphs. A streamlit-based user interface
allows users to build rule systems from graph patterns, provides real-time
evaluation based on ground truth data, and suggests rules by ranking graph
features using interpretable machine learning models. Users can also provide
patterns over graphs using regular expressions, and POTATO can recommend
refinements of such rules. POTATO is applied in projects across domains and
languages, including classification tasks on German legal text and English
social media data. All components of our system are written in Python, can be
installed via pip, and are released under an MIT License on GitHub.
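The idea of rule systems built from graph patterns and evaluated against ground truth can be illustrated with a minimal sketch. This is not POTATO's API: the triple encoding of graphs, the function names (`matches`, `classify`, `evaluate`, `rank_edge_features`), the toy data, and the `OFFENSIVE` label are all hypothetical. Two simplifications are made on purpose: pattern edges are matched independently rather than as a connected subgraph, and candidate features are ranked by plain precision as a stand-in for the interpretable machine learning models mentioned in the abstract.

```python
import re
from collections import Counter

# Hypothetical encoding: a graph is a set of (source, edge_label, target) triples.


def matches(pattern, graph):
    """True if every pattern edge matches some edge of the graph.

    A pattern edge is (source_regex, edge_label, target_regex); each regex
    must match the full node name. Edges are matched independently, so this
    is a simplification of real subgraph matching.
    """
    return all(
        any(
            label == g_label
            and re.fullmatch(src, g_src)
            and re.fullmatch(tgt, g_tgt)
            for g_src, g_label, g_tgt in graph
        )
        for src, label, tgt in pattern
    )


def classify(rules, graph):
    """Return the labels for which at least one pattern matches the graph."""
    return {
        label
        for label, patterns in rules.items()
        if any(matches(p, graph) for p in patterns)
    }


def evaluate(rules, dataset, label):
    """Precision and recall of `label` over (graph, gold_labels) pairs."""
    tp = fp = fn = 0
    for graph, gold in dataset:
        predicted = label in classify(rules, graph)
        actual = label in gold
        tp += predicted and actual
        fp += predicted and not actual
        fn += actual and not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def rank_edge_features(dataset, label):
    """Rank single edges by precision for `label`, then by support."""
    total, positive = Counter(), Counter()
    for graph, gold in dataset:
        for edge in graph:
            total[edge] += 1
            positive[edge] += label in gold
    return sorted(total, key=lambda e: (-positive[e] / total[e], -total[e], e))


# Toy ground-truth data (invented for illustration only).
dataset = [
    ({("hate", "obj", "you"), ("hate", "subj", "I")}, {"OFFENSIVE"}),
    ({("love", "obj", "you"), ("love", "subj", "I")}, set()),
    ({("despise", "obj", "them"), ("despise", "subj", "we")}, {"OFFENSIVE"}),
    ({("see", "obj", "you"), ("see", "subj", "I")}, set()),
]

# One rule: a single-edge pattern with a regular expression over node names.
rules = {"OFFENSIVE": [[("hate|despise", "obj", ".*")]]}

precision, recall = evaluate(rules, dataset, "OFFENSIVE")
suggestions = rank_edge_features(dataset, "OFFENSIVE")[:3]
```

In a human-in-the-loop setting, a user would inspect `suggestions`, add or refine patterns in `rules`, and re-run `evaluate` to see the effect on precision and recall.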
Related papers
- LLM as GNN: Graph Vocabulary Learning for Text-Attributed Graph Foundation Models
Text-Attributed Graphs (TAGs) are ubiquitous in real-world scenarios.
Despite large efforts to integrate Large Language Models (LLMs) and Graph Neural Networks (GNNs) for TAGs, existing approaches suffer from decoupled architectures.
We propose PromptGFM, a versatile GFM for TAGs grounded in graph vocabulary learning.
(2025-03-05T09:45:22Z)
- Can LLMs Convert Graphs to Text-Attributed Graphs?
We propose Topology-Aware Node description Synthesis (TANS) to convert existing graphs into text-attributed graphs.
We evaluate our TANS on text-rich, text-limited, and text-free graphs, demonstrating its applicability.
(2024-12-13T13:32:59Z)
- UniGLM: Training One Unified Language Model for Text-Attributed Graphs
Unified Graph Language Model (UniGLM) is a graph embedding model that generalizes well to both in-domain and cross-domain TAGs.
UniGLM includes an adaptive positive sample selection technique for identifying structurally similar nodes and a lazy contrastive module that is devised to accelerate training.
(2024-06-17T19:45:21Z)
- G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering
We develop a flexible question-answering framework targeting real-world textual graphs.
We introduce the first retrieval-augmented generation (RAG) approach for general textual graphs.
G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem.
(2024-02-12T13:13:04Z)
- Large Language Models on Graphs: A Comprehensive Survey
We provide a systematic review of scenarios and techniques related to large language models on graphs.
We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs.
We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
(2023-12-05T14:14:27Z)
- Learning Multiplex Representations on Text-Attributed Graphs with One Language Model Encoder
We propose METAG, a new framework for learning Multiplex rEpresentations on Text-Attributed Graphs.
In contrast to existing methods, METAG uses one text encoder to model the shared knowledge across relations.
We conduct experiments on nine downstream tasks in five graphs from both academic and e-commerce domains.
(2023-10-10T14:59:22Z)
- Generating Semantic Graph Corpora with Graph Expansion Grammar
Lovelace is a tool for creating corpora of semantic graphs.
The system uses graph expansion grammar as a representational language.
Central use cases are the creation of synthetic data to augment existing corpora, and as a pedagogical tool for teaching formal language theory.
(2023-09-15T19:10:19Z)
- ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings
We propose Contrastive Graph-Text pretraining (ConGraT) for jointly learning separate representations of texts and nodes in a text-attributed graph (TAG).
Our method trains a language model (LM) and a graph neural network (GNN) to align their representations in a common latent space using a batch-wise contrastive learning objective inspired by CLIP.
Experiments demonstrate that ConGraT outperforms baselines on various downstream tasks, including node and text category classification, link prediction, and language modeling.
(2023-05-23T17:53:30Z)
- A Case Study for Compliance as Code with Graphs and Language Models: Public release of the Regulatory Knowledge Graph
The paper focuses on Abu Dhabi Global Market regulations and taxonomy.
It involves manually tagging a portion of the regulations and training BERT-based models, which are then applied to the rest of the corpus.
Coreference resolution and syntax analysis were used to parse the relationships between the tagged entities.
(2023-02-03T16:37:08Z)
- GraphQ IR: Unifying Semantic Parsing of Graph Query Language with Intermediate Representation
We propose a unified intermediate representation (IR) for graph query languages, namely GraphQ IR.
With the IR's natural-language-like representation that bridges the semantic gap and its formally defined syntax that maintains the graph structure, neural semantic parsing can more effectively convert user queries into GraphQ IR.
Our approach can consistently achieve state-of-the-art performance on KQA Pro, Overnight and MetaQA.
(2022-05-24T13:59:53Z)
- GraphFormers: GNN-nested Transformers for Representation Learning on Textual Graph
We propose GraphFormers, where layerwise GNN components are nested alongside the transformer blocks of language models.
With the proposed architecture, the text encoding and the graph aggregation are fused into an iterative workflow.
In addition, a progressive learning strategy is introduced, where the model is successively trained on manipulated data and original data to reinforce its capability of integrating information on graph.
(2021-05-06T12:20:41Z)
- Coordinate Constructions in English Enhanced Universal Dependencies: Analysis and Computational Modeling
We address the representation of coordinate constructions in Enhanced Universal Dependencies (UD).
We create a large-scale dataset of manually edited syntax graphs.
We identify several systematic errors in the original data, and propose to also propagate adjuncts.
(2021-03-16T10:24:27Z)
- Automatic Extraction of Rules Governing Morphological Agreement
We develop an automated framework for extracting a first-pass grammatical specification from raw text.
We focus on extracting rules describing agreement, a morphosyntactic phenomenon at the core of the grammars of many of the world's languages.
We apply our framework to all languages included in the Universal Dependencies project, with promising results.
(2020-10-02T18:31:45Z)