Reducing Formal Context Extraction: A Newly Proposed Framework from Big Corpora
- URL: http://arxiv.org/abs/2504.06285v1
- Date: Tue, 01 Apr 2025 09:24:07 GMT
- Title: Reducing Formal Context Extraction: A Newly Proposed Framework from Big Corpora
- Authors: Bryar A. Hassan, Shko M. Qader, Alla A. Hassan, Joan Lu, Aram M. Ahmed, Jafar Majidpour, Tarik A. Rashid
- Abstract summary: This study proposes a framework for reducing formal context in extracting concept hierarchies from free text. We achieve this by reducing the size of the formal context using a hybrid of a WordNet-based method and a frequency-based technique.
- Score: 5.045556232232993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automating the extraction of concept hierarchies from free text is advantageous because manual generation is frequently labor- and resource-intensive. As a result, the whole procedure for learning concept hierarchies from free text entails several phases, including sentence-level text processing, sentence splitting, and tokenization. Lemmatization follows, after which formal concept analysis (FCA) is used to derive the pairings. Nevertheless, the formal context may contain some uninteresting and incorrect pairings. Generating the formal context can also take a long time; reducing its size is therefore necessary to weed out irrelevant and incorrect pairings so that the concept lattice and hierarchies can be extracted more quickly. This study proposes a framework for reducing the formal context when extracting concept hierarchies from free text, with the goal of reducing the ambiguity of the formal context. We achieve this by reducing the size of the formal context using a hybrid of a WordNet-based method and a frequency-based technique. Using 385 samples from the Wikipedia corpus and the proposed framework, tests are carried out to examine the reduced formal context and the concept lattice and concept hierarchy derived from it. The lattice generated from the reduced formal context is compared to the standard one using concept lattice invariants. The homomorphism between the two lattices retains up to 98% of the quality of the generated concept hierarchies, and the reduced concept lattice inherits the structural relationships of the standard one. Additionally, the new framework is compared to five baseline techniques in terms of running time on random datasets with various densities. The findings demonstrate that, across various fill ratios, the hybrid approach of the proposed method outperforms the other competing strategies in concept lattice performance.
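To make the idea of formal context reduction concrete, here is a minimal sketch of the frequency-based half only. It is not the authors' implementation: the WordNet-based step is omitted, and the example pairs, the `min_count` threshold, and the function names `reduce_context` and `formal_concepts` are all hypothetical. It treats a formal context as a set of (object, attribute) pairs, drops pairs seen fewer than `min_count` times, and enumerates the formal concepts by brute force so the lattice sizes before and after reduction can be compared.

```python
from collections import Counter
from itertools import combinations


def reduce_context(pairs, min_count=2):
    """Keep only (object, attribute) pairs observed at least min_count times.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    counts = Counter(pairs)
    return {pair for pair, n in counts.items() if n >= min_count}


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    Brute force over object subsets: exponential, so only for toy examples.
    """
    objects = sorted({o for o, _ in context})
    attributes = sorted({a for _, a in context})

    def intent(objs):
        # Attributes shared by every object in objs.
        return frozenset(
            a for a in attributes if all((o, a) in context for o in objs)
        )

    def extent(attrs):
        # Objects that have every attribute in attrs.
        return frozenset(
            o for o in objects if all((o, a) in context for a in attrs)
        )

    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            shared = intent(objs)
            concepts.add((extent(shared), shared))
    return concepts


# Hypothetical (noun, verb) pairs as they might come out of a parsed corpus.
raw_pairs = (
    [("hotel", "book")] * 2
    + [("apartment", "book")] * 2
    + [("apartment", "rent")] * 2
    + [("car", "rent")] * 2
    + [("car", "drive")] * 2
    + [("hotel", "drive"), ("car", "eat")]  # rare pairs, likely noise
)

reduced = reduce_context(raw_pairs, min_count=2)
full_lattice = formal_concepts(set(raw_pairs))
reduced_lattice = formal_concepts(reduced)
print(len(full_lattice), "concepts before,", len(reduced_lattice), "after")
```

On this toy input the two rare pairs are removed and the lattice shrinks accordingly. A real pipeline would need a principled threshold and a check that the reduced lattice still preserves the structure of the original one, which is what the paper evaluates with lattice invariants.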
Related papers
- A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition [32.142713322062306]
Text recognition systems often depend on large end-to-end architectures that require extensive training and are prohibitively expensive for real-time scenarios. We propose a training-free plug-and-play framework that leverages the strengths of pre-trained text recognizers while minimizing redundant computations. Our approach uses context-based understanding and introduces an attention-based segmentation stage, which refines candidate text regions at the pixel level.
arXiv Detail & Related papers (2025-03-19T18:51:01Z) - Ontology Learning Using Formal Concept Analysis and WordNet [0.9065034043031668]
This project and dissertation provide a Formal Concept Analysis and WordNet framework for learning concept hierarchies from free texts.
We compute the formal concept lattice and create a classical concept hierarchy.
Despite several system constraints and component discrepancies that may prevent firm conclusions, the results suggest that the concept hierarchies produced in this project and dissertation are promising.
arXiv Detail & Related papers (2023-11-10T08:28:30Z) - RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment [112.45442468794658]
We propose a two-stage coarse-to-fine semantic re-alignment method, named RealignDiff.
In the coarse semantic re-alignment phase, a novel caption reward is proposed to evaluate the semantic discrepancy between the generated image caption and the given text prompt.
The fine semantic re-alignment stage employs a local dense caption generation module and a re-weighting attention modulation module to refine the previously generated images from a local semantic view.
arXiv Detail & Related papers (2023-05-31T06:59:21Z) - Conjunct Resolution in the Face of Verbal Omissions [51.220650412095665]
We propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
arXiv Detail & Related papers (2023-05-26T08:44:02Z) - Variational Cross-Graph Reasoning and Adaptive Structured Semantics Learning for Compositional Temporal Grounding [143.5927158318524]
Temporal grounding is the task of locating a specific segment from an untrimmed video according to a query sentence.
We introduce a new Compositional Temporal Grounding task and construct two new dataset splits.
We argue that the inherent structured semantics inside the videos and language are the crucial factor in achieving compositional generalization.
arXiv Detail & Related papers (2023-01-22T08:02:23Z) - Progressive Tree-Structured Prototype Network for End-to-End Image Captioning [74.8547752611337]
We propose a novel Progressive Tree-Structured prototype Network (dubbed PTSN).
PTSN is the first attempt to narrow down the scope of prediction words with appropriate semantics by modeling the hierarchical textual semantics.
Our method achieves new state-of-the-art performance with 144.2% (single model) and 146.5% (ensemble of 4 models) CIDEr scores on the Karpathy split and 141.4% (c5) and 143.9% (c40) CIDEr scores on the official online test server.
arXiv Detail & Related papers (2022-11-17T11:04:00Z) - The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction and Constrained Decoding [65.34601470417967]
We describe a hybrid architecture for dialogue response generation that combines the strengths of neural language modeling and rule-based generation.
Our experiments show that this system outperforms both rule-based and learned approaches in human evaluations of fluency, relevance, and truthfulness.
arXiv Detail & Related papers (2022-09-16T09:00:49Z) - Keywords and Instances: A Hierarchical Contrastive Learning Framework Unifying Hybrid Granularities for Text Generation [60.62039705180484]
We propose a hierarchical contrastive learning mechanism that unifies semantic meaning across hybrid granularities in the input text.
Experiments demonstrate that our model outperforms competitive baselines on paraphrasing, dialogue generation, and storytelling tasks.
arXiv Detail & Related papers (2022-05-26T13:26:03Z) - Artificial Intelligence Algorithms for Natural Language Processing and the Semantic Web Ontology Learning [0.76146285961466]
A new evolutionary clustering algorithm star (ECA*) is proposed.
Experiments were conducted to evaluate ECA* against five state-of-the-art approaches.
The results indicate that ECA* outperforms the competing techniques in its ability to find the right clusters.
arXiv Detail & Related papers (2021-08-31T11:57:41Z) - Formal context reduction in deriving concept hierarchies from corpora using adaptive evolutionary clustering algorithm star [15.154538450706474]
Deriving concept hierarchies from corpora is typically time-consuming and resource-intensive.
The lattice resulting from the reduced formal context is evaluated against the standard one using concept lattice invariants.
The results show that adaptive ECA* produces the concept lattice faster than the other competing techniques across different fill ratios.
arXiv Detail & Related papers (2021-07-10T07:18:03Z) - Attribute Selection using Contranominal Scales [0.09668407688201358]
Formal Concept Analysis (FCA) makes it possible to analyze binary data by deriving concepts and ordering them in lattices.
The size of such a lattice depends on the number of subcontexts in the corresponding formal context.
We propose the algorithm ContraFinder that enables the computation of all contranominal scales of a given formal context.
arXiv Detail & Related papers (2021-06-21T10:53:50Z)
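As a small illustration of why contranominal scales matter for lattice size, the sketch below builds the contranominal scale of dimension k (each object has every attribute except its own) and counts its formal concepts by brute force. This is not the ContraFinder algorithm; the function names are hypothetical, and the enumeration is only practical for very small k.

```python
from itertools import combinations


def contranominal_scale(k):
    """Context over {0..k-1}: object i has every attribute j with j != i."""
    return {(i, j) for i in range(k) for j in range(k) if i != j}


def count_concepts(context, k):
    """Count formal concepts as the number of distinct intents of object subsets."""
    intents = set()
    for r in range(k + 1):
        for objs in combinations(range(k), r):
            intents.add(
                frozenset(
                    a for a in range(k) if all((o, a) in context for o in objs)
                )
            )
    return len(intents)


for k in range(1, 5):
    print(k, count_concepts(contranominal_scale(k), k))
```

Every subset of objects yields a different intent here, so the count doubles with each added dimension, which is why such subcontexts drive exponential growth of the lattice.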