Hierarchies over Vector Space: Orienting Word and Graph Embeddings
- URL: http://arxiv.org/abs/2211.01430v2
- Date: Sat, 09 Nov 2024 20:52:16 GMT
- Title: Hierarchies over Vector Space: Orienting Word and Graph Embeddings
- Authors: Xingzhi Guo, Steven Skiena
- Abstract summary: We present a data structure that captures inherent hierarchical properties from an unordered flat embedding space.
Inspired by the notion of distributional generality, our algorithm constructs an arborescence (a directed rooted tree) by inserting nodes in descending order of entity power.
We evaluate the performance of the resulting tree structures on three tasks: hypernym relation discovery, least-common-ancestor (LCA) discovery among words, and Wikipedia page link recovery.
- Score: 14.367560758244624
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Word and graph embeddings are widely used in deep learning applications. We present a data structure that captures inherent hierarchical properties from an unordered flat embedding space, particularly a sense of direction between pairs of entities. Inspired by the notion of distributional generality, our algorithm constructs an arborescence (a directed rooted tree) by inserting nodes in descending order of entity power (e.g., word frequency), pointing each entity to the closest more powerful node as its parent. We evaluate the performance of the resulting tree structures on three tasks: hypernym relation discovery, least-common-ancestor (LCA) discovery among words, and Wikipedia page link recovery. We achieve an average of 8.98% and 2.70% for hypernym and LCA discovery across five languages and 62.76% accuracy on directed Wiki-page link recovery, both substantially above baselines. Finally, we investigate the effect of insertion order, the power/similarity trade-off, and various power sources to optimize parent selection.
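The construction described in the abstract amounts to a simple greedy procedure. Below is a minimal sketch, assuming `embeddings` maps each entity to a unit-normalized NumPy vector and `power` maps each entity to a scalar such as word frequency; the function and variable names are illustrative and not taken from the authors' code.

```python
import numpy as np

def build_arborescence(embeddings: dict, power: dict) -> dict:
    """Return a {child: parent} map describing a directed rooted tree."""
    # Insert entities from most to least powerful (e.g., most to least frequent).
    order = sorted(embeddings, key=lambda e: power[e], reverse=True)
    parent = {order[0]: None}          # the most powerful entity becomes the root
    inserted = [order[0]]
    for entity in order[1:]:
        vec = embeddings[entity]
        # Parent = closest already-inserted (hence more powerful) entity,
        # measured here by cosine similarity of unit-normalized vectors.
        best = max(inserted, key=lambda p: float(np.dot(vec, embeddings[p])))
        parent[entity] = best
        inserted.append(entity)
    return parent
```

Because insertion proceeds in descending power, every parent is at least as "powerful" as its child, which is how the tree encodes a direction between pairs of entities; as the abstract notes, the insertion order and the power/similarity trade-off both affect parent selection.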
Related papers
- Beyond Nearest Neighbors: Semantic Compression and Graph-Augmented Retrieval for Enhanced Vector Search [2.377892000761193]
We introduce a new retrieval paradigm: semantic compression, which aims to select a compact, representative set of vectors that captures the broader semantic structure around a query.
To operationalize this idea, we propose graph-augmented vector retrieval, which overlays semantic graphs (e.g., kNN or knowledge-based links) atop vector spaces.
Our work outlines a foundation for meaning-centric vector search systems, emphasizing hybrid indexing, diversity-aware querying, and structured semantic retrieval.
arXiv Detail & Related papers (2025-07-25T23:35:11Z) - Divide-Then-Rule: A Cluster-Driven Hierarchical Interpolator for Attribute-Missing Graphs [51.13363550716544]
Deep graph clustering is an unsupervised task aimed at partitioning nodes with incomplete attributes into distinct clusters.
Existing imputation methods for attribute-missing graphs often fail to account for the varying amounts of information available across node neighborhoods.
We propose Divide-Then-Rule Graph Completion (DTRGC) to address this issue.
arXiv Detail & Related papers (2025-07-12T03:33:19Z) - The Geometry of Meaning: Perfect Spacetime Representations of Hierarchical Structures [0.0]
We show that there is a fast algorithm that embeds hierarchical structures in three-dimensional Minkowski spacetime.
Our results seem to indicate that all discrete data has a perfect geometrical representation that is three-dimensional.
arXiv Detail & Related papers (2025-05-07T20:41:06Z) - ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels.
Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches.
Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z) - Data-driven Coreference-based Ontology Building [48.995395445597225]
Coreference resolution is traditionally used as a component in individual document understanding.
We take a more global view and explore what we can learn about a domain from the set of all document-level coreference relations.
We release the resulting coreference chains under a Creative Commons license, along with the code.
arXiv Detail & Related papers (2024-10-22T14:30:40Z) - Integrating Hierarchical Semantic into Iterative Generation Model for Entailment Tree Explanation [7.5496857647335585]
We propose an architecture that integrates the Hierarchical Semantics of sentences under the Controller-Generator framework (HiSCG) to explain answers.
The proposed method achieves comparable performance on all three settings of the EntailmentBank dataset.
arXiv Detail & Related papers (2024-09-26T11:46:58Z) - Hierarchical Clustering via Local Search [0.0]
We introduce a local search algorithm for hierarchical clustering.
We show that any locally optimal tree guarantees a revenue of at least $\frac{n-2}{3}\sum_{i<j} w(i,j)$, where $n$ is the number of objects and $w: [n] \times [n] \to \mathbb{R}_+$ is the associated similarity function.
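For context, this bound can be read as a constant-factor guarantee, assuming the standard Moseley-Wang revenue objective (the snippet does not name the objective): each pair $(i,j)$ can contribute at most $(n-2)\,w(i,j)$ to the revenue of any tree, so a locally optimal tree recovers at least a third of the best achievable revenue. A hedged sketch of the argument:

```latex
% Assuming the Moseley--Wang revenue objective, where each pair (i, j)
% contributes at most (n - 2) w(i, j) to any tree's revenue:
\mathrm{rev}(T_{\mathrm{local}})
  \;\ge\; \frac{n-2}{3}\sum_{i<j} w(i,j)
  \;=\; \frac{1}{3}\Big[(n-2)\sum_{i<j} w(i,j)\Big]
  \;\ge\; \frac{1}{3}\,\max_{T}\,\mathrm{rev}(T).
```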
arXiv Detail & Related papers (2024-05-24T23:46:24Z) - HetTree: Heterogeneous Tree Graph Neural Network [12.403166161903378]
HetTree is a novel heterogeneous tree graph neural network that models both the graph structure and heterogeneous aspects.
HetTree builds a semantic tree data structure to capture the hierarchy among metapaths.
Our evaluation of HetTree on a variety of real-world datasets demonstrates that it outperforms all existing baselines on open benchmarks.
arXiv Detail & Related papers (2024-02-21T03:14:45Z) - Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
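The merge rule described above is easy to state concretely. The following is a minimal, naive sketch assuming the data points are rows of a NumPy array `X`; it illustrates merging by maximum average dot product rather than minimum distance, and is not the paper's implementation.

```python
import numpy as np

def dot_product_agglomeration(X: np.ndarray) -> dict:
    """Greedy agglomeration; returns a {child: parent} map of the merge tree."""
    G = X @ X.T                                   # all pairwise dot products
    clusters = {i: [i] for i in range(len(X))}    # cluster id -> member indices
    tree = {i: None for i in range(len(X))}       # node id -> parent id
    next_id = len(X)
    while len(clusters) > 1:
        # Merge the pair of clusters with the largest *average* dot product.
        a, b = max(
            ((a, b) for a in clusters for b in clusters if a < b),
            key=lambda ab: G[np.ix_(clusters[ab[0]], clusters[ab[1]])].mean(),
        )
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        tree[a] = tree[b] = next_id               # new internal node for this merge
        tree[next_id] = None
        next_id += 1
    return tree
```

The only departure from the standard agglomerative loop is the linkage criterion: the chosen pair maximizes the average dot product instead of minimizing a distance or within-cluster variance.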
arXiv Detail & Related papers (2023-05-24T11:05:12Z) - Type-supervised sequence labeling based on the heterogeneous star graph for named entity recognition [6.25916397918329]
The representation learning of the heterogeneous star graph containing text nodes and type nodes is investigated in this paper.
The model performs the type-supervised sequence labeling after updating nodes in the graph.
Experiments on public NER datasets reveal the effectiveness of our model in extracting both flat and nested entities.
arXiv Detail & Related papers (2022-10-19T01:40:06Z) - A Tree-structured Transformer for Program Representation Learning [27.31416015946351]
Long-term/global dependencies widely exist in programs, and most neural networks fail to capture these dependencies.
In this paper, we propose Tree-Transformer, a novel tree-structured neural network which aims to overcome the above limitations.
By combining bottom-up and top-down propagation, Tree-Transformer can learn both global contexts and meaningful node features.
arXiv Detail & Related papers (2022-08-18T05:42:01Z) - Modeling Heterogeneous Hierarchies with Relation-specific Hyperbolic Cones [64.75766944882389]
We present ConE (Cone Embedding), a KG embedding model that is able to simultaneously model multiple hierarchical as well as non-hierarchical relations in a knowledge graph.
In particular, ConE uses cone containment constraints in different subspaces of the hyperbolic embedding space to capture multiple heterogeneous hierarchies.
Our approach yields a new state-of-the-art Hits@1 of 45.3% on WN18RR and 16.1% on DDB14 (0.231 MRR).
arXiv Detail & Related papers (2021-10-28T07:16:08Z) - Tree Decomposition Attention for AMR-to-Text Generation [12.342043849587613]
We use a graph's tree decomposition to constrain self-attention in a graph.
We apply dynamic programming to derive a forest of tree decompositions, choosing the most structurally similar tree to the AMR.
Our system outperforms a self-attentive baseline by 1.6 BLEU and 1.8 chrF++.
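One plausible way to realize "constraining self-attention with a tree decomposition" is to let two graph nodes attend to each other only when they co-occur in some bag of the decomposition. The sketch below builds such a boolean mask; it is an illustrative reading of the summary above, not the paper's exact construction.

```python
import numpy as np

def tree_decomposition_mask(num_nodes: int, bags: list) -> np.ndarray:
    """Boolean (num_nodes, num_nodes) mask; True means attention is allowed."""
    mask = np.eye(num_nodes, dtype=bool)          # every node attends to itself
    for bag in bags:                              # each bag is a set of node ids
        for i in bag:
            for j in bag:
                mask[i, j] = True                 # allow attention within a bag
    return mask

# Example: path graph 0-1-2-3 with bags {0, 1}, {1, 2}, {2, 3}.
print(tree_decomposition_mask(4, [{0, 1}, {1, 2}, {2, 3}]))
```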
arXiv Detail & Related papers (2021-08-27T14:24:25Z) - Exploring the Hierarchy in Relation Labels for Scene Graph Generation [75.88758055269948]
Experiments show that the proposed simple yet effective method can improve several state-of-the-art baselines by a large margin (up to 33% relative gain) in terms of Recall@50.
arXiv Detail & Related papers (2020-09-12T17:36:53Z) - Graph Neural Networks with Composite Kernels [60.81504431653264]
We re-interpret node aggregation from the perspective of kernel weighting.
We present a framework to consider feature similarity in an aggregation scheme.
We propose feature aggregation as the composition of the original neighbor-based kernel and a learnable kernel to encode feature similarities in a feature space.
arXiv Detail & Related papers (2020-05-16T04:44:29Z) - Data Structures & Algorithms for Exact Inference in Hierarchical Clustering [41.24805506595378]
We present dynamic-programming algorithms for exact inference in hierarchical clustering based on a novel trellis data structure.
Our algorithms scale in time and space proportional to the powerset of $N$ elements ($2^N$), which is super-exponentially more efficient than explicitly considering each of the $(2N-3)!!$ possible hierarchies.
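To make the scaling comparison concrete, here is a toy sketch that counts all binary hierarchies over $N$ elements by dynamic programming over subsets. It is a simplified view of the trellis idea, not the paper's exact-inference algorithm: the memoized states range over the powerset, whereas naive enumeration would visit each of the $(2N-3)!!$ trees.

```python
from functools import lru_cache

def count_hierarchies(n: int) -> int:
    """Count binary hierarchies over n elements via DP over subsets."""

    @lru_cache(maxsize=None)
    def count(subset: frozenset) -> int:
        if len(subset) <= 1:
            return 1
        anchor = min(subset)            # pin one element to avoid double-counting splits
        rest = sorted(subset - {anchor})
        total = 0
        # Enumerate left children that contain `anchor` but are not the whole subset.
        for mask in range(2 ** len(rest) - 1):
            left = frozenset({anchor} | {rest[i] for i in range(len(rest)) if mask >> i & 1})
            total += count(left) * count(subset - left)
        return total

    return count(frozenset(range(n)))

# Sanity check against (2N-3)!!: 1, 1, 3, 15, 105 for N = 1..5.
print([count_hierarchies(n) for n in range(1, 6)])  # [1, 1, 3, 15, 105]
```

Actual exact inference would attach potentials to each subset rather than merely counting, but the state space is the same: subsets of the $N$ elements.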
arXiv Detail & Related papers (2020-02-26T17:43:53Z) - Graph Inference Learning for Semi-supervised Classification [50.55765399527556]
We propose a Graph Inference Learning framework to boost the performance of semi-supervised node classification.
For learning the inference process, we introduce meta-optimization on structure relations from training nodes to validation nodes.
Comprehensive evaluations on four benchmark datasets demonstrate the superiority of our proposed GIL when compared against state-of-the-art methods.
arXiv Detail & Related papers (2020-01-17T02:52:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.