CoarSAS2hvec: Heterogeneous Information Network Embedding with Balanced
Network Sampling
- URL: http://arxiv.org/abs/2110.05820v1
- Date: Tue, 12 Oct 2021 08:34:39 GMT
- Title: CoarSAS2hvec: Heterogeneous Information Network Embedding with Balanced
Network Sampling
- Authors: Ling Zhan, Tao Jia
- Abstract summary: Heterogeneous information network (HIN) embedding aims to find the representations of nodes that preserve the proximity between entities of different nature.
A family of widely adopted approaches applies random walks to generate sequences of heterogeneous contexts.
Due to the multipartite graph structure of HIN, hub nodes tend to be over-represented in the sampled sequences, giving rise to imbalanced samples of the network.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heterogeneous information network (HIN) embedding aims to find the
representations of nodes that preserve the proximity between entities of
different nature. A family of widely adopted approaches applies random
walks to generate sequences of heterogeneous contexts, from which the embedding
is learned. However, due to the multipartite graph structure of HIN, hub nodes
tend to be over-represented in the sampled sequence, giving rise to imbalanced
samples of the network. Here we propose a new embedding method CoarSAS2hvec.
Self-avoiding short-sequence sampling with an HIN coarsening procedure
(CoarSAS) is used to better collect the rich information in the HIN. An
optimized loss function is used to improve the performance of the HIN structure
embedding. CoarSAS2hvec outperforms nine other methods in two different tasks
on four real-world data sets. An ablation study confirms that the samples
collected by CoarSAS contain richer information about the network than those
collected by other methods, as characterized by their higher information
entropy. Hence, even the traditional loss function, when applied to CoarSAS
samples, yields improved results. Our work addresses a limitation of the
random-walk-based HIN embedding that has not been emphasized before, which can
shed light on a range of problems in HIN analyses.
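To make the sampling idea concrete, here is a minimal sketch of self-avoiding short-sequence sampling on a toy heterogeneous graph. This is an assumption-laden illustration: the adjacency-list representation, function names such as sample_sas, and the cut-off parameters are ours, not the paper's, and the actual CoarSAS coarsening step is described only in the paper.

```python
import random
from collections import defaultdict

def sample_sas(adj, start, max_len=10, num_seqs=100, seed=0):
    """Self-avoiding short-sequence sampling (illustrative sketch).

    adj: dict mapping each node to a list of neighbours (a heterogeneous
    graph flattened to an adjacency list). Each sequence never revisits
    a node and is cut off at max_len, so a hub can appear at most once
    per sequence.
    """
    rng = random.Random(seed)
    sequences = []
    for _ in range(num_seqs):
        seq, visited = [start], {start}
        while len(seq) < max_len:
            candidates = [v for v in adj[seq[-1]] if v not in visited]
            if not candidates:  # dead end: self-avoidance blocks every move
                break
            nxt = rng.choice(candidates)
            seq.append(nxt)
            visited.add(nxt)
        sequences.append(seq)
    return sequences

def visit_counts(sequences):
    """Node frequencies in the sampled corpus; flatter counts mean a more
    balanced sample, i.e. higher information entropy."""
    counts = defaultdict(int)
    for seq in sequences:
        for v in seq:
            counts[v] += 1
    return dict(counts)

# toy HIN: three authors a1..a3 connected through a hub venue v1
adj = {
    "a1": ["v1"], "a2": ["v1"], "a3": ["v1", "v2"],
    "v1": ["a1", "a2", "a3"], "v2": ["a3"],
}
print(visit_counts(sample_sas(adj, start="a1", max_len=4)))
```

Because each sequence is short and never revisits a node, a hub such as v1 can contribute at most one occurrence per sequence, which flattens the visit distribution relative to long, unconstrained random walks.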
Related papers
- Fast Scalable and Accurate Discovery of DAGs Using the Best Order Score
Search and Grow-Shrink Trees [2.667401221288548]
We introduce the best order score search (BOSS) and grow-shrink trees (GSTs) for learning directed acyclic graphs (DAGs).
arXiv Detail & Related papers (2023-10-26T10:03:12Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural
Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs)
It proves analytically that both sampling important nodes and pruning the lowest-magnitude neurons can reduce the sample complexity and improve convergence without compromising the test accuracy.
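As a rough illustration of those two operations (not the paper's algorithm: the shapes, the toy importance scores, and helper names such as prune_lowest_magnitude_neurons are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy one-hidden-layer network: 16 inputs, 32 hidden neurons, 4 outputs
W1 = rng.normal(size=(16, 32))
W2 = rng.normal(size=(32, 4))

def prune_lowest_magnitude_neurons(W1, W2, keep_ratio=0.5):
    """Drop the hidden neurons whose incoming weight vectors have the
    smallest L2 norm, shrinking both weight matrices consistently."""
    norms = np.linalg.norm(W1, axis=0)      # one norm per hidden neuron
    k = int(keep_ratio * W1.shape[1])
    keep = np.argsort(norms)[-k:]           # largest-magnitude neurons survive
    return W1[:, keep], W2[keep, :]

def sample_important_nodes(scores, k, rng):
    """Sample k graph nodes with probability proportional to an importance
    score (e.g. degree), instead of training on all nodes."""
    p = scores / scores.sum()
    return rng.choice(len(scores), size=k, replace=False, p=p)

W1_s, W2_s = prune_lowest_magnitude_neurons(W1, W2)
nodes = sample_important_nodes(np.arange(1.0, 101.0), k=10, rng=rng)
print(W1_s.shape, W2_s.shape, sorted(nodes))
```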
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Interpolation-based Correlation Reduction Network for Semi-Supervised
Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN)
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
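The paper's exact objective is not reproduced here; the following sketch shows a generic correlation-reduction loss of the kind the name suggests, namely a Barlow-Twins-style decorrelation between two views of the node embeddings. All names and the lambda weight are our assumptions.

```python
import numpy as np

def correlation_reduction_loss(Z1, Z2, lam=5e-3):
    """Generic correlation-reduction objective: push the cross-view
    correlation matrix toward the identity, so each feature dimension is
    invariant across views yet decorrelated from the others.
    Z1, Z2: (n_nodes, dim) embeddings of two graph views."""
    Z1 = (Z1 - Z1.mean(0)) / (Z1.std(0) + 1e-8)
    Z2 = (Z2 - Z2.mean(0)) / (Z2.std(0) + 1e-8)
    C = Z1.T @ Z2 / Z1.shape[0]                  # (dim, dim) cross-correlation
    on_diag = ((np.diag(C) - 1.0) ** 2).sum()    # invariance term
    off_diag = (C ** 2).sum() - (np.diag(C) ** 2).sum()  # redundancy term
    return on_diag + lam * off_diag

rng = np.random.default_rng(0)
Z = rng.normal(size=(256, 32))
noise = 0.1 * rng.normal(size=Z.shape)           # second "view" of the same nodes
print(correlation_reduction_loss(Z, Z + noise))
```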
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
- SHRIMP: Sparser Random Feature Models via Iterative Magnitude Pruning [3.775565013663731]
We propose a new method, Sparser Random Feature Models via Iterative Magnitude Pruning (SHRIMP), to efficiently fit high-dimensional data with inherent low-dimensional structure.
Our method can be viewed as a combined process to construct and find sparse lottery tickets for two-layer dense networks.
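A minimal sketch of iterative magnitude pruning on a random feature model, assuming a cosine feature map and ridge refits; the pruning schedule and all names are illustrative, not SHRIMP's exact recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

# random feature model: features phi(x) = cos(xW + b), linear read-out
n, d, m = 200, 5, 300
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)   # low-dimensional target
W = rng.normal(size=(d, m))
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Phi = np.cos(X @ W + b)

def ridge_fit(Phi, y, mask, reg=1e-2):
    """Refit the read-out weights restricted to the surviving features."""
    P = Phi[:, mask]
    return np.linalg.solve(P.T @ P + reg * np.eye(P.shape[1]), P.T @ y)

mask = np.ones(m, dtype=bool)
for _ in range(5):                               # iterative magnitude pruning
    coef = np.zeros(m)
    coef[mask] = ridge_fit(Phi, y, mask)
    thresh = np.quantile(np.abs(coef[mask]), 0.5)
    mask &= np.abs(coef) >= thresh               # drop the weakest half, refit
print("surviving features:", int(mask.sum()), "of", m)
```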
arXiv Detail & Related papers (2021-12-07T21:32:28Z)
- Deep Archimedean Copulas [98.96141706464425]
ACNet is a novel differentiable neural network architecture that enforces structural properties.
We show that ACNet is able to both approximate common Archimedean Copulas and generate new copulas which may provide better fits to data.
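For reference, the structural property such a network must respect is the classic Archimedean construction C(u, v) = psi^{-1}(psi(u) + psi(v)). Below is a sketch using the Clayton generator; ACNet itself learns the generator with a neural network, which is not shown here.

```python
import numpy as np

def clayton_generator(t, theta):
    """Clayton generator psi(t) = (t**(-theta) - 1) / theta, theta > 0."""
    return (t ** (-theta) - 1.0) / theta

def clayton_inverse(s, theta):
    """Inverse generator psi^{-1}(s) = (1 + theta * s) ** (-1 / theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)

def archimedean_copula(u, v, theta=2.0):
    """C(u, v) = psi^{-1}(psi(u) + psi(v)): the joint CDF on [0, 1]^2
    implied by the generator."""
    s = clayton_generator(u, theta) + clayton_generator(v, theta)
    return clayton_inverse(s, theta)

# ~0.287, within the Frechet bounds max(u + v - 1, 0) <= C <= min(u, v)
print(archimedean_copula(0.3, 0.7))
```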
arXiv Detail & Related papers (2020-12-05T22:58:37Z)
- MG-GCN: Fast and Effective Learning with Mix-grained Aggregators for
Training Large Graph Convolutional Networks [20.07942308916373]
Graph convolutional networks (GCNs) generate the embeddings of nodes by aggregating the information of their neighbors layer by layer.
The high computational and memory cost of GCNs makes it infeasible for training on large graphs.
A new model, named Mix-grained GCN (MG-GCN), achieves state-of-the-art performance in terms of accuracy, training speed, convergence speed, and memory cost.
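The layer-by-layer aggregation being accelerated is the standard GCN propagation rule; here is a minimal dense sketch. MG-GCN's mix-grained aggregators are not reproduced, and the tiny graph and shapes are our assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step: H' = ReLU(D^{-1/2} (A+I) D^{-1/2} H W),
    i.e. every node averages its (self-looped) neighbours' features."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float)  # tiny graph
H = rng.normal(size=(3, 8))                      # initial node features
W1, W2 = rng.normal(size=(8, 16)), rng.normal(size=(16, 4))
H = gcn_layer(A, H, W1)                          # layer 1 mixes 1-hop info
H = gcn_layer(A, H, W2)                          # layer 2 reaches 2-hop neighbours
print(H.shape)                                   # (3, 4) node embeddings
```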
arXiv Detail & Related papers (2020-11-17T14:51:57Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
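The covariance-operator analysis itself is the paper's contribution; as background, here is a sketch of the SimCLR-style contrastive (NT-Xent) loss that such an analysis applies to. The batch size, temperature, and names are our choices.

```python
import numpy as np

def nt_xent(Z1, Z2, tau=0.5):
    """SimCLR-style NT-Xent loss for a batch of positive pairs.
    Z1[i] and Z2[i] are embeddings of two augmentations of sample i."""
    Z = np.concatenate([Z1, Z2], axis=0)
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # cosine similarity
    S = Z @ Z.T / tau
    np.fill_diagonal(S, -np.inf)                       # exclude self-similarity
    n = Z1.shape[0]
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = S[np.arange(2 * n), pos] - np.log(np.exp(S).sum(axis=1))
    return -log_prob.mean()

rng = np.random.default_rng(0)
Z = rng.normal(size=(64, 32))
print(nt_xent(Z, Z + 0.1 * rng.normal(size=Z.shape)))
```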
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Improving Generative Adversarial Networks with Local Coordinate Coding [150.24880482480455]
Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from some predefined prior distribution.
In practice, semantic information might be represented by some latent distribution learned from data.
We propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.
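Local coordinate coding approximates a point on a manifold as an affine combination of nearby anchor points. A minimal sketch of that coding step follows, solved with the standard locally-linear-embedding trick; the anchors, k, and function names are our assumptions, and the LCCGAN generator built on top is not shown.

```python
import numpy as np

def lcc_encode(x, anchors, k=3):
    """Local coordinate coding sketch: express x as an affine combination
    of its k nearest anchors, x ~ sum_v gamma_v * v with sum(gamma) = 1."""
    d2 = ((anchors - x) ** 2).sum(axis=1)
    idx = np.argsort(d2)[:k]                 # k nearest anchor points
    V = anchors[idx]                         # (k, dim)
    G = (V - x) @ (V - x).T                  # local Gram matrix
    G += 1e-8 * np.trace(G) * np.eye(k)      # regularise for stability
    gamma = np.linalg.solve(G, np.ones(k))
    gamma /= gamma.sum()                     # enforce the affine constraint
    return idx, gamma

rng = np.random.default_rng(0)
anchors = rng.normal(size=(50, 8))           # learned latent anchors
x = rng.normal(size=8)
idx, gamma = lcc_encode(x, anchors)
print(idx, gamma, np.abs(gamma @ anchors[idx] - x).max())
```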
arXiv Detail & Related papers (2020-07-28T09:17:50Z)
- Ensembled sparse-input hierarchical networks for high-dimensional
datasets [8.629912408966145]
We show that dense neural networks can be a practical data analysis tool in settings with small sample sizes.
The proposed method prunes the network structure by tuning only two L1-penalty parameters.
On a collection of real-world datasets with different sizes, EASIER-net selected network architectures in a data-adaptive manner and achieved higher prediction accuracy than off-the-shelf methods on average.
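The mechanism by which an L1 penalty prunes structure is soft-thresholding, which drives small weights exactly to zero. Below is a minimal proximal-gradient (ISTA) sketch on a sparse-input linear problem; EASIER-net's two-penalty hierarchical scheme is not reproduced, and all sizes and names here are illustrative.

```python
import numpy as np

def soft_threshold(W, lam):
    """Proximal operator of the L1 penalty: shrink weights toward zero and
    set small ones exactly to zero, which is what prunes structure."""
    return np.sign(W) * np.maximum(np.abs(W) - lam, 0.0)

rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]                   # only 3 inputs matter
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)
step, lam = 0.1, 0.1
for _ in range(500):                            # proximal gradient (ISTA)
    grad = X.T @ (X @ w - y) / n
    w = soft_threshold(w - step * grad, step * lam)
print("nonzero inputs:", np.flatnonzero(w))     # typically the 3 true inputs
```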
arXiv Detail & Related papers (2020-05-11T02:08:53Z)
- Recent Developments Combining Ensemble Smoother and Deep Generative
Networks for Facies History Matching [58.720142291102135]
This research project focuses on the use of autoencoder networks to construct a continuous parameterization for facies models.
We benchmark seven different formulations, including VAE, generative adversarial network (GAN), Wasserstein GAN, variational auto-encoding GAN, principal component analysis (PCA) with cycle GAN, PCA with transfer style network and VAE with style loss.
arXiv Detail & Related papers (2020-05-08T21:32:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.