Swift and Sure: Hardness-aware Contrastive Learning for Low-dimensional
Knowledge Graph Embeddings
- URL: http://arxiv.org/abs/2201.00565v1
- Date: Mon, 3 Jan 2022 10:25:10 GMT
- Title: Swift and Sure: Hardness-aware Contrastive Learning for Low-dimensional
Knowledge Graph Embeddings
- Authors: Kai Wang and Yu Liu and Quan Z. Sheng
- Abstract summary: We propose a novel KGE training framework called Hardness-aware Low-dimensional Embedding (HaLE).
Within a limited training time, HaLE effectively improves the performance and training speed of KGE models.
HaLE-trained models achieve high prediction accuracy after only a few minutes of training and are competitive with state-of-the-art models.
- Score: 20.693275018860287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge graph embedding (KGE) has drawn great attention due to its
potential in automatic knowledge graph (KG) completion and knowledge-driven
tasks. However, recent KGE models suffer from high training cost and large
storage space, thus limiting their practicality in real-world applications. To
address this challenge, based on the latest findings in the field of
Contrastive Learning, we propose a novel KGE training framework called
Hardness-aware Low-dimensional Embedding (HaLE). Instead of the traditional
Negative Sampling, we design a new loss function based on query sampling that
can balance two important training targets, Alignment and Uniformity.
Furthermore, we analyze the hardness-aware ability of recent low-dimensional
hyperbolic models and propose a lightweight hardness-aware activation
mechanism, which can help the KGE models focus on hard instances and speed up
convergence. The experimental results show that, within a limited training time,
HaLE can effectively improve the performance and training speed of KGE models
on five commonly used datasets. HaLE-trained models obtain high prediction
accuracy after only a few minutes of training and are competitive with
state-of-the-art models in both low- and high-dimensional settings.
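The abstract names two mechanisms: a query-sampling contrastive loss that balances Alignment and Uniformity, and a lightweight hardness-aware weighting that focuses training on hard instances. The following is a minimal, hypothetical PyTorch sketch of how such a loss could be combined; it is not the authors' implementation, and the hardness weighting scheme, function name, and hyperparameters are assumptions.

```python
# Hypothetical sketch (not the HaLE code): a contrastive KGE loss combining
# an Alignment term (queries pulled toward their true answer entities),
# a Uniformity term (entity embeddings spread over the unit hypersphere),
# and an assumed hardness-aware weighting that up-weights hard positives.
import torch
import torch.nn.functional as F

def hardness_aware_contrastive_loss(query_emb, pos_entity_emb, all_entity_emb,
                                    uniformity_weight=1.0, hardness_temp=1.0):
    """query_emb:      (B, d) embeddings of (head, relation) queries
       pos_entity_emb: (B, d) embeddings of the correct tail entities
       all_entity_emb: (N, d) embeddings of candidate entities (e.g. in-batch)"""
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(pos_entity_emb, dim=-1)
    e = F.normalize(all_entity_emb, dim=-1)

    # Alignment: squared distance between each query and its true entity.
    align = (q - p).pow(2).sum(dim=-1)                      # (B,)

    # Assumed hardness-aware weighting: pairs that are currently far apart
    # ("hard" instances) get larger weights; detached so it only rescales.
    with torch.no_grad():
        hardness = torch.softmax(align / hardness_temp, dim=0) * align.numel()
    alignment_loss = (hardness * align).mean()

    # Uniformity: log of the mean Gaussian potential over entity pairs
    # (the formulation popularized by Wang & Isola, 2020).
    pair_sq_dists = torch.pdist(e, p=2).pow(2)
    uniformity_loss = torch.log(torch.exp(-2.0 * pair_sq_dists).mean())

    return alignment_loss + uniformity_weight * uniformity_loss
```

In this sketch the query embedding would come from the model's (head, relation) encoder (e.g. a low-dimensional Euclidean or hyperbolic KGE scorer), and the loss is minimized jointly over query and entity embeddings.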
Related papers
- Croppable Knowledge Graph Embedding [34.154096023765916]
Knowledge Graph Embedding (KGE) is a common method for Knowledge Graphs (KGs) to serve various artificial intelligence tasks.
Once a new dimension is required, a new KGE model needs to be trained from scratch.
We propose a novel KGE training framework, MED, through which we can train once and obtain a croppable KGE model applicable to multiple scenarios.
arXiv Detail & Related papers (2024-07-03T03:10:25Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights, for example, improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- An Emulator for Fine-Tuning Large Language Models using Small Language Models [91.02498576056057]
We introduce emulated fine-tuning (EFT), a principled and practical method for sampling from a distribution that approximates the result of pre-training and fine-tuning at different scales.
We show that EFT enables test-time adjustment of competing behavioral traits like helpfulness and harmlessness without additional training.
Finally, a special case of emulated fine-tuning, which we call LM up-scaling, avoids resource-intensive fine-tuning of large pre-trained models by ensembling them with small fine-tuned models.
arXiv Detail & Related papers (2023-10-19T17:57:16Z)
- Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z)
- Continual Learners are Incremental Model Generalizers [70.34479702177988]
This paper extensively studies the impact of Continual Learning (CL) models as pre-trainers.
We find that the transfer quality of the representation often increases gradually without noticeable degradation in fine-tuning performance.
We propose a new fine-tuning scheme, GLobal Attention Discretization (GLAD), that preserves rich task-generic representation during solving downstream tasks.
arXiv Detail & Related papers (2023-06-21T05:26:28Z)
- Retrieval-Enhanced Contrastive Vision-Text Models [61.783728119255365]
We propose to equip vision-text models with the ability to refine their embedding with cross-modal retrieved information from a memory at inference time.
Remarkably, we show that this can be done with a lightweight, single-layer fusion transformer on top of a frozen CLIP.
Our experiments validate that our retrieval-enhanced contrastive (RECO) training improves CLIP performance substantially on several challenging fine-grained tasks.
arXiv Detail & Related papers (2023-06-12T15:52:02Z)
- Rethinking Soft Label in Label Distribution Learning Perspective [0.27719338074999533]
The primary goal of training early convolutional neural networks (CNNs) is higher generalization performance of the model.
We investigate whether performing label distribution learning (LDL) enhances model calibration in CNN training.
We perform several visualizations and analyses and observe several interesting behaviors in CNN training with LDL.
arXiv Detail & Related papers (2023-01-31T06:47:19Z)
- Confidence-aware Self-Semantic Distillation on Knowledge Graph Embedding [20.49583906923656]
Confidence-aware Self-Knowledge Distillation learns from the model itself to enhance KGE in a low-dimensional space.
A specific semantic module is developed to filter reliable knowledge by estimating the confidence of previously learned embeddings.
arXiv Detail & Related papers (2022-06-07T01:49:22Z)
- Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis [10.154836127889487]
Knowledge Graph Embeddings (KGEs) have been intensively explored in recent years due to their promise for a wide range of applications.
This paper proposes a simple yet effective KGE framework which can reduce the training time and carbon footprint by orders of magnitude.
arXiv Detail & Related papers (2021-04-10T03:55:45Z)
- MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the generalization error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z)
- MulDE: Multi-teacher Knowledge Distillation for Low-dimensional Knowledge Graph Embeddings [22.159452429209463]
Link prediction based on knowledge graph embeddings (KGE) aims to predict new triples to automatically construct knowledge graphs (KGs).
Recent KGE models achieve performance improvements by excessively increasing the embedding dimensions.
We propose MulDE, a novel knowledge distillation framework, which includes multiple low-dimensional hyperbolic KGE models as teachers and two student components; a hedged sketch of the general multi-teacher distillation idea appears after this list.
arXiv Detail & Related papers (2020-10-14T15:09:27Z)
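The MulDE entry above distills several low-dimensional hyperbolic teachers into compact students. As a rough illustration of the underlying idea only, here is a minimal, hypothetical multi-teacher soft-label distillation loss for link prediction; it is not MulDE's actual two-student-component design, and all names and hyperparameters are assumptions.

```python
# Hypothetical sketch (not the MulDE architecture): generic multi-teacher
# distillation for link prediction. Several pre-trained teachers score all
# candidate tail entities; the student matches their averaged soft labels
# while also fitting the ground-truth tail.
import torch
import torch.nn.functional as F

def multi_teacher_distill_loss(student_scores, teacher_scores_list,
                               true_tail_idx, temperature=2.0, alpha=0.5):
    """student_scores:      (B, N) student scores over all N entities
       teacher_scores_list: list of (B, N) score tensors, one per teacher
       true_tail_idx:       (B,)   index of the ground-truth tail entity"""
    # Soft labels: average of the teachers' temperature-scaled distributions.
    teacher_probs = torch.stack(
        [F.softmax(t / temperature, dim=-1) for t in teacher_scores_list]
    ).mean(dim=0)                                           # (B, N)

    # Distillation term: KL divergence between the teacher mixture and the
    # student's temperature-scaled distribution (scaled by T^2, as usual).
    student_log_probs = F.log_softmax(student_scores / temperature, dim=-1)
    kd_loss = F.kl_div(student_log_probs, teacher_probs,
                       reduction="batchmean") * temperature ** 2

    # Supervised term: ordinary cross-entropy on the true tail entity.
    ce_loss = F.cross_entropy(student_scores, true_tail_idx)

    return alpha * kd_loss + (1.0 - alpha) * ce_loss
```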