TechGPT-2.0: A large language model project to solve the task of
knowledge graph construction
- URL: http://arxiv.org/abs/2401.04507v1
- Date: Tue, 9 Jan 2024 11:52:58 GMT
- Title: TechGPT-2.0: A large language model project to solve the task of
knowledge graph construction
- Authors: Jiaqi Wang, Yuying Chang, Zhong Li, Ning An, Qi Ma, Lei Hei, Haibo
Luo, Yifei Lu, Feiliang Ren
- Abstract summary: TechGPT-2.0 is a project designed to enhance the capabilities of large language models in knowledge graph construction tasks.
It exhibits robust text processing capabilities, particularly in the domains of medicine and law.
TechGPT-2.0 is trained on Huawei's Ascend servers.
- Score: 31.638140593358433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models have exhibited robust performance across diverse
natural language processing tasks. This report introduces TechGPT-2.0, a
project designed to enhance the capabilities of large language models
specifically in knowledge graph construction tasks, including named entity
recognition (NER) and relationship triple extraction (RTE) tasks in NLP
applications. Additionally, it serves as an LLM accessible for research within
the Chinese open-source model community. We offer two 7B large language model
weights and a QLoRA weight specialized for processing lengthy texts. Notably,
TechGPT-2.0 is trained on Huawei's Ascend servers. Inheriting all
functionalities from TechGPT-1.0, it exhibits robust text processing
capabilities, particularly in the domains of medicine and law. Furthermore, we
introduce new capabilities to the model, enabling it to process texts in
various domains such as geographical areas, transportation, organizations,
literary works, biology, natural sciences, astronomical objects, and
architecture. These enhancements also strengthen the model's ability to handle
hallucinations, unanswerable queries, and lengthy texts. This report
provides a comprehensive and detailed introduction to the full fine-tuning
process on Huawei's Ascend servers, encompassing experiences in Ascend server
debugging, instruction fine-tuning data processing, and model training. Our
code is available at https://github.com/neukg/TechGPT-2.0
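To make the NER and RTE use cases concrete, the sketch below shows one way to prompt a released 7B checkpoint for relation triple extraction with Hugging Face transformers. The model ID, prompt template, and decoding settings are illustrative assumptions rather than the project's documented interface; consult the GitHub repository above for the actual released weights and usage.

```python
# A minimal sketch of prompting a TechGPT-2.0-style checkpoint for
# relation triple extraction (RTE). MODEL_ID and the instruction
# template are hypothetical, for illustration only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "neukg/TechGPT-2.0-7B"  # hypothetical Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, device_map="auto", trust_remote_code=True
)

# Instruction-style prompt asking for (head, relation, tail) triples,
# the target output format for knowledge graph construction.
prompt = (
    "Extract all entity-relation triples from the following sentence "
    "and return them as (head, relation, tail) tuples.\n"
    "Sentence: TechGPT-2.0 was trained on Huawei's Ascend servers."
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the echoed prompt.
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Under the same caveat, the long-text QLoRA weight mentioned in the abstract would typically be applied on top of a base checkpoint with the peft library (PeftModel.from_pretrained), assuming the adapter is released in peft format.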
Related papers
- In-Context Code-Text Learning for Bimodal Software Engineering [26.0027882745058]
Bimodal software analysis initially appeared to be within reach with the advent of large language models.
We postulate that in-context learning for the code-text bimodality is a promising avenue.
We consider a diverse dataset encompassing 23 software engineering tasks, which we transform into an in-context learning format.
arXiv Detail & Related papers (2024-10-08T19:42:00Z)
- GemmAr: Enhancing LLMs Through Arabic Instruction-Tuning [0.0]
We introduce InstAr-500k, a new Arabic instruction dataset created by generating and collecting content.
We assess this dataset by fine-tuning an open-source Gemma-7B model on several downstream tasks to improve its functionality.
Based on multiple evaluations, our fine-tuned model achieves excellent performance on several Arabic NLP benchmarks.
arXiv Detail & Related papers (2024-07-02T10:43:49Z)
- CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models [59.91221728187576]
This paper introduces the CMU Linguistic Annotation Backend (CMULAB), an open-source framework that simplifies model deployment and continuous human-in-the-loop fine-tuning of NLP models.
CMULAB enables users to leverage the power of multilingual models to quickly adapt and extend existing tools for speech recognition, OCR, translation, and syntactic analysis to new languages.
arXiv Detail & Related papers (2024-04-03T02:21:46Z)
- Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets [2.8123257987021058]
We focus on enhancing the LLaMA-2-Amharic model by integrating task-specific and generative datasets.
We compile an Amharic instruction fine-tuning dataset and a fine-tuned LLaMA-2-Amharic model.
The fine-tuned model shows promising results in different NLP tasks.
arXiv Detail & Related papers (2024-02-12T19:25:11Z)
- SoTaNa: The Open-Source Software Development Assistant [81.86136560157266]
SoTaNa is an open-source software development assistant.
It generates high-quality instruction-based data for the domain of software engineering.
It employs a parameter-efficient fine-tuning approach to enhance the open-source foundation model, LLaMA.
arXiv Detail & Related papers (2023-08-25T14:56:21Z)
- Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text [2.396908230113859]
Large language models (LLMs) and foundation models with emergent capabilities have been shown to improve the performance of many NLP tasks.
We present Text2KGBench, a benchmark to evaluate the capabilities of language models to generate Knowledge Graphs (KGs) from natural language text guided by an ontology.
arXiv Detail & Related papers (2023-08-04T14:47:15Z)
- Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks.
Our method achieves state-of-the-art results on well-established TAG datasets.
Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
- Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale.
Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
arXiv Detail & Related papers (2022-10-17T18:02:52Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively.
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline (see the sketch after this list).
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
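The KILT finding above (a shared dense vector index plus a seq2seq reader) can be illustrated with a small retrieve-then-generate sketch. The MiniLM encoder, BART generator, and toy passages below are assumptions for illustration, not the paper's actual baseline configuration.

```python
# A minimal sketch of the "shared dense index + seq2seq" pattern that
# KILT reports as a strong baseline. The MiniLM encoder, BART generator,
# and toy passages are illustrative assumptions, not KILT's actual setup.
import faiss
from sentence_transformers import SentenceTransformer
from transformers import BartForConditionalGeneration, BartTokenizer

encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
passages = [
    "KILT grounds all of its tasks in a single snapshot of Wikipedia.",
    "TechGPT-2.0 targets NER and relation triple extraction tasks.",
]

# Embed the passage collection once into a shared dense vector index.
embeddings = encoder.encode(passages, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner-product search
index.add(embeddings)

# Retrieve the best-matching passage for a query.
query = "What knowledge source are KILT tasks grounded in?"
query_emb = encoder.encode([query], normalize_embeddings=True)
_, hits = index.search(query_emb, 1)
evidence = passages[hits[0][0]]

# Condition a seq2seq model on the query plus retrieved evidence.
tok = BartTokenizer.from_pretrained("facebook/bart-large")
reader = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
inputs = tok(f"question: {query} context: {evidence}", return_tensors="pt")
print(tok.decode(reader.generate(**inputs, max_new_tokens=32)[0],
                 skip_special_tokens=True))
```

In the benchmark itself the index would be built over the shared Wikipedia snapshot, which is what allows a single retriever to serve every KILT task.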
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.