ChatUIE: Exploring Chat-based Unified Information Extraction using Large
Language Models
- URL: http://arxiv.org/abs/2403.05132v1
- Date: Fri, 8 Mar 2024 07:59:19 GMT
- Title: ChatUIE: Exploring Chat-based Unified Information Extraction using Large
Language Models
- Authors: Jun Xu, Mengshu Sun, Zhiqiang Zhang and Jun Zhou
- Abstract summary: ChatUIE is an innovative unified information extraction framework built upon ChatGLM.
Reinforcement learning is employed to improve and align various tasks that involve confusing and limited samples.
Our experimental results demonstrate that ChatUIE can significantly improve the performance of information extraction with a slight decrease in chatting ability.
- Score: 15.72709558213362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in large language models have shown impressive
performance in general chat. However, their domain-specific capabilities,
particularly in information extraction, have certain limitations. Extracting
structured information from natural language that deviates from known schemas
or instructions has proven challenging for previous prompt-based methods. This
motivated us to explore domain-specific modeling in chat-based language models
as a solution for extracting structured information from natural language. In
this paper, we present ChatUIE, an innovative unified information extraction
framework built upon ChatGLM. Simultaneously, reinforcement learning is
employed to improve and align various tasks that involve confusing and limited
samples. Furthermore, we integrate generation constraints to address the issue
of generating elements that are not present in the input. Our experimental
results demonstrate that ChatUIE can significantly improve the performance of
information extraction with a slight decrease in chatting ability.
Related papers
- Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception [63.03288425612792]
We propose AnyRef, a general MLLM model that can generate pixel-wise object perceptions and natural language descriptions from multi-modality references.
Our model achieves state-of-the-art results across multiple benchmarks, including diverse modality referring segmentation and region-level referring expression generation.
arXiv Detail & Related papers (2024-03-05T13:45:46Z)
- Less is More: A Closer Look at Semantic-based Few-Shot Learning [11.724194320966959]
Few-shot Learning aims to learn and distinguish new categories with a very limited number of available images.
We propose a simple but effective framework for few-shot learning tasks, specifically designed to exploit the textual information and language model.
Our experiments conducted across four widely used few-shot datasets demonstrate that our simple framework achieves impressive results.
arXiv Detail & Related papers (2024-01-10T08:56:02Z)
- Evaluating Large Language Models in Semantic Parsing for Conversational Question Answering over Knowledge Graphs [6.869834883252353]
This paper evaluates the performance of large language models that have not been explicitly pre-trained on this task.
Our results demonstrate that large language models are capable of generating graph queries from dialogues.
arXiv Detail & Related papers (2024-01-03T12:28:33Z)
- YAYI-UIE: A Chat-Enhanced Instruction Tuning Framework for Universal Information Extraction [20.32778991187863]
We propose an end-to-end chat-enhanced instruction tuning framework for universal information extraction (YAYI-UIE).
Specifically, we utilize dialogue data and information extraction data to enhance the information extraction performance jointly.
arXiv Detail & Related papers (2023-12-24T21:33:03Z)
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning [56.03057119008865]
We show that scaling diffusion language models can effectively make them strong language learners.
We build competent diffusion language models at scale by first acquiring knowledge from massive data.
Experiments show that scaling diffusion language models consistently improves performance across downstream language tasks.
arXiv Detail & Related papers (2023-08-23T16:01:12Z)
- RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z)
- SINC: Self-Supervised In-Context Learning for Vision-Language Tasks [64.44336003123102]
We propose a framework to enable in-context learning in large language models.
A meta-model can learn on self-supervised prompts consisting of tailored demonstrations.
Experiments show that SINC outperforms gradient-based methods in various vision-language tasks.
arXiv Detail & Related papers (2023-07-15T08:33:08Z)
- A Cloud-based Machine Learning Pipeline for the Efficient Extraction of Insights from Customer Reviews [0.0]
We present a cloud-based system that can extract insights from customer reviews using machine learning methods integrated into a pipeline.
For topic modeling, our composite model uses transformer-based neural networks designed for natural language processing.
Our system can achieve better results than this task's existing topic modeling and keyword extraction solutions.
arXiv Detail & Related papers (2023-06-13T14:07:52Z)
- Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z)
- Internet-augmented language models through few-shot prompting for open-domain question answering [6.573232954655063]
We capitalize on the unique few-shot capabilities offered by large-scale language models to overcome some of their challenges.
We use few-shot prompting to learn to condition language models on information returned from the web using Google Search.
We find that language models conditioned on the web surpass performance of closed-book models of similar, or even larger, model sizes in open-domain question answering.
arXiv Detail & Related papers (2022-03-10T02:24:14Z)
- Towards Zero-shot Language Modeling [90.80124496312274]
We construct a neural model that is inductively biased towards learning human languages.
We infer this distribution from a sample of typologically diverse training languages.
We harness additional language-specific side information as distant supervision for held-out languages.
arXiv Detail & Related papers (2021-08-06T23:49:18Z)